E-CAM partners at Aalto University (CECAM Finnish Node), in collaboration with the HPC training experts from the CSC Supercomputing Centre, are organizing a joint Extended Software Development Workshop from 15 to 19 October 2019, aimed at people interested in particle-based methods, such as the Discrete Element and Lattice Boltzmann Methods, and in their massive parallelization using GPU architectures. The workshop will mix three different ingredients: (1) a workshop on state-of-the-art challenges in computational science and software, (2) a CSC-run school, and (3) coding sessions with the aid of CSC facilities and expertise.
Dr. Jony Castagna, Science and Technology Facilities Council, United Kingdom
Jony Castagna recounts his transition from industry scientist to research software developer at the STFC, his E-CAM rewrite of DL_MESO allowing the simulation of billion atom systems on thousands of GPGPUs, and his latest role as Nvidia ambassador focused on machine learning.
Jony, can you tell us how you came to work on the E-CAM project and what you were doing before?
My background is in Computational Fluid Dynamics (CFD), and I worked for many years in London as a computational scientist in the oil and gas industry. I joined STFC – Hartree Centre in 2016 and E-CAM was my first project. E-CAM offered an opportunity to work in a new, more academic and fundamental research environment.
What is your role in E-CAM?
My role is that of research software developer, which consists mainly of supporting the E-CAM Postdoctoral Researchers in developing their software modules, benchmarking available codes and contributing to the deliverables of the various work packages. This includes the work described here, in co-designing DL_MESO to run on GPUs.
What is DL_MESO and why was it important to port it to massively parallel computing platforms?
DL_MESO is a software package for mesoscale simulations developed by M. Seaton at the Hartree Centre [1, 2]. It consists of two software components: a Lattice Boltzmann solver, which uses the Lattice Boltzmann equation discretized on a lattice (2D or 3D) to simulate the fluid dynamic effects of complex multiphase systems; and a Dissipative Particle Dynamics (DPD) solver, a particle-based method in which a soft potential, together with coupled dissipative and stochastic forces, allows Molecular Dynamics to be used with a larger time step.
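As an illustration of the DPD force terms mentioned above, here is a minimal sketch of a pairwise DPD force in the commonly used Groot-Warren form (soft conservative repulsion plus dissipative and random forces linked by the fluctuation-dissipation relation). It is not taken from DL_MESO; the function name `dpd_pair_force` and the default parameter values are illustrative assumptions.

```python
import math
import random


def dpd_pair_force(r_ij, v_ij, a=25.0, gamma=4.5, kT=1.0, r_c=1.0, dt=0.01,
                   rng=random):
    """Force on bead i due to bead j (hypothetical helper, not DL_MESO code).

    r_ij = r_i - r_j and v_ij = v_i - v_j are 3-tuples in reduced units.
    Parameter defaults are illustrative assumptions only.
    """
    r = math.sqrt(sum(c * c for c in r_ij))
    if r >= r_c or r == 0.0:
        return (0.0, 0.0, 0.0)           # no interaction beyond the cutoff
    e = tuple(c / r for c in r_ij)       # unit vector from j to i
    w = 1.0 - r / r_c                    # random-force weight; dissipative weight is w**2
    sigma = math.sqrt(2.0 * gamma * kT)  # fluctuation-dissipation relation
    f_c = a * w                                                   # soft conservative force
    f_d = -gamma * w * w * sum(ec * vc for ec, vc in zip(e, v_ij))  # dissipative force
    f_r = sigma * w * rng.gauss(0.0, 1.0) / math.sqrt(dt)         # random force
    f = f_c + f_d + f_r
    return tuple(f * ec for ec in e)
```

Because the conservative term stays finite as the beads overlap, the time step can be much larger than with a hard-core atomistic potential, which is the point made in the answer above.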
The need to port DL_MESO to massively parallel computing platforms arose because real systems are often made of millions of beads (each bead representing a group of molecules), and small clusters are usually not sufficient to obtain results in a reasonable time. Moreover, with the advent of hybrid architectures, updating the code is becoming an important software engineering step that allows scientists to continue their work on such systems.
How well were you able to improve the scaling performance of DL_MESO with multiple GPGPUs and, as a consequence, how large a system can you now treat?
The current multi-GPU version of DL_MESO scales with 85% efficiency up to 2048 GPUs, equivalent to about 10 petaflops of double-precision performance (see Fig. 1, reproduced from E-CAM Deliverable 7.6). This allows the simulation of very large systems, such as a phase mixture with 1.8 billion particles (Fig. 2). These results were obtained on the Piz Daint supercomputer at CSCS, a PRACE resource.
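To make the efficiency figure concrete, the sketch below uses the usual strong-scaling definition of parallel efficiency. Whether the quoted 85% refers to strong or weak scaling is not stated here, so the definition, the function names and the timings in the usage note are assumptions for illustration only.

```python
def parallel_efficiency(t_ref, n_ref, t_n, n):
    """Strong-scaling efficiency relative to a reference run (assumed definition).

    t_ref: wall time on n_ref GPUs; t_n: wall time on n GPUs, same problem size.
    """
    return (t_ref * n_ref) / (t_n * n)


def effective_gpus(efficiency, n):
    """Number of ideally scaling GPUs that n GPUs at this efficiency are worth."""
    return efficiency * n
```

For example, `effective_gpus(0.85, 2048)` expresses that 2048 GPUs at 85% efficiency deliver the throughput of roughly 1741 ideally scaling GPUs.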
What are the sorts of practical problems that motivated these developments, and what is the interest from industry (in particular IBM and Unilever)?
DPD intrinsically conserves hydrodynamic behavior, which means it reproduces fluid dynamic effects when a large number of beads is used. Massively parallel computing allows the simulation of complex phenomena such as shear banding in surfactants and ternary systems, which are present in many personal care, nutrition, and hygiene products. DL_MESO has been used intensively by IBM Research UK and Unilever, and there is a long and ongoing history of collaboration with the Hartree Centre.
Are there some examples of the power of DL_MESO to simulate continuum problems with difficult boundary conditions, etc., where standard continuum approaches fail?
Yes. One good example is polymer melt simulation. Realistic polymers are typically very large macromolecules, and modeling them in industrial manufacturing processes that involve fluid dynamic effects, such as extrusion, is a very challenging task. Traditional CFD solvers fail to describe well the complex interfaces and interactions between polymers. DPD represents the ideal approach for such systems.
What were the particular challenges in porting DL_MESO to GPUs? You started with an implementation on a single GPU and only afterwards ported it to multiple GPUs. Was that needed?
The main challenge has been to adapt the numerical algorithm implemented in the serial version to the multithreaded GPU architecture. This mainly required a reorganization of the memory layout to guarantee coalesced access and to take advantage of the extreme parallelism provided by the accelerator. The single-GPU version was developed first, optimized, and then extended to multi-GPU capability using the MPI library and a typical domain decomposition approach.
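The effect of memory layout on coalesced access can be sketched with a small address calculation. The example below compares an array-of-structures layout with a structure-of-arrays layout when 32 consecutive threads each read the x coordinate of "their" particle. The warp size of 32, the six double-precision fields per particle and the 128-byte memory segment are assumptions chosen for illustration; this is not a description of the actual DL_MESO data structures.

```python
def segments_touched(addresses, access_bytes=8, segment_bytes=128):
    """Count distinct memory segments covered by a set of reads.

    Fewer segments per warp means better coalescing (assumed simple model).
    """
    segs = set()
    for a in addresses:
        segs.add(a // segment_bytes)                       # segment of first byte
        segs.add((a + access_bytes - 1) // segment_bytes)  # segment of last byte
    return len(segs)


WARP = 32    # threads reading together (assumption)
FIELDS = 6   # x, y, z, vx, vy, vz stored per particle (assumption)
DOUBLE = 8   # bytes per double-precision value

# Array of structures: particle i's x sits at the start of a 48-byte record.
aos_addresses = [i * FIELDS * DOUBLE for i in range(WARP)]
# Structure of arrays: all x values are stored contiguously.
soa_addresses = [i * DOUBLE for i in range(WARP)]

aos_segments = segments_touched(aos_addresses)
soa_segments = segments_touched(soa_addresses)
```

Under these assumptions the strided array-of-structures reads are spread over several times more segments than the contiguous structure-of-arrays reads, which is why a layout reorganization of this kind is a common first step in a GPU port.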
We know you are adding functionalities to the GPU version of DL_MESO, such as electrostatics and bond forces. Why is that important?
Electrostatic forces are very common in real systems. They allow the simulation of complex products in which charges are distributed across the beads, creating polarization effects like those in a water molecule. However, these are long-range interactions, and special methods such as Ewald Summation and Smooth Particle Mesh Ewald are needed to fully compute their effects. They are challenging to implement numerically because of their high computational cost and the difficulties they present for parallelization.
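The idea behind Ewald-type methods can be sketched with the standard splitting of the Coulomb kernel 1/r into a short-range part, which decays quickly and can be truncated in real space, and a smooth long-range part, which is handled in reciprocal space. The snippet shows only this splitting, not a full Ewald sum; the function name and the splitting parameter `alpha` are illustrative.

```python
import math


def coulomb_split(r, alpha):
    """Split 1/r into short- and long-range parts (reduced units, illustrative).

    short = erfc(alpha * r) / r decays rapidly: summed directly with a cutoff.
    long  = erf(alpha * r) / r is smooth: summed in reciprocal space.
    """
    short = math.erfc(alpha * r) / r
    long_range = math.erf(alpha * r) / r
    return short, long_range
```

The two parts always add up to 1/r exactly. Increasing `alpha` shortens the real-space cutoff but moves more work to the reciprocal-space sum, whose global character is one source of the parallelization difficulties mentioned above.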
Where can the reader find documentation about the software developments that you have been doing in DL_MESO?
Mainly in the E-CAM modules dedicated to DL_MESO, which have been reported in Deliverables 4.4 and 7.6, and also in the E-CAM software repository.
Did your work with E-CAM on porting DL_MESO to GPUs open any doors for you?
Yes. IBM Research UK has shown interest in the multi-GPU version of the code for their studies on multiphase systems and Formeric, a spin-off company of STFC, is planning to use it as the back end of their products for mesoscale simulations.
Recently, you have also been nominated as an Nvidia Ambassador. How did that happen?
We have a regular collaboration with Nvidia, not only through the Nvidia Deep Learning Institute (DLI) for dissemination and tutorials, but also on optimization when porting software to multiple GPUs and on Deep Learning applications, mainly for industrial computer vision problems. This is how I obtained Nvidia DLI Ambassador status in October 2018. It has been a great experience and an exciting opportunity.
What would you like to do next?
The Nvidia Ambassador experience in Deep Learning opened an exciting new opportunity in the so-called Naive Science: the idea is to use neural networks to replace traditional computational science solvers. A neural network can be trained using real or simulated data and then used to predict new properties of molecules or fluid dynamic behaviour in different systems. This will speed up simulations by a couple of orders of magnitude and avoid complex modeling based on ad hoc parameters that are often difficult to determine.
The CECAM CALL for workshops and schools that will run from April 2020 to March 2021 is now open! The text for the call and information on how to submit a proposal can be found at https://www.cecam.org/submitting/.
MatrixSwitch is a module that acts as an intermediary interface layer between high-level and low-level routines dealing with matrix storage and manipulation. It allows seamless switching between different software implementations of the matrix operations.
DBCSR is an optimized library for sparse matrices, which appear frequently in many kinds of numerical simulations.
In DBCSR@MatrixSwitch, DBCSR capabilities have been added to MatrixSwitch as an optional library dependency.
Carrying out calculations in serial mode can be too slow, so a parallelisation strategy is needed. Serial and parallel MatrixSwitch employ LAPACK and ScaLAPACK, respectively, to perform matrix operations, irrespective of whether the matrices are dense or sparse. The disadvantage of the LAPACK/ScaLAPACK schemes is that they are not optimised for sparse matrices. DBCSR provides the necessary algorithms to solve this problem and, in addition, is specially suited to parallel execution.
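The advantage of a sparse format can be sketched with plain compressed sparse row (CSR) storage, in which only the nonzero entries are stored and multiplied. DBCSR uses a blocked, distributed variant of this idea; the code below is a generic illustration and does not use the DBCSR or MatrixSwitch APIs. The helper names are hypothetical.

```python
def dense_to_csr(dense):
    """Convert a list-of-lists matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row in the values array
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by a vector, touching only stored nonzeros."""
    y = []
    for i in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y.append(s)
    return y
```

For a matrix with n rows and only a few nonzeros per row, the work scales with the number of stored entries rather than with n squared, which is what dense routines cannot exploit.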
The E-CAM Scoping Workshop “Building the bridge between theories and software: SME as a boost for technology transfer in industrial simulative pipelines”, organised in May 2018 at the Fondazione Istituto Italiano di Tecnologia (IIT), Genoa, brought together top-level scientists of the E-CAM community with expertise in statistical mechanics, multi-scale modeling and electronic structure, and representatives of the pharmaceutical and materials industries, with the final objective of identifying the major gaps that still hamper a systematic exploitation of accurate computer simulations in industrial R&D. Special attention was given to the role of SMEs devoted to simulation software development, and several software vendor SMEs were present at the meeting.
It was clear from the meeting that software vendor SMEs may represent the missing link in the pipeline from theory to software. They can play an increasingly key role not only in translating the science developed in academia into a proper technology transfer process, but also in building a scientific bridge between industry requirements in terms of automation and the new theories and algorithms developed at an academic level. There was also a consensus that EU-funded Centres of Excellence for Computing Applications, such as E-CAM, can provide an opportunity to enhance the expertise and scope of software vendor SMEs.
The Grand Canonical Adaptive Resolution Scheme (GC-AdResS) provides a methodology for partitioning a simulation box into regions with different degrees of accuracy. For more details on the theory see Refs. [1,2,3].
The current implementation of GC-AdResS in GROMACS has several performance problems. The main performance loss of AdResS simulations in GROMACS lies in the neighbour list search and in the generic serial force calculation that links the atomistic (AT) and coarse-grained (CG) forces via a smooth weighting function. To remove this performance bottleneck, which also hinders an easy and general implementation in other codes, and to eliminate the non-optimized force calculation, we had to change the neighbour list search. This led to a considerable speed-up of the code. Furthermore, it decouples the method from the core of any MD code, which avoids hindering performance and makes the scheme hardware independent.
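For readers unfamiliar with the smooth weighting function mentioned above, the sketch below shows the force interpolation commonly used in AdResS: a weight that is 1 in the atomistic region, 0 in the coarse-grained region, and falls off as a squared cosine in the hybrid region in between. It is a generic illustration, not the GROMACS kernel; the function names and the one-dimensional geometry (distance `x` from the centre of the atomistic region) are assumptions.

```python
import math


def adress_weight(x, d_at, d_hy):
    """Smooth weight for a molecule at distance x from the AT region centre.

    d_at: half-width of the atomistic region; d_hy: width of the hybrid region.
    """
    if x <= d_at:
        return 1.0                      # fully atomistic
    if x >= d_at + d_hy:
        return 0.0                      # fully coarse-grained
    return math.cos(math.pi * (x - d_at) / (2.0 * d_hy)) ** 2


def interpolated_force(f_at, f_cg, w_i, w_j):
    """Blend atomistic and coarse-grained pair forces using both weights."""
    lam = w_i * w_j
    return tuple(lam * a + (1.0 - lam) * c for a, c in zip(f_at, f_cg))
```

Evaluating both force contributions and the weights for every pair in a generic kernel is the kind of overhead the text identifies as a bottleneck.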
This module presents a very straightforward way to implement a new partitioning scheme in GROMACS. It solves two problems that affect performance: the neighbour list search and the generic force kernel.
Information about the module's purpose, background, software installation and testing, together with a link to the source code, can be found in our E-CAM Software Library.
E-CAM Deliverables D4.3 and D4.4 present more modules developed in the context of this pilot project.
[1] L. Delle Site and M. Praprotnik, “Molecular Systems with Open Boundaries: Theory and Simulation,” Phys. Rep., vol. 693, pp. 1–56, 2017
[2] H. Wang, C. Schütte, and L. Delle Site, “Adaptive Resolution Simulation (AdResS): A Smooth Thermodynamic and Structural Transition from Atomistic to Coarse Grained Resolution and Vice Versa in a Grand Canonical Fashion,” J. Chem. Theory Comput., vol. 8, pp. 2878–2887, 2012
[3] H. Wang, C. Hartmann, C. Schütte, and L. Delle Site, “Grand-Canonical-Like Molecular-Dynamics Simulations by Using an Adaptive-Resolution Technique,” Phys. Rev. X, vol. 3, p. 011018, 2013
[4] C. Krekeler, A. Agarwal, C. Junghans, M. Praprotnik, and L. Delle Site, “Adaptive resolution molecular dynamics technique: Down to the essential,” J. Chem. Phys., vol. 149, p. 024104, 2018
[5] B. Duenweg, J. Castagna, S. Chiacchera, H. Kobayashi, and C. Krekeler, “D4.3: Meso– and multi–scale modelling E-CAM modules II”, March 2018. [Online]. Available: https://doi.org/10.5281/zenodo.1210075
[6] B. Duenweg, J. Castagna, S. Chiacchera, and C. Krekeler, “D4.4: Meso– and multi–scale modelling E-CAM modules III”, Jan 2019. [Online]. Available: https://doi.org/10.5281/zenodo.2555012
The Innovation Radar aims to identify high-potential innovations and innovators. It is an important source of actionable intelligence on innovations emerging from research and innovation projects funded through European Union programmes.
E-CAM is associated with the following Innovations (Innovation topic: excellence science):
Improved Simulation Software Packages for Molecular Dynamics
Improved software modules for Meso– and multi–scale modelling
Related to the work of our E-CAM funded Postdoctoral researchers supervised by scientists in the team, working on:
A procedure for the construction of a particle and energy reservoir for the simulation of open molecular systems is presented. The reservoir is made of non‐interacting particles (tracers), embedded in a mean‐field. The tracer molecules acquire atomistic resolution upon entering the atomistic region, while atomistic molecules become tracers after crossing the atomistic boundary.
The simulation of open molecular systems requires explicit or implicit reservoirs of energy and particles. Whereas full atomistic resolution is desired in the region of interest, there is some freedom in the implementation of the reservoirs. Here, a combined, explicit reservoir is constructed by interfacing the atomistic region with regions of point-like, non-interacting particles (tracers) embedded in a thermodynamic mean field. The tracer molecules acquire atomistic resolution upon entering the atomistic region and equilibrate with this environment, while atomistic molecules become tracers governed by an effective mean-field potential after crossing the atomistic boundary. The approach is extensively tested on thermodynamic, structural, and dynamic properties of liquid water. Conceptual and numerical advantages of the procedure as well as new perspectives are highlighted and discussed.
In the context of the EU H2020 project E-CAM we are seeking a highly qualified post-doctoral researcher for an exciting collaborative project on the fundamental challenges of driven transport in complex media.
Increasingly, modern technology is addressing problems where fluid transport takes place in submicron sized channels, or in pores. The physical laws of transport in such channels are qualitatively different from those that determine bulk flow; they are poorly understood and, importantly, barely exploited.
The postdoctoral position will address complementary aspects related to the fundamental challenges of thermodynamic driving on systems of potential industrial interest. In this respect, the project will be developed in close contact with an industrial partner.
The project will involve both algorithmic and scientific developments. The candidate will benefit from existing in-house expertise in lattice Boltzmann methods for non-equilibrium soft materials and will contribute to its extension and use on complex materials out of equilibrium. The project will go beyond the state-of-the-art macroscopic descriptions of phoresis to capture the effects of solute and surface specificity, solute flexibility, surface wettability and heterogeneity, fluctuations and correlations.
We seek motivated researchers, with theoretical and computational expertise. Candidates should have a background in computer simulation, statistical mechanics, biophysics and/or soft condensed matter.
The project will be carried out at the University of Barcelona, under the supervision of Prof. Ignacio Pagonabarraga, for an initial period of 20 months. Candidates with an appropriate background who are interested in cutting-edge research at the interface between physics and the biological sciences are invited to apply.
We look forward to receiving a CV and one reference letter. Please send these documents, or any request for additional information, to Prof. I. Pagonabarraga by email at email@example.com. Review of applications will continue until the position is filled.