E-CAM researchers working at the Hartree Centre – Daresbury Laboratory have co-designed the DL_MESO Mesoscale Simulation package to run on multiple GPUs, and ran for the first time a Dissipative Particle Dynamics simulation of a very large system (1.8 billion particles) on 4096 GPUs.
Towards extreme scale dissipative particle dynamics simulations using multiple GPGPUs
J. Castagna, X. Guo, M. Seaton and A. O’Cais
Computer Physics Communications (2020) 107159
DOI: 10.1016/j.cpc.2020.107159 (open access)
A multi-GPGPU development for Mesoscale Simulations using the Dissipative Particle Dynamics method is presented. This distributed GPU acceleration development is an extension of the DL_MESO package to MPI+CUDA in order to exploit the computational power of the latest NVIDIA cards on hybrid CPU–GPU architectures. Details about the extensively applicable algorithm implementation and memory coalescing data structures are presented. The key algorithms’ optimizations for the nearest-neighbour list searching of particle pairs for short range forces, exchange of data and overlapping between computation and communications are also given. We have carried out strong and weak scaling performance analyses with up to 4096 GPUs. A two phase mixture separation test case with 1.8 billion particles has been run on the Piz Daint supercomputer from the Swiss National Supercomputer Center. With CUDA aware MPI, proper GPU affinity, communication and computation overlap optimizations for multi-GPU version, the final optimization results demonstrated more than 94% efficiency for weak scaling and more than 80% efficiency for strong scaling. As far as we know, this is the first report in the literature of DPD simulations being run on this large number of GPUs. The remaining challenges and future work are also discussed at the end of the paper.
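For readers less familiar with the scaling figures quoted in the abstract, the sketch below shows how strong and weak scaling efficiencies are commonly defined. It is an illustration only: the function names and the timings in the usage note are hypothetical and are not taken from the paper.

```python
def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Fixed total problem size: ideal run time falls as n_base / n.

    t_base: wall time on n_base GPUs, t_n: wall time on n GPUs.
    """
    return (t_base * n_base) / (t_n * n)


def weak_scaling_efficiency(t_base, t_n):
    """Fixed problem size per GPU: ideal run time stays constant."""
    return t_base / t_n
```

With made-up timings, a fixed-size job that takes 100 s on 1 GPU and 15 s on 8 GPUs has a strong scaling efficiency of 100 / (15 × 8), about 83%. A job whose per-GPU workload is held constant and that takes 10.0 s on the baseline and 10.5 s on the larger machine has a weak scaling efficiency of about 95%.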
If you are interested in attending this event, please visit the CECAM website here.
In Discrete Element Methods (DEMs) the equations of motion of a large number of particles are numerically integrated to obtain the trajectory of each particle [1]. The collective movement of the particles very often gives the system unpredictable, complex dynamics inaccessible via any mean field approach. Such phenomenology is present, for instance, in seemingly simple systems such as the hopper/silo, where intermittent flow accompanied by random clogging occurs [2]. With the development of computing power alongside that of numerical algorithms, it has become possible to simulate such scenarios involving the trajectories of millions of spherical particles for a limited simulation time. Incorporating more complex particle shapes [3] or the influence of the interstitial medium [4] rapidly decreases the accessible range of the number of particles.
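To make the idea of integrating particle trajectories concrete, here is a minimal one-dimensional sketch: two equal spheres interacting through a purely elastic linear-spring contact, advanced with the velocity Verlet scheme. The contact model, parameter values and function names are assumptions chosen for illustration; production DEM codes use richer contact laws (damping, friction, rotation) and three dimensions.

```python
def contact_force(x1, x2, radius, k):
    """Force on particle 1 due to particle 2 (linear spring on the overlap)."""
    overlap = 2.0 * radius - abs(x2 - x1)
    if overlap <= 0.0:
        return 0.0  # spheres are not touching
    direction = 1.0 if x1 > x2 else -1.0  # push particle 1 away from particle 2
    return k * overlap * direction


def simulate(x, v, m, radius, k, dt, steps):
    """Advance two particles of mass m with velocity Verlet; returns (x, v)."""
    x, v = list(x), list(v)
    f1 = contact_force(x[0], x[1], radius, k)
    f = [f1, -f1]  # Newton's third law
    for _ in range(steps):
        for i in range(2):
            v[i] += 0.5 * dt * f[i] / m  # first half kick
            x[i] += dt * v[i]            # drift
        f1 = contact_force(x[0], x[1], radius, k)
        f = [f1, -f1]
        for i in range(2):
            v[i] += 0.5 * dt * f[i] / m  # second half kick
    return x, v
```

For a head-on collision, for example `simulate([0.0, 3.0], [1.0, -1.0], m=1.0, radius=1.0, k=1000.0, dt=1e-3, steps=2000)`, the two spheres approach, overlap briefly and rebound with their velocities approximately exchanged, while total momentum is conserved.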
Another class of computer simulations that is hugely popular in the science and engineering community is Computational Fluid Dynamics (CFD). A tractable method for performing such simulations is the family of Lattice Boltzmann Methods (LBMs) [5]. There, instead of directly solving the strongly non-linear Navier-Stokes equations, the discrete Boltzmann equation is solved to simulate the flow of Newtonian or non-Newtonian fluids with the appropriate collision models [6,7]. The method closely resembles DEMs, as it simulates the streaming and collision processes across a limited number of intrinsic particles, which evince viscous flow applicable across the greater mass.
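The streaming and collision steps can be sketched in a few lines. The example below assumes the widely used D2Q9 lattice with a single-relaxation-time (BGK) collision model on a small periodic grid; it is a pure-Python illustration of the structure of the method, not an efficient or GPU-ready implementation, and all names in it are hypothetical.

```python
# D2Q9 lattice: nine discrete velocities and their standard weights.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations for density rho and velocity u."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def moments(f):
    """Density and velocity recovered from the populations at one site."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, C)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, C)) / rho
    return rho, ux, uy


def step(grid, tau):
    """One lattice Boltzmann update: BGK collision, then periodic streaming."""
    nx, ny = len(grid), len(grid[0])
    new = [[[0.0] * 9 for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            f = grid[x][y]
            feq = equilibrium(*moments(f))
            for i, (cx, cy) in enumerate(C):
                relaxed = f[i] - (f[i] - feq[i]) / tau  # collision
                new[(x + cx) % nx][(y + cy) % ny][i] = relaxed  # streaming
    return new


def total_mass(grid):
    return sum(sum(f) for column in grid for f in column)


def make_grid(nx, ny, bump=1.2):
    """Fluid at rest with one denser site in the middle of the domain."""
    grid = [[equilibrium(1.0, 0.0, 0.0) for _ in range(ny)] for _ in range(nx)]
    grid[nx // 2][ny // 2] = equilibrium(bump, 0.0, 0.0)
    return grid
```

Every site performs the same small amount of arithmetic and only exchanges data with its nearest neighbours, which is why the method maps so naturally onto massively parallel hardware.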
As both of the methods have gained popularity in solving engineering problems, and scientists have become more aware of finite size effects, the size and time requirements to simulate practically relevant systems using these methods have escaped beyond the capabilities of even the most modern CPUs [8,9]. Massive parallelization is thus becoming a necessity. This is naturally offered by graphics processing units (GPUs) making them an attractive alternative for running these simulations, which consist of a large number of relatively simple mathematical operations readily implemented in a GPU [8,9].
[1] P. A. Cundall and O. D. L. Strack, Geotechnique 29, 47-65 (1979).
[2] H. G. Sheldon and D. J. Durian, Granular Matter 6, 579-585 (2010).
[3] A. Khazeni and Z. Mansourpour, Powder Tech. 332, 265-278 (2018).
[4] J. Koivisto, M. Korhonen, M. J. Alava, C. P. Ortiz, D. J. Durian and A. Puisto, Soft Matter 13, 7657-7664 (2017).
[5] S. Succi, The lattice Boltzmann equation: for fluid dynamics and beyond. Oxford University Press (2001).
[6] L. S. Luo, W. Liao, X. Chen, Y. Peng and W. Zhang, Phys. Rev. E 83, 056710 (2011).
[7] S. Gabbanelli, G. Drazer and J. Koplik, Phys. Rev. E 72, 046312 (2005).
[8] N. Govender, R. K. Rajamani, S. Kok and D. N. Wilke, Minerals Engin. 79, 152-168 (2015).
[9] P. R. Rinaldi, E. A. Dari, M. J. Vénere and A. Clausse, Simulation Modelling Practice and Theory 25, 163-171 (2012).
from smooth coupling
to a direct interface (abrupt)
Dr. Christian Krekeler, Freie Universität Berlin
GC-AdResS is a technique that speeds up computations without loss of accuracy for key system properties by dividing the simulation box into two or more regions with different levels of resolution: for instance, a high resolution region where the molecules of the system are treated at an atomistic level of detail, other regions where molecules are treated at a coarse grained level, and transition regions where a weighted average of the two resolutions is used. The goal of the E-CAM GC-AdResS pilot project was to eliminate the need for a transition region so as to significantly improve performance and to allow much greater flexibility. For example, the low resolution region can be a particle reservoir (ranging in detail from coarse grained to ideal gas particles) coupled directly to a high resolution atomistic region, without the transition region that was needed hitherto. The only requirement is that the two regions can exchange particles, and that a corresponding “thermodynamic” force is computed self-consistently, which turns out to be very simple to implement.
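The difference between smooth and abrupt coupling can be illustrated with the resolution weight that AdResS-type schemes assign to each molecule. The sketch below assumes the commonly used force interpolation, in which the pair force is the atomistic force scaled by the product of the two weights plus the coarse grained force scaled by the remainder, and a cosine-squared weight in the transition region. The abrupt variant is shown here simply as the zero-width limit of that weight. Function names and region sizes are hypothetical, the thermodynamic force is not included, and this is not the E-CAM implementation.

```python
import math


def smooth_weight(x, x_at, d_hy):
    """Weight for a molecule at distance |x| from the centre of the
    atomistic region: 1 inside it (|x| <= x_at), 0 in the coarse grained
    region, and a cos^2 ramp across a transition region of width d_hy."""
    d = abs(x)
    if d <= x_at:
        return 1.0
    if d >= x_at + d_hy:
        return 0.0
    return math.cos(math.pi * (d - x_at) / (2.0 * d_hy)) ** 2


def abrupt_weight(x, x_at):
    """Step-function weight: no transition region at all."""
    return 1.0 if abs(x) <= x_at else 0.0


def coupled_force(w_a, w_b, f_at, f_cg):
    """Pair force as a weighted average of the two resolutions."""
    lam = w_a * w_b
    return lam * f_at + (1.0 - lam) * f_cg
```

With the smooth weight, a pair of molecules inside the transition region feels a mixture of both force fields, so both must be evaluated there. With the step function, every pair is either fully atomistic or fully coarse grained.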
In the margins of a recent multiscale simulation workshop, a discussion began between a prominent pharmaceutical industry scientist and representatives of E-CAM and EMMC regarding the unfolding Fourth Industrial Revolution and the role of particle based simulation and statistical methods in it. The impact of simulation is predicted to become very significant. This discussion is intended to raise awareness among the general public of how Industry 4.0 is getting under way in companies, and how academic research will support that transformation.
Authors: Prof. Pietro Asinari (EMMC and Politecnico di Torino, denoted below as PA), Dr. Donal MacKernan (E-CAM and University College Dublin, denoted below as DM), and a prominent pharmaceutical industry scientist (name withheld at the author’s request as the view expressed is a personal one, denoted below as IS)
In this module the main framework of a multi-GPU version of the DL_MESO_DPD code has been developed. The exchange of data between GPUs overlaps with the computation of the forces for the internal cells of each partition (a domain decomposition approach based on the MPI parallel version of DL_MESO_DPD has been followed). The current implementation is a proof of concept and relies on slow transfers of data from the GPU to the host and vice-versa. Faster implementations will be explored in future modules.
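The overlap described above rests on the observation that only the outermost layer of cells in a partition needs data from neighbouring partitions. The following sketch illustrates that idea under simplifying assumptions: a halo one cell wide, and a Python worker thread standing in for the asynchronous GPU transfers. The names `split_cells` and `overlapped_step` are hypothetical and do not come from the DL_MESO_DPD source.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def split_cells(nx, ny, nz):
    """Separate the cells of one partition into interior cells and boundary
    cells, i.e. those in the outermost layer that need halo data."""
    interior, boundary = [], []
    for i, j, k in product(range(nx), range(ny), range(nz)):
        on_edge = i in (0, nx - 1) or j in (0, ny - 1) or k in (0, nz - 1)
        (boundary if on_edge else interior).append((i, j, k))
    return interior, boundary


def overlapped_step(exchange_halo, compute_forces, interior, boundary):
    """Start the halo exchange, compute interior forces while it is in
    flight, then finish with the boundary cells once the halo has arrived."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(exchange_halo)           # communication starts
        f_interior = compute_forces(interior, None)    # overlaps with it
        halo = pending.result()                        # wait for neighbours
    f_boundary = compute_forces(boundary, halo)
    return f_interior, f_boundary
```

For a 4 × 4 × 4 partition this yields 8 interior and 56 boundary cells, which shows why the benefit of overlapping grows with the size of each partition: the larger the interior, the more computation is available to hide the transfer time.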
Future plans include benchmarking the code with data transfer implementations other than the current (trivial) GPU-host-GPU transfer mechanism. These are: Peer-to-Peer communication within a node, CUDA-aware MPI, and CUDA-aware MPI with Remote Direct Memory Access (RDMA).
Practical application and exploitation of the code
Dissipative Particle Dynamics (DPD) is routinely used in an industrial context to find out the static and dynamic behaviour of soft-matter systems. Examples include colloidal dispersions, emulsions and other amphiphilic systems, polymer solutions, etc. Such materials are being produced or processed in industries like cosmetics, food, pharmaceutics, biomedicine, etc. Porting the method to GPUs is thus inherently useful in order to provide cheaper calculations.
See more information in the industry success story recently reported by E-CAM.
Software documentation and link to the source code can be found in our E-CAM software Library here.
We would like to draw your attention to a school cum workshop on
CHALLENGES IN MULTIPHASE FLOWS
that will run on Dec 9-12, 2019, at the Monash University Prato Center,
see http://monash.it/, in Tuscany. The event is an E-CAM state-of-the-art
workshop, and its aim is to focus on computer
simulation methods for multiphase systems and their dynamics, and
their strengths and shortcomings. This is a topic that is relevant in
physics, mathematics, chemistry, and engineering, and we are trying to
bring these communities together for a fruitful exchange. At the same
time, a set of advanced lectures at the school is intended to provide
a solid foundation of background knowledge. For more information (in
particular, the list of Invited Speakers), see the
Registration is now open. Regular participants need to pay a fee of
500 Australian Dollars (roughly 300 Euros) for meals etc.; however the
first 25 students (with proven status) who register may attend for free.
DEADLINE for registration and abstract submission is September 22.
Please do not hesitate to contact the organisers (contact information on the main website for the event) if you feel you need more information beyond what is provided on the web.
Burkhard Duenweg, Mainz
Ravi Prakash Jagadeeshan, Melbourne
Ignacio Pagonabarraga, Lausanne
Here we present two featured software modules of the month that provide extensions to the ParaDiS discrete dislocation dynamics (DDD) code (LLNL, http://paradis.stanford.edu/), in which dislocation/precipitate interactions are included. Module 2 was built to run the code in an HPC environment, by optimizing the original code for the Cray XC40 cluster at CSC in Finland. The software was developed by E-CAM partners at CSC and Aalto University (Finland).
Practical application and exploitation of the codes
The ParaDiS code is a free large scale discrete dislocation dynamics (DDD) simulation code for studying the fundamental mechanisms of plasticity. However, DDD simulations do not always account for impurities interacting with the dislocations and their motion. The consequences of the impurities are multiple: the yield stress is changed, and in general the plastic deformation process is greatly affected. Simulating these effects with DDD makes it possible to address a large number of issues, from materials design to controlling the yield stress. It may be done in a multiscale manner, either by computing the dislocation-precipitate interactions from microscopic simulations or by coarse-graining the DDD results for the stress-strain curves on the mesoscopic scale into more macroscopic Finite Element Method models.
Modules 1 and 2 therefore extend the ParaDiS code by including dislocation/precipitate interactions. They also make it possible to run the code in HPC environments.
Software documentation and link to the source code can be found in our E-CAM software Library here.
E-CAM partners at Aalto University (CECAM Finnish Node), in collaboration with the HPC training experts from the CSC Supercomputing Centre, are organizing a joint Extended Software Development Workshop from 15-19 October 2019, aimed at people interested in particle based methods, such as the Discrete Element and Lattice Boltzmann Methods, and in their massive parallelization using GPU architectures. The workshop will mix three different ingredients: (1) a workshop on state-of-the-art challenges in computational science and software, (2) a CSC-run school, and (3) coding sessions with the aid of CSC facilities and expertise.
How to Apply
Follow the instructions on the CECAM website for the event: https://www.cecam.org/workshop1752/
- Mikko Alava
Aalto University, Finland
- Brian Tighe
TU Delft, The Netherlands
- Jan Astrom
CSC IT Center for Science, Finland
- Antti Puisto
Aalto University, Finland
CECAM-FI Node, Aalto University, Finland
October 15 – 19, 2019
Dr. Jony Castagna, Science and Technology Facilities Council, United Kingdom
Jony Castagna recounts his transition from industry scientist to research software developer at the STFC, his E-CAM rewrite of DL_MESO allowing the simulation of billion atom systems on thousands of GPGPUs, and his latest role as NVIDIA ambassador focused on machine learning.