December Module of the Month: Load balancing for multi-GPU DL_MESO

 

Description

This module concerns the implementation of the E-CAM Load Balancing Library (ALL) in the multi-GPU version of the DL_MESO_DPD code. The aim is to improve performance when modelling complex systems with DL_MESO_DPD, such as large proteins or lipid bilayers, by redistributing the workload across the GPUs.

ALL provides several schemes to find the ideal split of the workload: the Tensor-Product method, the Staggered Grid method, the Unstructured Mesh method, the Voronoi Mesh method and the Histogram-based Staggered Grid method. Because DL_MESO uses an orthogonal domain decomposition, the Tensor-Product scheme was chosen, as it works well for non-staggered orthogonal meshes.
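To make the idea concrete, the following is a minimal Python sketch of equalising the work along one axis of an orthogonal decomposition, which is the kind of step a tensor-product scheme applies to each dimension in turn. It is not the ALL API: the function names `load_fractions` and `balanced_boundaries` are hypothetical, the work is assumed to be proportional to the particle count, and the boundaries are placed in a single pass rather than being shifted gradually over many time steps.

```python
def load_fractions(positions, boundaries):
    """Fraction of all particles inside each slab [b[i], b[i+1])."""
    counts = [0] * (len(boundaries) - 1)
    for x in positions:
        for i in range(len(counts)):
            if boundaries[i] <= x < boundaries[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def balanced_boundaries(positions, n_domains, lower, upper):
    """Place the inner boundaries so that every slab holds the same count.

    Assumption: work is proportional to the number of particles and the
    coordinates are distinct enough for a midpoint to separate neighbours.
    """
    ordered = sorted(positions)
    per_domain = len(ordered) / n_domains
    inner = []
    for k in range(1, n_domains):
        idx = int(round(k * per_domain))
        # Put the boundary midway between two neighbouring particles.
        inner.append(0.5 * (ordered[idx - 1] + ordered[idx]))
    return [lower] + inner + [upper]


# Hypothetical example: 800 particles clustered in the middle of a unit box.
example_positions = [0.4 + 0.2 * i / 800 for i in range(800)]
uniform = [i / 8 for i in range(9)]
print(load_fractions(example_positions, uniform))
print(load_fractions(example_positions,
                     balanced_boundaries(example_positions, 8, 0.0, 1.0)))
```

With uniform boundaries almost all of the work falls on two of the eight slabs, while the rebalanced boundaries give every slab the same share.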

Practical application

A test case was implemented (see Figure 1 a), b) and c)) that models 32k water beads, initially arranged in a regular structure, slowly agglomerating into a single large drop confined between two parallel surfaces. The system is divided across 8 GPUs and, for the purposes of visualisation, is restricted to 32k particles. With a larger number of particles it would not be possible to simulate the system without load balancing: the particles agglomerate on a subset of the available GPUs, and one or more of those GPUs would run out of memory. Moreover, such a strong load imbalance drastically reduces the scalability of the application.

Figure 1 d) shows the time history of the load on each GPU when using the ALL library. Without load balancing, the loads would gradually diverge from the ideal value of 12.5% per GPU (one eighth of the total work on each of the 8 GPUs). A video showing the evolution of the load balancing for this system is available in another software module.

Figure 1: Load imbalance in DL_MESO with the ALL library for a water drop between two surfaces. Each colour represents a different domain assigned to a different GPU: a) top view, b) perspective view, c) front view, d) load balance vs time.

Source code

Further details on the implementation of the ALL library in DL_MESO and the source code can be found in the E-CAM software repository here.


November Module of the Month: PerGauss, Periodic Boundary Conditions for Gaussian bases

 

Description

The module PerGauss (Periodic Gaussians) consists of an implementation of periodic boundary conditions for Gaussian bases for the Quantics program package.

In quantum dynamics, the choice of coordinates is crucial to obtain meaningful results. While xyz or normal mode coordinates are linear and do not need a periodic treatment, particular angles, such as dihedrals, must be included to describe accurately the (photo-)chemistry of the system under consideration. In these cases, periodicity can be taken into account, since the values of the wave function and the Hamiltonian repeat after certain intervals.

Practical application

The module is expected to provide the quantum dynamics community with a more efficient way of treating large systems whose excited-state driving forces involve periodic coordinates. When used on precomputed potentials (in G-MCTDH and vMCG), the model can improve convergence, since smaller grid sizes are needed. Used on-the-fly, it considerably reduces the number of electronic structure computations needed compared to Cartesian coordinates, since conformations that seem far apart in the spanned space may be closer after applying a periodic transformation.
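The last point can be illustrated with a small Python sketch. It is not taken from PerGauss or Quantics, and the helper names `wrap_angle` and `periodic_distance` are hypothetical; it only shows, under the assumption of a coordinate with period 2π, how two dihedral values that look far apart become neighbours once periodicity is taken into account.

```python
import math


def wrap_angle(phi, period=2.0 * math.pi):
    """Map an angle into the interval [-period/2, period/2)."""
    return (phi + 0.5 * period) % period - 0.5 * period


def periodic_distance(phi_a, phi_b, period=2.0 * math.pi):
    """Shortest separation between two values of a periodic coordinate."""
    return abs(wrap_angle(phi_a - phi_b, period))


# Two dihedral angles close to the +180/-180 degree seam.
a = math.radians(179.0)
b = math.radians(-179.0)
print("naive separation (deg):   ", math.degrees(abs(a - b)))
print("periodic separation (deg):", math.degrees(periodic_distance(a, b)))
```

The naive separation is 358 degrees, whereas the periodic separation is only 2 degrees, so a previously computed point at one of the two conformations can be reused near the other.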

Source code

Currently PerGauss resides within the Quantics software package, which is available upon request through GitLab. For more information see the PerGauss documentation here.


Dask-traj

 

For the analysis of molecular dynamics (MD) simulations, MDTraj is a fast and commonly used analysis library. However, MDTraj has some restrictions: (1) the whole trajectory needs to fit into memory, or gathering results becomes inconvenient; (2) the result of the computation also needs to fit into memory; and (3) all processes need access to all the memory, preventing out-of-machine parallelisation and HPC scaling.

Dask-traj lifts these restrictions by rewriting the MDTraj functions to work with Dask in order to achieve out-of-memory computations. Combined with dask-distributed, this allows for out-of-machine parallelisation, which is essential on HPC systems, and results in a (surprising) speed-up even on a single machine.
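The principle behind out-of-memory analysis can be sketched in plain Python. The example below does not use the Dask-traj, Dask or MDTraj APIs; the functions `iter_chunks`, `mean_distance_chunked` and `synthetic_frames` are hypothetical stand-ins. It assumes that frames arrive as a stream and that the quantity of interest can be reduced chunk by chunk, so neither the trajectory nor the per-frame results ever need to be held in memory at once. Dask additionally builds a task graph over such chunks so that they can be processed in parallel, which this sequential sketch does not attempt.

```python
import math


def iter_chunks(frames, chunk_size):
    """Group a stream of frames into lists of at most chunk_size frames."""
    chunk = []
    for frame in frames:
        chunk.append(frame)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def mean_distance_chunked(frames, i, j, chunk_size=100):
    """Average distance between atoms i and j, reduced one chunk at a time."""
    total, count = 0.0, 0
    for chunk in iter_chunks(frames, chunk_size):
        total += sum(math.dist(frame[i], frame[j]) for frame in chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("the trajectory contains no frames")
    return total / count


def synthetic_frames(n_frames):
    """Hypothetical two-atom trajectory, generated lazily frame by frame."""
    for t in range(n_frames):
        yield [(0.0, 0.0, 0.0), (1.0 + 0.001 * t, 0.0, 0.0)]


print(mean_distance_chunked(synthetic_frames(1000), 0, 1, chunk_size=64))
```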

Source code

The source code for this module, and modules that build on it, is hosted at https://github.com/sroet/dask-traj


CLstunfti: An extendable Python toolbox to compute scattering of electrons with a given kinetic energy in liquids and amorphous solids

 

Description

CLstunfti is an extendable Python toolbox to compute scattering of electrons with a given kinetic energy in liquids and amorphous solids. It uses a continuum trajectory model with differential ionization and scattering cross sections as input to simulate the motion of the electrons through the medium.

Originally, CLstunfti was developed to simulate two experiments: A measurement of the effective attenuation length (EAL) of photoelectrons in liquid water [1] and a measurement of the photoelectron angular distribution (PAD) of photoelectrons in liquid water [2]. These simulations were performed to determine the elastic mean free path (EMFP) and the inelastic mean free path (IMFP) of liquid water [3].
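As an illustration of how the EMFP and IMFP enter a trajectory simulation, the following Python sketch follows one electron from collision to collision until its first inelastic event. It is a generic Monte Carlo transport scheme written under stated assumptions (exponentially distributed free paths, energy-independent mean free paths, no geometry or angular distributions) and is not the CLstunfti API; the function `simulate_until_inelastic` and the numerical values are hypothetical.

```python
import random


def simulate_until_inelastic(emfp, imfp, rng):
    """Return (path length, number of elastic collisions) for one electron.

    emfp and imfp are the elastic and inelastic mean free paths in the
    same (arbitrary) length unit.
    """
    # The collision rates add, so the combined mean free path is shorter.
    total_mfp = 1.0 / (1.0 / emfp + 1.0 / imfp)
    # Probability that a given collision is elastic.
    p_elastic = total_mfp / emfp
    path_length = 0.0
    n_elastic = 0
    while True:
        # Free flight to the next collision (exponential distribution).
        path_length += rng.expovariate(1.0 / total_mfp)
        if rng.random() < p_elastic:
            n_elastic += 1
        else:
            return path_length, n_elastic


rng = random.Random(2020)  # fixed seed for reproducibility
runs = [simulate_until_inelastic(1.0, 4.0, rng) for _ in range(20000)]
print("mean path length:      ", sum(p for p, _ in runs) / len(runs))
print("mean elastic collisions:", sum(n for _, n in runs) / len(runs))
```

In this simplified model the average path length before the first inelastic event approaches the IMFP, and the average number of elastic collisions approaches the ratio IMFP/EMFP.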

Practical application

The EMFP and IMFP are two central theoretical parameters of every simulation of electron scattering in liquids, but they are not directly accessible experimentally. As CLstunfti can be used to determine the EMFP and IMFP from experimental data, and as it can easily be extended to simulate other problems of particle scattering in liquids, it was decided to make the source code publicly available. For this purpose, within the E-CAM module, the necessary steps were taken to make CLstunfti a useful toolbox for other researchers by providing documentation, examples, and extensive inline documentation of the source code.

Source code

CLstunfti is available at https://gitlab.com/axelschild/CLstunfti .

 

References

[1] Suzuki, Nishizawa, Kurahashi, Suzuki, Effective attenuation length of an electron in liquid water between 10 and 600 eV, Phys. Rev. E 90, 010302 (2014)

[2] Thürmer, Seidel, Faubel, Eberhardt, Hemminger, Bradforth, Winter, Photoelectron Angular Distributions from Liquid Water: Effects of Electron Scattering, Phys. Rev. Lett. 111, 173005 (2013)

[3] Schild, Peper, Perry, Rattenbacher, Wörner, Alternative approach for the determination of mean free paths of electron scattering in liquid water based on experimental data, J. Phys. Chem. Lett., 11, 1128−1134 (2020)


Minimal distance segment to segment with Karush-Kuhn-Tucker conditions

 

Description

The module minDist2segments_KKT returns the minimal distance between two line segments. It uses the Karush-Kuhn-Tucker conditions (KKT) for the minimization under constraints.

Practical application

We use the present module to avoid topology violations in an entangled polymer system. To preserve the topology in a system of entangled polymers we need to determine the minimal distance between two bonds. Once done we can apply either a soft or hard core potential to avoid the crossing of two bonds. Here, we propose to determine the minimal distance between two segments with the help of the Karush-Kuhn-Tucker conditions.
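The constrained minimisation can be sketched in Python as follows. This is not the code of the minDist2segments_KKT module; it is a minimal illustration, with the hypothetical function name `min_dist_segments`, of how the KKT conditions reduce the problem to a handful of candidate points. The two segments are parametrised as P(s) = p0 + s u and Q(t) = q0 + t v with 0 ≤ s, t ≤ 1, and the squared distance is a convex quadratic in (s, t). A KKT point is therefore either the unconstrained stationary point (if it lies inside the unit square) or a one-dimensional minimum on one of the four edges, and the smallest candidate is the global minimum.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _clamp01(x):
    return max(0.0, min(1.0, x))


def _ratio(num, den, eps=1e-12):
    # Guard against zero-length segments; eps is an absolute tolerance
    # and should be adapted to the length scale of the problem.
    return num / den if den > eps else 0.0


def min_dist_segments(p0, p1, q0, q1, eps=1e-12):
    """Minimal distance between segments [p0, p1] and [q0, q1]."""
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(q0, q1)]
    w = [a - b for a, b in zip(p0, q0)]
    a, b, c = _dot(u, u), _dot(u, v), _dot(v, v)
    d, e = _dot(u, w), _dot(v, w)

    candidates = []
    # Case 1: no active constraint, solve grad f = 0 for (s, t).
    det = a * c - b * b
    if det > eps:
        s = (b * e - c * d) / det
        t = (a * e - b * d) / det
        if 0.0 <= s <= 1.0 and 0.0 <= t <= 1.0:
            candidates.append((s, t))
    # Cases 2-5: one constraint active (s = 0, s = 1, t = 0 or t = 1);
    # the remaining variable is minimised on [0, 1] by clamping.
    candidates.append((0.0, _clamp01(_ratio(e, c, eps))))
    candidates.append((1.0, _clamp01(_ratio(e + b, c, eps))))
    candidates.append((_clamp01(_ratio(-d, a, eps)), 0.0))
    candidates.append((_clamp01(_ratio(b - d, a, eps)), 1.0))

    def dist(s, t):
        return math.sqrt(sum((wi + s * ui - t * vi) ** 2
                             for wi, ui, vi in zip(w, u, v)))

    return min(dist(s, t) for s, t in candidates)


# Two bonds that cross when projected on the xy-plane, one unit apart in z.
print(min_dist_segments((0, 0, 0), (1, 0, 0), (0.5, -1, 1), (0.5, 1, 1)))
```

Clamping on an edge also covers the corner cases in which two constraints are active at the same time, so no separate corner enumeration is needed.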

This module is part of an E-CAM pilot project at the ENS Lyon, focused on the implementation of contact joints to resolve excluded volume constraints.

Background information

A detailed derivation of the minimal distance between two segments using the Karush-Kuhn-Tucker conditions is available at https://gitlab.e-cam2020.eu:10443/carrivain/mindist2segments_kkt/-/blob/master/minDist2segments_KKT.pdf

This module is used by other ongoing work, such as the module velocities_resolve_EV, which resolves the excluded volume constraint with a velocity formulation.

Source code

The source code and more information can be found at the minDist2segments_KKT GitLab repository.


QMCPack Interfaces for Electronic Structure Computations

Quantum Monte Carlo (QMC) methods are a class of ab initio, stochastic techniques for the study of quantum systems. While QMC simulations are computationally expensive, they have the advantage of being accurate, fully ab initio and scalable to a large number of cores with limited memory requirements.

These features make QMC methods a valuable tool to assess the accuracy of DFT computations, which are widely used in the fields of condensed matter physics, quantum chemistry and materials science.

QMCPack is a free package for QMC simulations of electronic structure developed in several national labs in the US. The package is written in object-oriented C++, offers great flexibility in the choice of systems, trial wave functions and QMC methods, and supports massive parallelism and the use of GPUs.

Trial wave functions for electronic QMC computations commonly require the use of single-electron orbitals, typically computed by DFT. The aim of the E-CAM pilot project described here is to build interfaces between QMCPack and other software packages for electronic structure computations, e.g. the DFT code Quantum Espresso.

These interfaces are used to manage the reading of orbitals or their generation by DFT within QMCPack, in order to establish an automated, black-box workflow for QMC computations. QMC simulations can, for example, be used to benchmark and validate DFT calculations: such a procedure can be employed in the study of several physical systems of interest in condensed matter physics, chemistry or materials science, with applications in industry, e.g. in the study of metal-ion or water-carbon interfaces.

The following modules have been built as part of this pilot project:

  • QMCQEPack, which provides the files to download and properly patch Quantum Espresso 5.3 to build the libpwinterface.so library; this library is required to use the module ESPWSCFInterface to generate single-particle orbitals during a QMCPack computation using Quantum Espresso.
  • ESInterfaceBase, which provides a base class for a general interface to generate single-particle orbitals to be used in QMC simulations in QMCPack; implementations of specific interfaces as derived classes of ESInterfaceBase are available as separate modules.

The documentation on interfaces in QMCPack can be found in the QMCPack user manual at https://github.com/michruggeri/qmcpack/blob/f88a419ad1a24c68b2fdc345ad141e05ed0ab178/manual/interfaces.tex


PANNA: Properties from Artificial Neural Network Architectures

PANNA is a package for training and validating neural networks to represent atomic potentials. It implements configurable all-to-all connected deep neural network architectures which allow for the exploration of training dynamics. Currently it includes tools to enable original[1] and modified[2] Behler-Parrinello input feature vectors, both for molecules and crystals, but the network can also be used in an input-agnostic fashion to enable further experimentation. PANNA is written in Python and relies on TensorFlow as its underlying engine.

A common way to use PANNA in its current implementation is to train a neural network to estimate the total energy of a molecule or crystal as a sum of atomic contributions, by learning from reference total energy calculations (usually ab initio) for similar structures.

Neural network models in the literature often start from a description of the system of interest in terms of local feature vectors for each atom in the configuration. PANNA provides tools to calculate two versions of the Behler-Parrinello local descriptors, but it allows the use of any species-resolved, fixed-size array that describes the input data.

PANNA allows the construction of neural network architectures with different sizes for each of the atomic species in the training set. Currently the allowed architecture is a deep neural network of fully connected layers, starting from the input feature vector and going through one or more hidden layers. The user can choose to train or freeze any layer, and can also transfer network parameters between species upon restart.

In summary, PANNA is an easy-to-use interface for obtaining neural network models for atomistic potentials, leveraging the highly optimized TensorFlow infrastructure to provide efficient, parallelized, GPU-accelerated training.
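The structure described above, one fully connected network per atomic species whose outputs are summed to give the total energy, can be sketched in a few lines of plain Python. This is not PANNA or TensorFlow code: the functions `dense`, `atomic_energy` and `total_energy` are hypothetical, the weights are arbitrary untrained numbers, and the two-component feature vectors merely stand in for real descriptors.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + bias)
            for row, bias in zip(weights, biases)]


def atomic_energy(features, layers):
    """Feed a feature vector through hidden layers and a linear output."""
    hidden = features
    for weights, biases in layers[:-1]:
        hidden = dense(hidden, weights, biases, math.tanh)
    weights, biases = layers[-1]
    return dense(hidden, weights, biases, lambda z: z)[0]


def total_energy(atoms, networks):
    """Sum of atomic contributions, each from its species' own network."""
    return sum(atomic_energy(features, networks[species])
               for species, features in atoms)


# Hypothetical networks: different hidden-layer sizes for H and O.
networks = {
    "H": [([[0.3, -0.1], [0.2, 0.4]], [0.0, 0.1]),       # 2 -> 2
          ([[0.5, -0.7]], [-0.5])],                       # 2 -> 1
    "O": [([[0.1, 0.2], [-0.3, 0.5], [0.4, -0.2]], [0.0, 0.0, 0.1]),  # 2 -> 3
          ([[0.6, 0.1, -0.4]], [-2.0])],                  # 3 -> 1
}
water = [("O", [0.8, 0.2]), ("H", [0.1, 0.5]), ("H", [0.1, 0.4])]
print(total_energy(water, networks))
```

Because each species has its own entry in `networks`, freezing a layer or copying parameters from one species to another amounts to leaving or reusing the corresponding (weights, biases) pair.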

It provides:

  • an input creation tool (atomistic calculation result -> G-vector )
  • an input packaging tool for quick processing by TensorFlow (G-vector -> TFData bundle)
  • a network training tool
  • a network validation tool
  • a LAMMPS plugin
  • a bundle of sample data for testing[3]

See the full documentation of PANNA at https://gitlab.com/PANNAdevs/panna/blob/master/doc/PANNA_documentation.md

GitLab repository for PANNA: https://gitlab.com/PANNAdevs/panna

See manuscript at https://arxiv.org/abs/1907.03055

References

[1] J. Behler and M. Parrinello, “Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces”, Phys. Rev. Lett. 98, 146401 (2007)

[2] Justin S. Smith, Olexandr Isayev, Adrian E. Roitberg, “ANI-1: An extensible neural network potential with DFT accuracy at force field computational cost”, Chemical Science (2017), DOI: 10.1039/C6SC05720A

[3] Justin S. Smith, Olexandr Isayev, Adrian E. Roitberg, “ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules”, Scientific Data, 4 (2017), Article number: 170193, DOI: 10.1038/sdata.2017.193


pyscal: A python module for structural analysis of atomic environments

Description

pyscal is a python module for the calculation of local atomic structural environments including Steinhardt’s bond orientational order parameters[1] during post-processing of atomistic simulation data. The core functionality of pyscal is written in C++ with python wrappers using pybind11 which allows for fast calculations and easy extensions in python.

Practical Applications

Steinhardt’s order parameters are widely used for the identification of crystal structures [3]. They are also used to distinguish if an atom is in a solid or liquid environment [4]. pyscal is inspired by the BondOrderAnalysis code, but has since incorporated many additional features and modifications. The pyscal module includes the following functionalities:

  • calculation of Steinhardt’s order parameters and their averaged version [2].
  • links with the Voro++ code, for the calculation of Steinhardt parameters weighted using the face areas of Voronoi polyhedra [3].
  • classification of atoms as solid or liquid [4].
  • clustering of particles based on a user defined property.
  • methods for calculating radial distribution functions, Voronoi volumes of particles, number of vertices and face area of Voronoi polyhedra, and coordination numbers.
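As a small worked example of the first item, the Steinhardt parameter q_l of a single atom can be computed from its neighbour bond vectors alone. The Python sketch below is not the pyscal API (the names `legendre` and `steinhardt_q` are hypothetical). Instead of evaluating spherical harmonics it uses the addition theorem, which turns the usual definition into q_l² = (1/N²) Σ_ij P_l(cos γ_ij), where γ_ij is the angle between bonds i and j and P_l is a Legendre polynomial. It assumes that the neighbour list has already been determined and applies no weighting or averaging.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x) from Bonnet's recurrence relation."""
    if l == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    for n in range(1, l):
        p_prev, p_curr = p_curr, ((2 * n + 1) * x * p_curr - n * p_prev) / (n + 1)
    return p_curr


def steinhardt_q(l, bonds):
    """q_l for one atom, given the vectors to its neighbours."""
    units = []
    for bond in bonds:
        norm = math.sqrt(sum(c * c for c in bond))
        units.append([c / norm for c in bond])
    total = 0.0
    for bi in units:
        for bj in units:
            cos_gamma = sum(x * y for x, y in zip(bi, bj))
            # Clip rounding errors so the argument stays inside [-1, 1].
            total += legendre(l, max(-1.0, min(1.0, cos_gamma)))
    return math.sqrt(max(total, 0.0)) / len(units)


# Six nearest neighbours of an atom in a simple cubic lattice.
simple_cubic = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                (0, -1, 0), (0, 0, 1), (0, 0, -1)]
print("q4 =", steinhardt_q(4, simple_cubic))
print("q6 =", steinhardt_q(6, simple_cubic))
```

For the simple cubic environment this gives q4 ≈ 0.764 and q6 ≈ 0.354; other crystal structures yield other characteristic pairs of values, which is what makes these parameters useful for structure identification.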

Background information

See the application documentation for full details. A paper about pyscal is also available in Ref. [5].

The utilisation of Dask within the project came about as a result of the E-CAM High Throughput Computing ESDW held in Turin in 2018 and 2019.

The software module was developed by Sarath Menon, Grisell Díaz Leines and Jutta Rogal, and is under a GNU General Public License v3.0.

References

[1] Steinhardt, P. J., Nelson, D. R., & Ronchetti, M. (1983). Physical Review B, 28.

[2] Lechner, W., & Dellago, C. (2008). The Journal of Chemical Physics, 129.

[3] Mickel, W., Kapfer, S. C., Schröder-Turk, G. E., & Mecke, K. (2013). The Journal of Chemical Physics, 138.

[4] Auer, S., & Frenkel, D. (2005). Advances in Polymer Science, 173.

[5] Menon, S., Díaz Leines, G., & Rogal, J. (2019). pyscal: A python module for structural analysis of atomic environments. Journal of Open Source Software, 4(43), 1824


Multi-GPU version of DL_MESO_DPD

This module implements the first version of the DL_MESO_DPD Mesoscale Simulation Package for multiple NVIDIA Graphics Processing Units (GPUs).

In this module the main framework of a multi-GPU version of the DL_MESO_DPD code has been developed. The exchange of data between GPUs overlaps with the computation of the forces for the internal cells of each partition (a domain decomposition approach based on the MPI parallel version of DL_MESO_DPD has been followed). The current implementation is a proof of concept and relies on slow transfers of data from the GPU to the host and vice versa. Faster implementations will be explored in future modules.
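The overlap of communication and computation can be pictured with a host-side analogy in Python. This is not the DL_MESO_DPD implementation, which relies on CUDA and MPI; the functions `split_cells` and `run_step`, and the callbacks passed to them, are hypothetical. The sketch assumes that forces in internal cells depend only on local data, so they can be computed while the halo exchange is still in flight, and that only the cells on the faces of a partition have to wait for the neighbours' data.

```python
from concurrent.futures import ThreadPoolExecutor


def split_cells(nx, ny, nz):
    """Separate the cells of one partition into internal and boundary cells."""
    interior, boundary = [], []
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                on_face = (i in (0, nx - 1) or j in (0, ny - 1)
                           or k in (0, nz - 1))
                (boundary if on_face else interior).append((i, j, k))
    return interior, boundary


def run_step(shape, exchange_halo, compute_forces):
    """One force evaluation with the halo exchange hidden behind compute."""
    interior, boundary = split_cells(*shape)
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(exchange_halo)   # start the communication
        compute_forces(interior, None)         # needs no neighbour data
        halo = pending.result()                # wait for the exchange
    compute_forces(boundary, halo)             # now the halo is available


def demo_exchange():
    return "halo data"                         # placeholder for a transfer


def demo_forces(cells, halo):
    print(len(cells), "cells, halo:", halo)


run_step((4, 4, 4), demo_exchange, demo_forces)
```

The larger the partition, the larger the share of internal cells, and the more of the transfer time can be hidden in this way.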

Future plans include benchmarking the code with data transfer implementations other than the current (trivial) GPU-host-GPU mechanism. These are: peer-to-peer communication within a node, CUDA-aware MPI, and CUDA-aware MPI with Remote Direct Memory Access (RDMA).

Practical application and exploitation of the code

Dissipative Particle Dynamics (DPD) is routinely used in an industrial context to find out the static and dynamic behaviour of soft-matter systems. Examples include colloidal dispersions, emulsions and other amphiphilic systems, polymer solutions, etc. Such materials are being produced or processed in industries like cosmetics, food, pharmaceutics, biomedicine, etc. Porting the method to GPUs is thus inherently useful in order to provide cheaper calculations.

See more information in the industry success story recently reported by E-CAM.

Software documentation and link to the source code can be found in our E-CAM software Library here.


Integrating LAMMPS with OpenPathSampling

This module shows how LAMMPS can be used as the Molecular Dynamics (MD) engine in OpenPathSampling (OPS), and it also provides a benchmark of the impact of the OPS overhead on the MD engine.

Practical application and exploitation of the code

OpenPathSampling uses OpenMM as its default engine for calculating the sampled trajectories. Other engines, such as GROMACS and LAMMPS, can also be used (although they are not yet available in the official release), making it possible to exploit different computer architectures, such as hybrid CPU-GPU systems, and to simulate more complex problems.

In this module we present the source code for the integration of OPS with LAMMPS, as well as a benchmark of a simple test case that shows the performance impact of the OPS overhead.

Software documentation and link to the source code can be found in our E-CAM software Library here.
