E-CAM High Throughput Computing Library

This module is the first in a sequence that will form the overall capabilities of the E-CAM High Throughput Computing (HTC) library. In particular, this module provides a set of decorators that wrap the Dask-Jobqueue Python library, with the aim of lowering the development-time cost of leveraging it for our use cases.

The initial motivation for this library is driven by the ensemble-type calculations that are required in many scientific fields, and in particular in the materials science domain in which the E-CAM Centre of Excellence operates.

One specific application for this module is the study of “rare events” in theoretical and computational chemistry, a topic of particular relevance to E-CAM. Many problems in biological chemistry, materials science, and other fields involve events that only occur spontaneously after a millisecond or longer (for example, biomolecular conformational changes, or nucleation processes). This means that around 10¹² time steps would be needed to see a single millisecond-scale event.

Modern supercomputers are beginning to make it possible to obtain trajectories long enough to observe some of these processes, but to fully characterise a transition with proper statistics, many examples are needed. To obtain many examples, the same application must be run many thousands of times with varying inputs. Managing this kind of computation requires a task-scheduling high throughput computing (HTC) library, whose main elements are task definition, task scheduling and task execution.
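
As a rough illustration of the intent (the `htc_task` decorator, the cluster settings and `run_simulation` below are hypothetical, not the module's actual API), a decorator can hide the Dask-Jobqueue and Dask distributed boilerplate behind an ordinary function call:

```python
# Hypothetical sketch of wrapping Dask-Jobqueue behind a decorator; the real E-CAM
# HTC library defines its own decorators, cluster classes and options.
from functools import wraps
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

def htc_task(jobs=2, cores=24, memory="60GB", queue="batch", walltime="01:00:00"):
    """Run the decorated function over an iterable of inputs on a SLURM-backed Dask cluster."""
    def decorator(func):
        @wraps(func)
        def wrapper(inputs):
            cluster = SLURMCluster(cores=cores, memory=memory,
                                   queue=queue, walltime=walltime)
            cluster.scale(jobs=jobs)                # request `jobs` batch jobs as workers
            with Client(cluster) as client:
                futures = client.map(func, inputs)  # task definition and scheduling
                return client.gather(futures)       # task execution and result collection
        return wrapper
    return decorator

@htc_task(jobs=10)
def run_simulation(parameters):
    # placeholder for one ensemble member, e.g. a trajectory with the given inputs
    return sum(parameters)

# results = run_simulation([(1, 2), (3, 4)])  # one task per input set
```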

While HTC workloads have traditionally been looked down upon in the HPC space, the scientific use case for extreme-scale resources exists, and algorithms that require such a coordinated approach make efficient libraries implementing it increasingly important in HPC. The 5-Petaflop booster technology of JURECA is an interesting platform in this respect, since its model of offloading heavy computation maps naturally onto the approach outlined here.

Module documentation at https://e-cam.readthedocs.io/en/latest/Classical-MD-Modules/modules/HTC/decorators/readme.html


Automated high-throughput Wannierisation, a successful collaboration between E-CAM and the MaX Centre of Excellence

Maximally-localised Wannier functions (MLWFs) are routinely used to compute from first-principles advanced materials properties that require very dense Brillouin zone (BZ) integration, and to build accurate tight-binding models for scale-bridging simulations. At the same time, high-throughput (HT) computational materials design is an emerging field that promises to accelerate the reliable and cost-effective design and optimisation of new materials with target properties. The use of MLWFs in HT workflows has been hampered by the fact that generating MLWFs automatically and robustly, without any user intervention and for arbitrary materials, is in general very challenging. We address this problem directly by proposing a procedure for automatically generating MLWFs for HT frameworks. Our approach is based on the selected columns of the density matrix method (SCDM, see SCDM Wannier Functions) and is implemented in an AiiDA workflow.

Purpose of the module

Create a fully-automated protocol based on the SCDM algorithm for the construction of MLWFs, in which the two free parameters are determined automatically (in our HT approach the dimensionality of the disentangled space is fixed by the total number of states used to generate the pseudopotentials in the DFT calculations).
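
To give a flavour of the kind of fit involved (a minimal sketch with placeholder data, not code from the AiiDA workflow; see the paper for the actual protocol), the projectability of each band onto the chosen orbitals can be fitted with a complementary error function of the band energy, and the fitted parameters can then be used to fix the SCDM parameters:

```python
# Minimal sketch with placeholder data (not the AiiDA workflow code): fit the band
# projectabilities as a function of band energy with a complementary error function.
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erfc

def projectability_model(energy, mu, sigma):
    """Smooth step: close to 1 well below mu, close to 0 well above it."""
    return 0.5 * erfc((energy - mu) / sigma)

# Placeholder data standing in for (band energy, projectability) pairs from a DFT run
energies = np.linspace(-10.0, 20.0, 200)
projectabilities = projectability_model(energies, 5.0, 2.0) + 0.02 * np.random.randn(energies.size)

(mu_fit, sigma_fit), _ = curve_fit(projectability_model, energies, projectabilities, p0=(0.0, 1.0))
print(f"fitted mu = {mu_fit:.2f} eV, sigma = {sigma_fit:.2f} eV")
```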

A paper describing the work is available at https://arxiv.org/abs/1909.00433, where this approach was applied to a dataset of 200 bulk crystalline materials that span a wide structural and chemical space.

Background information

This module is a collaboration between E-CAM and the MaX Centre of Excellence.

In the SCDM Wannier Functions module, E-CAM has implemented the SCDM algorithm in the pw2wannier90.f90 interface code between the Quantum ESPRESSO software and the Wannier90 code. This implementation was used as the basis for a complete computational workflow for obtaining MLWFs and electronic properties based on Wannier interpolation of the BZ, starting only from the specification of the initial crystal structure. The workflow was implemented within the AiiDA materials informatics platform and used to perform an HT study on a dataset of 200 materials, as described in the paper linked above.

More information at https://e-cam.readthedocs.io/en/latest/Electronic-Structure-Modules/modules/W90_MaX_collab/readme.html


QMCPack Interfaces for Electronic Structure Computations

Quantum Monte Carlo (QMC) methods are a class of ab initio, stochastic techniques for the study of quantum systems. While QMC simulations are computationally expensive, they have the advantage of being accurate, fully ab initio and scalable to a large number of cores with limited memory requirements.

These features make QMC methods a valuable tool for assessing the accuracy of DFT computations, which are widely used in the fields of condensed matter physics, quantum chemistry and materials science.

QMCPack is a free package for QMC simulations of electronic structure, developed at several national laboratories in the US. The package is written in object-oriented C++, offers great flexibility in the choice of systems, trial wave functions and QMC methods, and supports massive parallelism and the use of GPUs.

Trial wave functions for electronic QMC computations commonly require the use of single-electron orbitals, typically computed by DFT. The aim of the E-CAM pilot project described here is to build interfaces between QMCPack and other electronic structure codes, e.g. the DFT code Quantum Espresso.

These interfaces are used to manage the reading of the orbitals, or their generation via DFT, within QMCPack, in order to establish an automated, black-box workflow for QMC computations. QMC simulations can, for example, be used to benchmark and validate DFT calculations: such a procedure can be employed in the study of several physical systems of interest in condensed matter physics, chemistry or materials science, with applications in industry, e.g. in the study of metal-ion or water-carbon interfaces.

The following modules have been built as part of this pilot project:

  • QMCQEPack, which provides the files to download and properly patch Quantum Espresso 5.3 to build the libpwinterface.so library; this library is required by the ESPWSCFInterface module to generate single-particle orbitals during a QMCPack computation using Quantum Espresso.
  • ESInterfaceBase, which provides a base class for a general interface to generate single-particle orbitals to be used in QMC simulations in QMCPack; implementations of specific interfaces, as derived classes of ESInterfaceBase, are available as separate modules.

The documentation about interfaces in QMCPack can be found in the QMCPack user manual at https://github.com/michruggeri/qmcpack/blob/f88a419ad1a24c68b2fdc345ad141e05ed0ab178/manual/interfaces.tex


PANNA: Properties from Artificial Neural Network Architectures

PANNA is a package for training and validating neural networks to represent atomic potentials. It implements configurable all-to-all connected deep neural network architectures which allow for the exploration of training dynamics. Currently it includes tools to enable original[1] and modified[2] Behler-Parrinello input feature vectors, both for molecules and crystals, but the network can also be used in an input-agnostic fashion to enable further experimentation. PANNA is written in Python and relies on TensorFlow as its underlying engine.

A common way to use PANNA in its current implementation is to train a neural network to estimate the total energy of a molecule or crystal as a sum of atomic contributions, by learning from reference total energy calculations for similar structures (usually ab initio calculations).

The neural network models in the literature often start from a description of the system of interest in terms of local feature vectors for each atom in the configuration. PANNA provides tools to calculate two versions of the Behler-Parrinello local descriptors, but it allows the use of any species-resolved, fixed-size array that describes the input data.

PANNA allows the construction of neural network architectures with different sizes for each of the atomic species in the training set. Currently the allowed architecture is a deep neural network of fully connected layers, starting from the input feature vector and going through one or more hidden layers. The user can choose to train or freeze any layer, and can also transfer network parameters between species upon restart.
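
As a purely illustrative sketch (plain TensorFlow/Keras rather than PANNA's own code; the descriptor length, species list and layer widths are assumed), the per-species networks and the sum over atomic contributions can be written as:

```python
# Illustrative Behler-Parrinello-style model: one fully connected sub-network per
# species maps each atom's descriptor (G-vector) to an atomic energy, and the total
# energy is the sum of atomic contributions. Not PANNA's actual input pipeline.
import tensorflow as tf

N_FEATURES = 64                      # assumed G-vector (descriptor) length
SPECIES = ["H", "O"]                 # assumed species in the training set

def make_species_net(n_features, hidden=(32, 32)):
    """Fully connected sub-network for one species: descriptor -> atomic energy."""
    inputs = tf.keras.layers.Input(shape=(n_features,))
    x = inputs
    for width in hidden:
        x = tf.keras.layers.Dense(width, activation="tanh")(x)
    e_atom = tf.keras.layers.Dense(1)(x)
    return tf.keras.Model(inputs=inputs, outputs=e_atom)

nets = {s: make_species_net(N_FEATURES) for s in SPECIES}

def total_energy(descriptors_by_species):
    """Total energy as a sum of atomic contributions; the argument maps
    species -> (n_atoms, n_features) array of descriptors."""
    per_species = [tf.reduce_sum(nets[s](g)) for s, g in descriptors_by_species.items()]
    return tf.add_n(per_species)

# Example: a structure with two H atoms and one O atom, using random descriptors
example = {"H": tf.random.normal((2, N_FEATURES)), "O": tf.random.normal((1, N_FEATURES))}
print(float(total_energy(example)))
```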

In summary, PANNA is an easy-to-use interface for obtaining neural network models for atomistic potentials, leveraging the highly optimised TensorFlow infrastructure to provide efficient, parallelised, GPU-accelerated training.

It provides:

  • an input creation tool (atomistic calculation result -> G-vector)
  • an input packaging tool for quick processing by TensorFlow (G-vector -> TFData bundle)
  • a network training tool
  • a network validation tool
  • a LAMMPS plugin
  • a bundle of sample data for testing[3]

See the full documentation of PANNA at https://gitlab.com/PANNAdevs/panna/blob/master/doc/PANNA_documentation.md

GitLab repository for PANNA: https://gitlab.com/PANNAdevs/panna

See manuscript at https://arxiv.org/abs/1907.03055

References

[1] J. Behler and M. Parrinello, “Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces”, Phys. Rev. Lett. 98, 146401 (2007)

[2] J. S. Smith, O. Isayev and A. E. Roitberg, “ANI-1: An extensible neural network potential with DFT accuracy at force field computational cost”, Chemical Science (2017), DOI: 10.1039/C6SC05720A

[3] J. S. Smith, O. Isayev and A. E. Roitberg, “ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules”, Scientific Data 4 (2017), Article number: 170193, DOI: 10.1038/sdata.2017.193


pyscal: A Python module for structural analysis of atomic environments

Description

pyscal is a Python module for the calculation of local atomic structural environments, including Steinhardt’s bond orientational order parameters[1], during post-processing of atomistic simulation data. The core functionality of pyscal is written in C++ with Python wrappers using pybind11, which allows for fast calculations and easy extensions in Python.

Practical Applications

Steinhardt’s order parameters are widely used for the identification of crystal structures [3]. They are also used to distinguish whether an atom is in a solid or liquid environment [4]. pyscal is inspired by the BondOrderAnalysis code, but has since incorporated many additional features and modifications. The pyscal module includes the following functionalities, illustrated by the short usage sketch after the list:

  • calculation of Steinhardt’s order parameters and their averaged version [2].
  • links with the Voro++ code, for the calculation of Steinhardt parameters weighted using the face areas of Voronoi polyhedra [3].
  • classification of atoms as solid or liquid [4].
  • clustering of particles based on a user defined property.
  • methods for calculating radial distribution functions, Voronoi volumes of particles, number of vertices and face area of Voronoi polyhedra, and coordination numbers.
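
A minimal usage sketch follows (the file name, cutoff and threshold values are illustrative placeholders; the calls follow the pyscal documentation, but check the version you have installed):

```python
# Minimal pyscal sketch: averaged Steinhardt parameters and solid-liquid classification.
import pyscal.core as pc

sys = pc.System()
sys.read_inputfile("traj.dump")                   # e.g. a LAMMPS dump snapshot
sys.find_neighbors(method="cutoff", cutoff=3.6)   # neighbour list from a fixed cutoff
sys.calculate_q([4, 6], averaged=True)            # Steinhardt q4, q6 and their averaged versions
q6 = sys.get_qvals(6, averaged=True)              # per-atom averaged q6 values

# Classify atoms as solid or liquid and cluster the solid atoms
n_largest = sys.find_solids(bonds=6, threshold=0.5, avgthreshold=0.6, cluster=True)
print("atoms in the largest solid cluster:", n_largest)
```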

Background information

See the application documentation for full details. A paper about pyscal is also available in Ref. [5].

The utilisation of Dask within the project came about as a result of the E-CAM High Throughput Computing ESDW held in Turin in 2018 and 2019.

The software module was developed by Sarath Menon, Grisell Díaz Leines and Jutta Rogal, and is distributed under the GNU General Public License v3.0.

References

[1] Steinhardt, P. J., Nelson, D. R., & Ronchetti, M. (1983). Physical Review B, 28.

[2] Lechner, W., & Dellago, C. (2008). The Journal of Chemical Physics, 129.

[3] Mickel, W., Kapfer, S. C., Schröder-Turk, G. E., & Mecke, K. (2013). The Journal of Chemical Physics, 138.

[4] Auer, S., & Frenkel, D. (2005). Advances in Polymer Science, 173.

[5] Menon, S., Díaz Leines, G., & Rogal, J. (2019). pyscal: A python module for structural analysis of atomic environments. Journal of Open Source Software, 4(43), 1824.


Multi-GPU version of DL_MESO_DPD

This module implements the first version of the DL_MESO_DPD Mesoscale Simulation Package that runs on multiple NVidia Graphical Processing Units (GPUs).

In this module the main framework of a multi-GPU version of the DL_MESO_DPD code has been developed. The exchange of data between GPUs overlaps with the computation of the forces for the internal cells of each partition (a domain decomposition approach based on the MPI parallel version of DL_MESO_DPD has been followed). The current implementation is a proof of concept and relies on slow transfers of data from the GPU to the host and vice-versa. Faster implementations will be explored in future modules.
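
The overlap pattern itself can be sketched in a few lines (illustrative mpi4py/NumPy pseudocode of the general idea, not DL_MESO_DPD's actual CUDA implementation; the neighbour ranks, array sizes and force loops are placeholders):

```python
# Illustrative sketch of overlapping the halo exchange between domain partitions
# with the force computation on internal cells.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
left, right = (rank - 1) % size, (rank + 1) % size     # placeholder 1D neighbour ranks

send_halo = np.random.rand(1000, 3)                     # boundary-particle data to export
recv_halo = np.empty_like(send_halo)

# Post non-blocking sends/receives for the halo region...
requests = [comm.Isend(send_halo, dest=right), comm.Irecv(recv_halo, source=left)]

# ...and compute forces for the internal cells while the transfer is in flight.
internal_forces = np.zeros((10000, 3))                  # placeholder for the internal force loop

MPI.Request.Waitall(requests)                           # halo data has now arrived

boundary_forces = np.zeros((1000, 3))                   # placeholder for the boundary force loop
```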

Future plans include benchmarking the code with data transfer implementations other than the current (trivial) GPU-host-GPU mechanism, namely: Peer-to-Peer communication within a node, CUDA-aware MPI, and CUDA-aware MPI with Direct Remote Memory Access (DRMA).

Practical application and exploitation of the code

Dissipative Particle Dynamics (DPD) is routinely used in an industrial context to investigate the static and dynamic behaviour of soft-matter systems. Examples include colloidal dispersions, emulsions and other amphiphilic systems, polymer solutions, etc. Such materials are produced or processed in industries such as cosmetics, food, pharmaceutics and biomedicine. Porting the method to GPUs is thus inherently useful, as it provides cheaper calculations.

See more information in the industry success story recently reported by E-CAM.

Software documentation and link to the source code can be found in our E-CAM software Library here.


Integrating LAMMPS with OpenPathSampling

This module shows how LAMMPS can be used as the Molecular Dynamics (MD) engine in OpenPathSampling (OPS), and it also provides a benchmark for the impact of the OPS overhead on the MD engine.

Practical application and exploitation of the code

OpenPathSampling uses OpenMM as its default engine for calculating the sampled trajectories. Other engines, such as GROMACS and LAMMPS, can also be used (although they are not yet available in the official release), making it possible to exploit different computer architectures, such as hybrid CPU-GPU, and to simulate more complex problems.

In this module we present the source code for the integration of OPS with LAMMPS, as well as a benchmark of a simple test case that shows the performance impact of the OPS overhead.

Software documentation and link to the source code can be found in our E-CAM software Library here.


FFTXlib, a rewrite and optimisation of earlier versions of FFT-related routines inside QE pre-v6

FFTXlib is mainly a rewrite and optimisation of earlier versions of the FFT-related routines inside Quantum ESPRESSO (QE) pre-v6, and ultimately their replacement. Despite many similarities, the current version of FFTXlib significantly changes the FFT strategy used in parallel execution, from the 1D+2D FFT performed in QE pre-v6 to a 1D+1D+1D one, allowing for greater flexibility in parallelisation.
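
The 1D+1D+1D idea can be illustrated conceptually with NumPy (this is not FFTXlib code; the data redistribution needed between steps in a distributed run is omitted):

```python
# A 3D FFT expressed as three successive batches of 1D FFTs, one per axis.
import numpy as np

a = np.random.rand(8, 8, 8) + 1j * np.random.rand(8, 8, 8)

step1 = np.fft.fft(a, axis=0)        # batch of 1D FFTs along the first axis
step2 = np.fft.fft(step1, axis=1)    # ... then along the second axis
step3 = np.fft.fft(step2, axis=2)    # ... then along the third axis

assert np.allclose(step3, np.fft.fftn(a))   # equivalent to the single 3D transform
```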

Practical application and exploitation of the code

The FFTXlib module is a collection of driver routines that allow the user to perform complex 3D fast Fourier transforms (FFTs) in the context of plane-wave-based electronic structure software. It contains routines to initialise the array structures and to calculate the desired grid shapes; it imposes the underlying size assumptions and provides correspondence maps for indices between the two transform domains.

Once this data structure is constructed, forward or inverse in-place FFTs can be performed. For this purpose FFTXlib can either use a local copy of an earlier version of FFTW (a commonly used open-source FFT library), or serve as a wrapper to external FFT libraries via conditional compilation using pre-processor directives. It supports both MPI and OpenMP parallelisation.

FFTXlib is currently employed within the Quantum ESPRESSO package, a widely used suite of codes for electronic structure calculations and materials modelling at the nanoscale, based on plane waves and pseudopotentials.

FFTXlib is also interfaced with the “miniPWPP” module, which solves the Kohn-Sham equations in a plane-wave basis and is soon to be released as part of the E-CAM Electronic Structure Library.

Software documentation and link to the source code can be found in our E-CAM software Library here.


Extension of the ParaDiS code to include precipitate interactions, and code optimisation to run in HPC environments


Here we present two featured software modules of the month:

  1. ParaDiS with precipitates
  2. ParaDiS with precipitates optimized to HPC environment

which provide extensions to the ParaDiS discrete dislocation dynamics (DDD) code (LLNL, http://paradis.stanford.edu/) in which dislocation-precipitate interactions are included. Module 2 was built to run the code in an HPC environment, by optimising the original code for the Cray XC40 cluster at CSC in Finland. The software was developed by E-CAM partners at CSC and Aalto University (Finland).

Practical application and exploitation of the codes

The ParaDiS code is a free, large-scale dislocation dynamics (DD) simulation code for studying the fundamental mechanisms of plasticity. However, DDD simulations do not always take into account scenarios in which impurities interact with the dislocations and their motion. The consequences of such impurities are manifold: the yield stress is changed and, in general, the plastic deformation process is greatly affected. Simulating these effects with DDD makes it possible to examine a large number of issues, from materials design to controlling the yield stress, and may be done in a multiscale manner, by computing the dislocation-precipitate interactions from microscopic simulations or by coarse-graining the DDD stress-strain results at the mesoscopic scale for use in more macroscopic Finite Element Methods.

Modules 1 and 2 therefore provide an extension of the ParaDiS code that includes dislocation-precipitate interactions, along with the possibility of running the code in HPC environments.

Software documentation and link to the source code can be found in our E-CAM software Library here.


DBCSR@MatrixSwitch, an optimised library to deal with sparse matrices

MatrixSwitch is a module which acts as an intermediary interface layer between high-level and low-level routines dealing with matrix storage and manipulation. It allows a seamless switch between different software implementations of the matrix operations.

DBCSR is an optimised library for dealing with sparse matrices, which appear frequently in many kinds of numerical simulations.

In DBCSR@MatrixSwitch, DBCSR capabilities have been added to MatrixSwitch as an optional library dependency.

Carrying out calculations in serial mode may sometimes be too slow, and a parallelisation strategy is then needed. In its serial and parallel modes, MatrixSwitch employs LAPACK and ScaLAPACK, respectively, to perform matrix operations, irrespective of their dense or sparse character. The disadvantage of the LAPACK/ScaLAPACK schemes is that they are not optimised for sparse matrices. DBCSR provides the necessary algorithms to solve this problem and, in addition, is especially suited to working in parallel.

Direct link to module documentation: https://e-cam.readthedocs.io/en/latest/Electronic-Structure-Modules/modules/MatrixSwitchDBCSR/readme.html
