Multi-GPU version of DL_MESO_DPD

This module implements the first version of the DL_MESO_DPD mesoscale simulation package to run on multiple NVIDIA Graphics Processing Units (GPUs).

In this module the main framework of a multi-GPU version of the DL_MESO_DPD code has been developed. A domain decomposition approach based on the MPI parallel version of DL_MESO_DPD has been followed, and the exchange of data between GPUs overlaps with the computation of the forces for the internal cells of each partition. The current implementation is a proof of concept and relies on slow transfers of data from the GPU to the host and vice versa. Faster implementations will be explored in future modules.
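The overlap of communication and computation described above can be sketched in pure Python (a toy stand-in, not the actual CUDA/MPI code): a background thread plays the role of the asynchronous GPU-host-GPU transfer while the forces for the internal cells are computed.

```python
import threading

def compute_forces(cells):
    # Toy "force computation": double each cell value.
    return {c: v * 2.0 for c, v in cells.items()}

def simulate_step(internal, boundary, exchange):
    """Overlap halo exchange with internal-cell computation.

    `exchange` stands in for the GPU-host-GPU transfer of boundary-cell
    data; it runs in a background thread while forces for the internal
    cells of the partition are computed, hiding the communication cost.
    """
    halo = {}
    t = threading.Thread(target=lambda: halo.update(exchange(boundary)))
    t.start()                               # start halo exchange (communication)
    forces = compute_forces(internal)       # overlap: internal cells first
    t.join()                                # wait for halo data to arrive
    forces.update(compute_forces(halo))     # then the boundary contributions
    return forces
```

The same ordering (post the transfer, compute the interior, synchronise, finish the boundary) is what the faster peer-to-peer and CUDA-aware MPI variants would preserve while replacing the transfer mechanism.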

Future plans include benchmarking the code with data transfer implementations other than the current (trivial) GPU-host-GPU transfer mechanism. These are: peer-to-peer communication within a node, CUDA-aware MPI, and CUDA-aware MPI with Direct Remote Memory Access (DRMA).

Practical application and exploitation of the code

Dissipative Particle Dynamics (DPD) is routinely used in an industrial context to determine the static and dynamic behaviour of soft-matter systems. Examples include colloidal dispersions, emulsions and other amphiphilic systems, polymer solutions, etc. Such materials are produced or processed in industries such as cosmetics, food, pharmaceutics and biomedicine. Porting the method to GPUs is thus inherently useful, as it makes the calculations cheaper.

See more information in the industry success story recently reported by E-CAM.

Software documentation and link to the source code can be found in our E-CAM software Library here.


Integrating LAMMPS with OpenPathSampling

This module shows how LAMMPS can be used as a Molecular Dynamics (MD) engine in OpenPathSampling (OPS), and it also provides a benchmark of the overhead that OPS adds on top of the MD engine.

Practical application and exploitation of the code

OpenPathSampling uses OpenMM as its default engine for calculating the sampled trajectories. Other engines, such as GROMACS and LAMMPS, can also be used (though not yet available in the official release), making it possible to exploit different computer architectures, such as hybrid CPU-GPU platforms, and to simulate more complex problems.

In this module we present the source code for the integration of OPS with LAMMPS, as well as a benchmark on a simple test case to show the impact of the OPS overhead on performance.
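The coupling pattern can be illustrated with a minimal, self-contained sketch; the class and method names below are purely illustrative and do not reproduce the actual OpenPathSampling or LAMMPS Python APIs.

```python
# Hypothetical sketch of wrapping an MD code as an engine for a path
# sampling driver. A real integration would subclass the OPS engine
# interface and drive LAMMPS through its Python module; here a plain
# callable stands in for one MD step.
class LAMMPSEngineSketch:
    def __init__(self, run_step):
        self.run_step = run_step    # callable standing in for one LAMMPS MD step
        self.trajectory = []

    def generate(self, snapshot, n_frames):
        """Propagate `n_frames` steps, storing one snapshot per frame.

        The path sampling driver calls this repeatedly; any per-call
        bookkeeping here is exactly the kind of overhead the benchmark
        in this module measures.
        """
        current = snapshot
        for _ in range(n_frames):
            current = self.run_step(current)
            self.trajectory.append(current)
        return self.trajectory
```

The benchmark question is then how much time is spent in the wrapper (snapshot conversion, storage) relative to `run_step` itself.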

Software documentation and link to the source code can be found in our E-CAM software Library here.


FFTXlib, a rewrite and optimisation of earlier versions of FFT-related routines inside QE pre-v6

FFTXlib is mainly a rewrite and optimisation of earlier versions of the FFT-related routines inside Quantum ESPRESSO (QE) pre-v6, and ultimately their replacement. Despite many similarities, the current version of FFTXlib dramatically changes the FFT strategy in the parallel execution, from the 1D+2D FFT performed in QE pre-v6 to a 1D+1D+1D one, to allow for greater flexibility in parallelisation.

Practical application and exploitation of the code

The FFTXlib module is a collection of driver routines that allows the user to perform complex 3D fast Fourier transforms (FFTs) in the context of plane-wave based electronic structure software. It contains routines to initialise the array structures and to calculate the desired grid shapes. It imposes the underlying size assumptions and provides correspondence maps for indices between the two transform domains.

Once this data structure is constructed, forward or inverse in-place FFTs can be performed. For this purpose FFTXlib can either use a local copy of an earlier version of FFTW (a commonly used open-source FFT library) or serve as a wrapper to external FFT libraries via conditional compilation using pre-processor directives. It supports both MPI and OpenMP parallelisation.
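The 1D+1D+1D strategy rests on the separability of the 3D discrete Fourier transform: applying 1D transforms successively along each axis yields the full 3D transform, and each direction can then be distributed independently. The naive pure-Python sketch below (illustrative only, not FFTXlib code) shows the decomposition.

```python
import cmath

def dft1d(a):
    """Naive 1D discrete Fourier transform (stands in for an optimised
    FFT kernel such as FFTW's)."""
    n = len(a)
    return [sum(a[k] * cmath.exp(-2j * cmath.pi * j * k / n)
                for k in range(n))
            for j in range(n)]

def dft3d_separable(grid):
    """3D DFT of grid[x][y][z] computed as successive 1D transforms
    along z, then y, then x -- the 1D+1D+1D decomposition."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    # 1D transform along z for every (x, y) line
    g = [[dft1d(grid[x][y]) for y in range(ny)] for x in range(nx)]
    # 1D transform along y for every (x, z) line
    for x in range(nx):
        for z in range(nz):
            line = dft1d([g[x][y][z] for y in range(ny)])
            for y in range(ny):
                g[x][y][z] = line[y]
    # 1D transform along x for every (y, z) line
    for y in range(ny):
        for z in range(nz):
            line = dft1d([g[x][y][z] for x in range(nx)])
            for x in range(nx):
                g[x][y][z] = line[x]
    return g
```

In the parallel library each of the three passes corresponds to a batch of 1D FFTs that can be assigned to different MPI tasks, with data redistribution between passes.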

FFTXlib is currently employed within the Quantum ESPRESSO package, a widely used suite of codes for electronic structure calculations and materials modelling at the nanoscale, based on plane waves and pseudopotentials.

FFTXlib is also interfaced with the “miniPWPP” module, which solves the Kohn-Sham equations in a plane-wave basis and is soon to be released as part of the E-CAM Electronic Structure Library.

Software documentation and link to the source code can be found in our E-CAM software Library here.


Extension of the ParaDiS code to include precipitate interactions, and code optimisation to run in an HPC environment


Here we present two featured software modules of the month:

  1. ParaDiS with precipitates
  2. ParaDiS with precipitates, optimised for an HPC environment

that provide extensions to the ParaDiS Discrete Dislocation Dynamics (DDD) code (LLNL, http://paradis.stanford.edu/) in which dislocation/precipitate interactions are included. Module 2 was built to run the code in an HPC environment, by optimising the original code for the Cray XC40 cluster at CSC in Finland. The software was developed by E-CAM partners at CSC and Aalto University (Finland).

Practical application and exploitation of the codes

The ParaDiS code is a free large-scale dislocation dynamics (DD) simulation code used to study the fundamental mechanisms of plasticity. However, DDD simulations do not always take into account scenarios in which impurities interact with the dislocations and their motion. The consequences of the impurities are multiple: the yield stress is changed and, in general, the plastic deformation process is greatly affected. Simulating these effects by DDD makes it possible to address a large number of issues, from materials design to controlling the yield stress, and may be done in a multiscale manner: by computing the dislocation-precipitate interactions from microscopic simulations, or by coarse-graining the DDD results for the stress-strain curves on the mesoscopic scale for use in more macroscopic Finite Element Method simulations.

Modules 1 and 2 therefore provide an extension of the ParaDiS code by including dislocation/precipitate interactions. The possibility to run the code in HPC environments is also provided.
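As a purely illustrative toy model (not the force law used in ParaDiS), a dislocation-precipitate interaction can be represented by a short-ranged repulsive potential whose gradient pins or deflects a passing dislocation segment:

```python
import math

def precipitate_force(x_disloc, x_prec, strength=1.0, radius=1.0):
    """Toy 1D dislocation-precipitate interaction force.

    A Gaussian repulsive potential U = A * exp(-(x - xp)^2 / r^2) gives
    F = -dU/dx = 2*A*(x - xp)/r^2 * exp(-(x - xp)^2 / r^2).
    Parameters and functional form are illustrative only.
    """
    d = x_disloc - x_prec
    return 2.0 * strength * d / radius**2 * math.exp(-(d / radius) ** 2)
```

In a DDD code such a pairwise term is added to the nodal force evaluation, so that dislocations bow around or shear through precipitates depending on the applied stress.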

Software documentation and link to the source code can be found in our E-CAM software Library here.


DBCSR@MatrixSwitch, an optimised library to deal with sparse matrices

MatrixSwitch is a module that acts as an intermediary interface layer between high-level and low-level routines dealing with matrix storage and manipulation. It allows a seamless switch between different software implementations of the matrix operations.

DBCSR is an optimized library to deal with sparse matrices, which appear frequently in many kinds of numerical simulations.

In DBCSR@MatrixSwitch, DBCSR capabilities have been added to MatrixSwitch as an optional library dependency.

Carrying out calculations in serial mode may be too slow, and a parallelisation strategy is then needed. Serial/parallel MatrixSwitch employs LAPACK/ScaLAPACK to perform matrix operations, irrespective of the dense or sparse character of the matrices. The disadvantage of the LAPACK/ScaLAPACK schemes is that they are not optimised for sparse matrices. DBCSR provides the necessary algorithms to solve this problem and, in addition, is specially suited to work in parallel.
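The idea behind blocked sparse storage can be shown with a deliberately naive Python sketch (not the DBCSR implementation): each matrix is a dictionary of dense blocks keyed by block indices, absent blocks are implicitly zero, and the multiplication visits only stored blocks, which is where the saving over a dense LAPACK-style product comes from.

```python
def block_sparse_matmul(A, B, bs):
    """Multiply two block-sparse matrices stored as {(i, j): block}
    dictionaries, where each block is a bs x bs list of lists and
    missing blocks are zero. Naive illustration of DBCSR-style storage."""
    C = {}
    for (i, k), a in A.items():
        for (k2, j), b in B.items():
            if k != k2:
                continue  # blocks do not line up; contributes nothing
            c = C.setdefault((i, j), [[0.0] * bs for _ in range(bs)])
            # dense multiply-accumulate on the small blocks
            for r in range(bs):
                for s in range(bs):
                    for t in range(bs):
                        c[r][s] += a[r][t] * b[t][s]
    return C
```

The production library adds, among other things, MPI distribution of the block rows/columns and optimised small-block kernels; the sketch only conveys why sparsity in the block pattern cuts the work.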

Direct link to module documentation: https://e-cam.readthedocs.io/en/latest/Electronic-Structure-Modules/modules/MatrixSwitchDBCSR/readme.html


Abrupt GC-AdResS: A new and more general implementation of the Grand Canonical Adaptive Resolution Scheme (GC-AdResS)

The Grand Canonical Adaptive Resolution Scheme (GC-AdResS) provides a methodological framework for partitioning a simulation box into regions with different degrees of accuracy. For more details on the theory see Refs. [1,2,3].

In the context of an E-CAM pilot project focused on the development of the GC-AdResS scheme, an updated version of GC-AdResS was built and implemented in GROMACS, as reported in https://aip.scitation.org/doi/10.1063/1.5031206 (open access version: https://arxiv.org/abs/1806.09870). The main goal of the project is to develop a library or recipe with which GC-AdResS can be implemented in any classical MD code.

The current implementation of GC-AdResS in GROMACS has several performance problems. The main performance loss of AdResS simulations in GROMACS lies in the neighbour-list search and in the generic serial force calculation that links the atomistic (AT) and coarse-grained (CG) forces together via a smooth weighting function. Thus, to remove this performance bottleneck, which was also a hindrance to an easy and general implementation in other codes, and to eliminate the non-optimised force calculation, we had to change the neighbour-list search. This led to a considerable speed-up of the code. Furthermore, it decouples the method from the core of any MD code, which preserves performance and makes the scheme hardware independent [4].
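The smooth weighting function that interpolates between the atomistic and coarse-grained forces is commonly taken as a cos² switch over a hybrid region; the sketch below shows this standard functional form (parameter names are illustrative, and this is not the GROMACS implementation).

```python
import math

def adress_weight(d, r_at, r_hy):
    """Smooth AdResS weighting function of the distance d from the centre
    of the atomistic region: 1 in the atomistic zone (d < r_at), 0 in the
    coarse-grained zone (d > r_at + r_hy), and a cos^2 switch in between.
    The total force is blended as F = w*F_AT + (1 - w)*F_CG."""
    if d <= r_at:
        return 1.0
    if d >= r_at + r_hy:
        return 0.0
    return math.cos(math.pi * (d - r_at) / (2.0 * r_hy)) ** 2
```

Because w varies smoothly from 1 to 0 across the hybrid region, particles change resolution without force discontinuities; evaluating this blend efficiently for every pair is exactly where the neighbour-list and force-kernel bottlenecks discussed above arise.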

This module presents a very straightforward way to implement a new partitioning scheme in GROMACS, solving the two problems that affect performance: the neighbour-list search and the generic force kernel.

Information about the module purpose, background, software installation and testing, together with a link to the source code, can be found in our E-CAM software Library here.

E-CAM Deliverables D4.3[5] and D4.4[6] present more modules developed in the context of this pilot project.

References

[1] L. Delle Site and M. Praprotnik, “Molecular Systems with Open Boundaries: Theory and Simulation,” Phys. Rep., vol. 693, pp. 1–56, 2017

[2] H. Wang, C. Schütte, and L. Delle Site, “Adaptive Resolution Simulation (AdResS): A Smooth Thermodynamic and Structural Transition from Atomistic to Coarse Grained Resolution and Vice Versa in a Grand Canonical Fashion,” J. Chem. Theory Comput., vol. 8, pp. 2878–2887, 2012

[3] H. Wang, C. Hartmann, C. Schütte, and L. Delle Site, “Grand-Canonical-Like Molecular-Dynamics Simulations by Using an Adaptive-Resolution Technique,” Phys. Rev. X, vol. 3, p. 011018, 2013

[4] C. Krekeler, A. Agarwal, C. Junghans, M. Praprotnik, and L. Delle Site, “Adaptive resolution molecular dynamics technique: Down to the essential,” J. Chem. Phys., vol. 149, p. 024104, 2018

[5] B. Duenweg, J. Castagna, S. Chiacchiera, H. Kobayashi, and C. Krekeler, “D4.3: Meso– and multi–scale modelling E-CAM modules II,” March 2018. [Online]. Available: https://doi.org/10.5281/zenodo.1210075

[6] B. Duenweg, J. Castagna, S. Chiacchiera, and C. Krekeler, “D4.4: Meso– and multi–scale modelling E-CAM modules III,” Jan 2019. [Online]. Available: https://doi.org/10.5281/zenodo.2555012


Porting of electrostatics to the GPU version of DL_MESO_DPD


The porting of DL_MESO_DPD [1,2] to graphics cards (GPUs) was reported in E-CAM deliverable D4.2 [3] (for a single GPU) and deliverable D4.3 [4] (for multiple GPUs) (Figure 1), and has now been extended to include electrostatics, with two alternative schemes, as explained below. This work was recently reported in deliverable D4.4 [5].

Figure 1: DL_MESO strong scaling results on Piz Daint, obtained using 1.8 billion particles on 256 to 2048 GPUs. Results show very good scaling, with efficiency always above 89% up to 2048 GPUs.


To allow Dissipative Particle Dynamics (DPD) methods to treat systems with electrically charged particles, several approaches have been proposed in the literature, mostly based on the Ewald summation method [6]. The DL_MESO_DPD code includes the Standard Ewald and Smooth Particle Mesh Ewald (SPME) methods (in version 2.7, released in December 2018). Accordingly, the same methods have been implemented here for the single-GPU version of the code.
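The idea common to both schemes is the Ewald split: the slowly decaying 1/r Coulomb interaction is damped by a complementary error function, so that the short-ranged part can be summed directly in real space while the smooth remainder is summed in reciprocal space (directly for Standard Ewald, on a mesh for SPME). A minimal sketch of the real-space pair term, in reduced units (illustrative only, not DL_MESO_DPD code):

```python
import math

def ewald_real_pair(q1, q2, r, alpha):
    """Real-space (short-range) part of the Ewald pair energy,
    q1 * q2 * erfc(alpha * r) / r, in reduced units.

    alpha controls the split: larger alpha pushes more of the
    interaction into the reciprocal-space (or mesh) sum, which handles
    the complementary q1 * q2 * erf(alpha * r) / r contribution."""
    return q1 * q2 * math.erfc(alpha * r) / r
```

Because erfc decays faster than exponentially, this term can be truncated at a short cutoff, which is what makes the method GPU-friendly: the real-space part maps onto the existing short-range force kernels.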


CTMQC, a module for excited-state nonadiabatic dynamics


CTMQC is a module for excited-state nonadiabatic dynamics. It is used to simulate the coupled dynamics of electrons and nuclei (ideally in gas phase molecular systems) in response to, for instance, an initial electronic excitation.

The CTMQC module is based on the coupled-trajectory mixed quantum-classical (CT-MQC) algorithm [1,2], which has been derived starting from the evolution equations in the framework of the exact factorization of the electron-nuclear wavefunction [3,4,5]. The CT-MQC algorithm belongs to the family of quantum-classical methods: the time evolution of the nuclear degrees of freedom is treated within the classical approximation, whereas the electronic dynamics is treated fully quantum mechanically. Basically, the nuclei evolve as point particles, following classical trajectories, while the electrons generate the potential inducing such time evolution.

In its current implementation (used in Refs. [6,7]), the module cannot deal with arbitrary nuclear dimensionality: it is restricted to problems of up to three dimensions, which makes it possible to compare quantum-classical results easily and directly with quantum wavepacket dynamics. CTMQC has been analyzed and benchmarked against exact propagation results on typical low-dimensional model systems [1,2,6,7], and applied to the simulation of the photo-initiated ring-opening process of oxirane [8]. For this study, CTMQC was implemented in a developer version of the CPMD electronic structure package based on time-dependent density functional theory. Concerning electronic input properties, the CTMQC module requires a grid representation of the adiabatic potential energy surfaces and of the nonadiabatic coupling vectors, since the electronic dynamics is represented and solved in the adiabatic basis.

This feature allows the algorithm to be easily adaptable, in the current form, to any quantum chemistry electronic structure package. The number of electronic states to be included is not limited and can be specified as input.

Practical application and exploitation of the code

The purpose of the module is to familiarize the user with a new simulation technique, the CTMQC method, for treating problems where electronic excited states are populated during the molecular dynamics. Photo-activated ultrafast processes are typical situations in which an approach like CTMQC can be used to predict molecular properties, such as structures, quantum yields, or quantum coherence.

The module is designed to apply the CTMQC procedure to one-, two-, and three-dimensional model systems in which an arbitrary number of electronic states are coupled via the nuclear dynamics. Tully model systems [9] are within the class of problems that can be treated by the module, as well as a wide class of multidimensional problems involving, for instance, the ultrafast radiationless relaxation of photo-excited molecules [10] through conical intersections.
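As an illustration of the grid-based electronic input such a method expects, the toy sketch below tabulates the adiabatic surfaces and the scalar nonadiabatic coupling of Tully's single-avoided-crossing model on a 1D grid, using the standard published parameters. The code is a stand-alone illustration, not part of the CTMQC module.

```python
import math

# Tully's single-avoided-crossing model (standard parameters from
# J. Chem. Phys. 93 (1990) 1061): a 2x2 diabatic Hamiltonian in 1D.
A, B, C, D = 0.01, 1.6, 0.005, 1.0

def diabatic(x):
    v11 = A * (1 - math.exp(-B * x)) if x >= 0 else -A * (1 - math.exp(B * x))
    return v11, -v11, C * math.exp(-D * x * x)   # V11, V22, V12

def adiabatic(x):
    """Lower/upper adiabatic surfaces from diagonalising the 2x2 matrix."""
    v11, v22, v12 = diabatic(x)
    mean = 0.5 * (v11 + v22)
    half_gap = math.hypot(0.5 * (v11 - v22), v12)
    return mean - half_gap, mean + half_gap

def mixing_angle(x):
    v11, v22, v12 = diabatic(x)
    return 0.5 * math.atan2(2.0 * v12, v11 - v22)

def nac(x, h=1e-5):
    """Scalar nonadiabatic coupling d12 = d(theta)/dx, central differences."""
    return (mixing_angle(x + h) - mixing_angle(x - h)) / (2.0 * h)

# Tabulate on a grid, the form of electronic input described above.
grid = [0.1 * i for i in range(-100, 101)]
surfaces = [adiabatic(x) for x in grid]
couplings = [nac(x) for x in grid]
```

The coupling peaks sharply at the avoided crossing (x = 0), where the adiabatic gap is smallest; this is the region where trajectory-based quantum-classical methods must capture population transfer and decoherence.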


Software documentation can be found in our E-CAM software Library here.

[1] S. K. Min, F. Agostini, E. K. U. Gross Coupled-trajectory quantum-classical approach to electronic decoherence in nonadiabatic processes Phys. Rev. Lett. 115 (2015) 073001
[2] F. Agostini, S. K. Min, A. Abedi, E. K. U. Gross Quantum-classical nonadiabatic dynamics: Coupled- vs independent-trajectory methods J. Chem. Theory Comput. 12 (2016) 2127
[3] A. Abedi, N. T. Maitra, E. K. U. Gross Exact factorization of the time-dependent electron-nuclear wave function Phys. Rev. Lett. 105 (2010) 123002
[4] A. Abedi, F. Agostini, Y. Suzuki, E. K. U. Gross Dynamical steps that bridge piecewise adiabatic shapes in the exact time-dependent potential energy surface Phys. Rev. Lett. 110 (2013) 263001
[5] F. Agostini, B. F. E. Curchod, R. Vuilleumier, I. Tavernelli, E. K. U. Gross, TDDFT and Quantum-Classical Dynamics: A Universal Tool Describing the Dynamics of Matter Springer International Publishing (2018) 1
[7] G. H. Gossel, F. Agostini, N. T. Maitra Coupled-trajectory mixed quantum-classical algorithm: A deconstruction J. Chem. Theory Comput. 14 (2018) 4513
[8] S. K. Min, F. Agostini, I. Tavernelli, E. K. U. Gross Ab initio nonadiabatic dynamics with coupled trajectories: A rigorous approach to quantum (de)coherence J. Phys. Chem. Lett. 8 (2017) 3048
[9] J. C. Tully Molecular dynamics with electronic transitions J. Chem. Phys. 93 (1990) 1061
[10] B. F. E. Curchod, F. Agostini On the dynamics through a conical intersection J. Phys. Chem. Lett. 8 (2017) 831


SCDM_WFs

Module SCDM_WFs implements the selected columns of the density matrix (SCDM) method [1] for building localized Wannier functions (WFs). Wannier90 [2] is a post-processing tool for the computation of Maximally Localised Wannier Functions (MLWFs) [3,4,5], which have been increasingly adopted by the electronic structure community for different purposes. The reasons are manifold: MLWFs provide an insightful chemical analysis of the nature of bonding, and of its evolution during, say, a chemical reaction. They play for solids a role similar to that of localized orbitals in molecular systems. In the condensed matter community, they are used in the construction of model Hamiltonians for, e.g., correlated-electron and magnetic systems. They are also pivotal in first-principles tight-binding Hamiltonians, where chemically accurate Hamiltonians are constructed directly on the Wannier basis rather than fitted or inferred from macroscopic considerations. Many other applications exist, e.g. dielectric response and polarization in materials, ballistic transport, analysis of phonons, photonic crystals, cold-atom lattices, and the local dielectric responses of insulators; see [3] for a review.

This module is a first step towards the automation of MLWFs. In the original Wannier90 framework, automation is hindered by the difficult step of choosing a set of initial localized functions, with the correct symmetries and centres, to use as an initial guess for the optimization. As a result, high-throughput calculations (HTC) and big-data analysis with MLWFs have proved problematic to implement.

This module is part of the newly developed Wannier90 utilities within the pilot project on Electronic Structure Functionalities for Multi-Thread Workflows. The module is part of the pw2wannier interface between the popular QUANTUM ESPRESSO code and Wannier90. It will be included in the next versions of QUANTUM ESPRESSO (v6.3) and Wannier90. Moreover, it has been successfully added to a developer branch of the AiiDA workflow manager [6] to perform HTC on large material datasets.

Practical application and exploitation of the code

The SCDM-k method [1] removes the need for an initial guess altogether by using information contained in the single-particle density matrix. In fact, the columns of the density matrix are localized in real space and can be used as a vocabulary to build the localized WFs. The SCDM-k method can be used in isolation to generate well-localized WFs. More interesting is the possibility of coupling the SCDM-k method to Wannier90: the core idea is to use the WFs generated by the SCDM-k method as the initial guess for the optimization procedure within Wannier90. This module is a big step towards the automation of WFs and the simplification of the use of the Wannier90 program. The module is therefore intended for all scientists who benefit from the use of WFs in their research. Furthermore, by making the code more accessible and easier to use, this module should increase the popularity of the Wannier90 code.
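The column-selection step can be sketched in a few lines of pure Python (illustrative only, not the SCDM_WFs implementation): a greedy pivoted selection picks, at each step, the density-matrix column with the largest residual norm and projects it out of the remaining columns, mimicking column-pivoted QR.

```python
import math

def scdm_select(P, m):
    """Greedy selection of m columns from the density matrix P (given as
    a list of rows), in the spirit of column-pivoted QR: repeatedly pick
    the column of largest residual norm, then orthogonalize the rest
    against it. The picked columns serve as localized seed functions."""
    n = len(P)
    work = [[P[i][j] for i in range(n)] for j in range(n)]  # columns of P
    picked = []
    for _ in range(m):
        # pivot: column with the largest remaining norm
        j = max(range(n), key=lambda k: sum(v * v for v in work[k]))
        picked.append(j)
        norm = math.sqrt(sum(v * v for v in work[j]))
        q = [v / norm for v in work[j]]
        # project the chosen direction out of every column
        for k in range(n):
            dot = sum(q[i] * work[k][i] for i in range(n))
            work[k] = [work[k][i] - dot * q[i] for i in range(n)]
    return picked
```

For real solids the selection is done per k-point on the occupied-subspace density matrix, and the selected columns are subsequently orthonormalized; only the pivoting logic is shown here.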

[1] A. Damle, L. Lin, L. Ying SCDM-k: Localized orbitals for solids via selected columns of the density matrix J.Comp.Phys. 334 (2017) 1
[2] A. A. Mostofi, J. R. Yates, Y.-S. Lee, I. Souza, D. Vanderbilt, N. Marzari wannier90: A tool for obtaining maximally-localised Wannier functions Com. Phys. Comm. 178 (2008) 685
[3] N. Marzari, A. A. Mostofi, J. R. Yates, I. Souza, D. Vanderbilt Maximally localized Wannier functions: Theory and applications Rev. Mod. Phys. 84 (2012) 1419
[4] N. Marzari, D. Vanderbilt Maximally localized generalized Wannier functions for composite energy bands Phys. Rev. B 56 (1997) 12847
[5] I. Souza, N. Marzari, D. Vanderbilt Maximally localized Wannier functions for entangled energy bands Phys. Rev. B 65 (2001) 035109
[6] G. Pizzi, A. Cepellotti, R. Sabatini, N. Marzari, B. Kozinsky AiiDA: automated interactive infrastructure and database for computational science Comp. Mat. Sci. 111 (2016) 218


QQ-Interface (Quantics-QChem-Interface)

The QQ-Interface module connects the full quantum nonadiabatic wavefunction propagation code Quantics to the time-dependent density functional theory (TDDFT) module of the electronic structure program Q-Chem. Q-Chem provides analytic gradients, Hessians and derivative couplings at the TDDFT level. With this module, it is possible to use the Q-Chem TDDFT module for excited-state direct dynamics calculations. Quantics will start Q-Chem calculations whenever needed, prepare the input file from a template, and read the Q-Chem output file. The Q-Chem results are stored in the Quantics database and can be used in dynamics simulations. Due to the modular design of Quantics, the TDDFT module of Q-Chem can be used for all dynamics simulations, e.g. direct dynamics variational multi-configurational Gaussian (dd-vMCG) or surface hopping simulations.
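The template-driven exchange described above can be sketched as follows; the template fields and the output parser are invented for the example and do not reproduce the actual Quantics or Q-Chem file formats.

```python
from string import Template

# Hypothetical input template: the dynamics driver fills in the current
# geometry at each direct-dynamics step. ($$ escapes a literal $.)
INPUT_TEMPLATE = Template("""\
$$molecule
0 1
$geometry
$$end
$$rem
  method  tddft
$$end
""")

def make_input(atoms):
    """Render an input file from (symbol, (x, y, z)) pairs."""
    geometry = "\n".join(f"{sym}  {x:.6f}  {y:.6f}  {z:.6f}"
                         for sym, (x, y, z) in atoms)
    return INPUT_TEMPLATE.substitute(geometry=geometry)

def parse_energy(output_text):
    """Grab the last 'Total energy = <value>' line from a (made-up)
    output file; the result would be stored in the dynamics database."""
    energies = [float(line.split("=")[1])
                for line in output_text.splitlines()
                if line.strip().startswith("Total energy")]
    return energies[-1] if energies else None
```

The real interface additionally retrieves gradients, Hessians and derivative couplings, and caches everything in the Quantics database so that repeated geometries need no new electronic-structure call.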

This module is part of a set of new functionalities developed for the Quantics program package during the E-CAM Extended Software Development Workshop: Quantum MD, held at University College Dublin.

Practical application and exploitation of the code

The module will be used to examine the nonadiabatic excited-state dynamics of small to medium-sized molecules. The TDDFT module of Q-Chem allows treating systems that are too large for efficient multireference methods, such as CASSCF. Until now, photoinduced dynamics simulations of such molecules were only possible using trajectory-based algorithms. With Quantics, a full quantum-mechanical description of the nuclear motion is possible.