Some useful tips to help with moving to online training
E-CAM has built up a collection of (hopefully) useful information to help our community, other Centres of Excellence, and interested groups transition to online training. The information comes from community-contributed sources and from our own experience in capturing and broadcasting E-CAM training events. Guides to help with online training are being created rapidly as the COVID-19 crisis evolves, and we try to keep the information here moderated to avoid overwhelming people.
This collection, “Moving to online learning”, is available through E-CAM’s ONLINE TRAINING PORTAL and includes the following items:
- Teaching Online at Short Notice
- Tips for Teaching Online from The Carpentries Community
- More Community-Contributed Tips for Teaching Online
- Capturing and broadcasting training events
- Finding freely usable images
- Tips for safer Zoom meetings
If you know of something that could be of value in this list, please email E-CAM Software Manager Alan O’Cais at a.ocais@fz-juelich.de.
Automated high-throughput Wannierisation, a successful collaboration between E-CAM and the MaX Centre of Excellence
Maximally-localised Wannier functions (MLWFs) are routinely used to compute, from first principles, advanced materials properties that require very dense Brillouin zone (BZ) integration, and to build accurate tight-binding models for scale-bridging simulations. At the same time, high-throughput (HT) computational materials design is an emerging field that promises to accelerate the reliable and cost-effective design and optimisation of new materials with target properties. The use of MLWFs in HT workflows has been hampered by the fact that generating MLWFs automatically and robustly, without any user intervention and for arbitrary materials, is in general very challenging. We address this problem directly by proposing a procedure for automatically generating MLWFs for HT frameworks. Our approach is based on the selected columns of the density matrix method (SCDM, see SCDM Wannier Functions) and is implemented in an AiiDA workflow.
Purpose of the module
Create a fully-automated protocol based on the SCDM algorithm for the construction of MLWFs, in which the two free parameters are determined automatically (in our HT approach the dimensionality of the disentangled space is fixed by the total number of states used to generate the pseudopotentials in the DFT calculations).
A paper describing the work is available at https://arxiv.org/abs/1909.00433, where this approach was applied to a dataset of 200 bulk crystalline materials that span a wide structural and chemical space.
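In the protocol, the two free SCDM parameters are obtained from a fit of the band projectabilities as a function of energy with a complementary error function. The following is a minimal Python sketch of that idea, not the actual AiiDA workflow code; the function names and the synthetic data are illustrative, and the 3-sigma shift follows the choice described in the paper above.

```python
# Hypothetical sketch of the automated SCDM parameter choice: fit the
# projectabilities p(E) with an erfc step, then derive (mu, sigma).
import numpy as np
from scipy.special import erfc
from scipy.optimize import curve_fit

def erfc_model(energy, mu, sigma):
    """Smooth step used to fit projectability vs. band energy."""
    return 0.5 * erfc((energy - mu) / sigma)

def choose_scdm_parameters(energies, projectabilities):
    """Fit the projectabilities and return (mu_scdm, sigma_scdm).

    sigma is taken from the fit; mu is shifted 3*sigma below the
    fitted midpoint, as in the protocol of arXiv:1909.00433.
    """
    (mu_fit, sigma_fit), _ = curve_fit(
        erfc_model, energies, projectabilities,
        p0=[np.median(energies), 1.0],
    )
    return mu_fit - 3.0 * sigma_fit, sigma_fit

# Synthetic example: projectabilities decaying around 5 eV
e = np.linspace(-10.0, 20.0, 200)
p = erfc_model(e, 5.0, 2.0)
mu, sigma = choose_scdm_parameters(e, p)
```

Because the parameters are determined entirely from the DFT output, no per-material human input is needed, which is what makes the procedure usable in an HT setting.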
Background information
This module is a collaboration between E-CAM and the MaX Centre of Excellence.
In the SCDM Wannier Functions module, E-CAM has implemented the SCDM algorithm in the pw2wannier90.f90 interface code between the Quantum ESPRESSO software and the Wannier90 code. This implementation was used as the basis for a complete computational workflow for obtaining MLWFs and electronic properties based on Wannier interpolation of the BZ, starting only from the specification of the initial crystal structure. The workflow was implemented within the AiiDA materials informatics platform, and used to perform an HT study on a dataset of 200 materials, as described here.
More information at https://e-cam.readthedocs.io/en/latest/Electronic-Structure-Modules/modules/W90_MaX_collab/readme.html
QMCPack Interfaces for Electronic Structure Computations
Quantum Monte Carlo (QMC) methods are a class of ab initio, stochastic techniques for the study of quantum systems. While QMC simulations are computationally expensive, they have the advantage of being accurate, fully ab initio and scalable to a large number of cores with limited memory requirements.
These features make QMC methods a valuable tool to assess the accuracy of DFT computations, which are widely used in the fields of condensed matter physics, quantum chemistry and material science.
QMCPack is a free package for QMC simulations of electronic structure, developed in several national labs in the US. The package is written in object-oriented C++, offers great flexibility in the choice of systems, trial wave functions and QMC methods, and supports massive parallelism and the use of GPUs.
Trial wave functions for electronic QMC computations commonly require single-electron orbitals, typically computed by DFT. The aim of the E-CAM pilot project described here is to build interfaces between QMCPack and other electronic structure software, e.g. the DFT code Quantum ESPRESSO.
These interfaces are used to manage the reading of orbitals, or their generation by DFT, within QMCPack, in order to establish an automated, black-box workflow for QMC computations. QMC simulations can, for example, be used to benchmark and validate DFT calculations: such a procedure can be employed in the study of several physical systems of interest in condensed matter physics, chemistry or materials science, with industrial applications, e.g. in the study of metal-ion or water-carbon interfaces.
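The design described above is a classic abstract-interface pattern: the QMC driver depends only on a base class, and concrete orbital sources (a file on disk, an on-the-fly DFT run) are swapped in as derived classes. Below is a hypothetical Python analogue of that structure; the real ESInterfaceBase is a C++ class inside QMCPack, and all names here are illustrative, not QMCPack's actual API.

```python
# Hypothetical Python analogue of the ESInterfaceBase design pattern
# (illustrative only; the real implementation is C++ inside QMCPack).
from abc import ABC, abstractmethod

class OrbitalInterfaceBase(ABC):
    """Base class: any source of single-particle orbitals for QMC."""

    @abstractmethod
    def get_orbitals(self, structure):
        """Return single-particle orbitals for the given structure."""

class FileOrbitalInterface(OrbitalInterfaceBase):
    """Read precomputed DFT orbitals from disk."""

    def __init__(self, path):
        self.path = path

    def get_orbitals(self, structure):
        return f"orbitals loaded from {self.path}"

class OnTheFlyDFTInterface(OrbitalInterfaceBase):
    """Drive a DFT code (e.g. Quantum ESPRESSO) during the QMC run."""

    def get_orbitals(self, structure):
        return f"orbitals generated by DFT for {structure}"

def run_qmc(interface, structure):
    # The QMC driver only ever sees the base-class interface.
    orbitals = interface.get_orbitals(structure)
    return f"QMC run using {orbitals}"
```

The benefit of this layout is exactly the black-box workflow mentioned above: adding a new orbital source means writing one new derived class, with no change to the QMC driver itself.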
The following modules have been built as part of this pilot project:
- QMCQEPack, which provides the files to download and properly patch Quantum Espresso 5.3 to build the libpwinterface.so library; this library is required by the ESPWSCFInterface module to generate single-particle orbitals during a QMCPack computation using Quantum Espresso.
- ESInterfaceBase, which provides a base class for a general interface to generate single-particle orbitals for use in QMC simulations in QMCPack; implementations of specific interfaces, as derived classes of ESInterfaceBase, are available as separate modules.
The documentation about interfaces in QMCPack can be found in the QMCPack user manual at https://github.com/michruggeri/qmcpack/blob/f88a419ad1a24c68b2fdc345ad141e05ed0ab178/manual/interfaces.tex
New publication is out: “Towards extreme scale dissipative particle dynamics simulations using multiple GPGPUs”
E-CAM researchers working at the Hartree Centre – Daresbury Laboratory have co-designed the DL_MESO Mesoscale Simulation package to run on multiple GPUs, and have run, for the first time, a Dissipative Particle Dynamics simulation of a very large system (1.8 billion particles) on 4096 GPUs.
Towards extreme scale dissipative particle dynamics simulations using multiple GPGPUs
J. Castagna, X. Guo, M. Seaton and A. O’Cais
Computer Physics Communications (2020) 107159
DOI: 10.1016/j.cpc.2020.107159 (open access)
Abstract
A multi-GPGPU development for Mesoscale Simulations using the Dissipative Particle Dynamics method is presented. This distributed GPU acceleration development is an extension of the DL_MESO package to MPI+CUDA in order to exploit the computational power of the latest NVIDIA cards on hybrid CPU–GPU architectures. Details about the extensively applicable algorithm implementation and memory coalescing data structures are presented. The key algorithms’ optimizations for the nearest-neighbour list searching of particle pairs for short range forces, exchange of data and overlapping between computation and communications are also given. We have carried out strong and weak scaling performance analyses with up to 4096 GPUs. A two phase mixture separation test case with 1.8 billion particles has been run on the Piz Daint supercomputer from the Swiss National Supercomputer Center. With CUDA aware MPI, proper GPU affinity, communication and computation overlap optimizations for the multi-GPU version, the final optimization results demonstrated more than 94% efficiency for weak scaling and more than 80% efficiency for strong scaling. As far as we know, this is the first report in the literature of DPD simulations being run on this large number of GPUs. The remaining challenges and future work are also discussed at the end of the paper.
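One of the optimizations the abstract highlights is the nearest-neighbour list search for short-range pair forces. The standard idea behind such searches is a cell list: bin particles into cells of side rcut, so each particle only needs to test the 27 surrounding cells rather than all N particles. The sketch below is a minimal, serial CPU illustration of that idea (without periodic boundaries); DL_MESO's multi-GPU implementation is far more elaborate, with CUDA kernels and coalesced memory layouts.

```python
# Minimal serial sketch of the cell-list idea behind short-range pair
# searching; illustrative only, not DL_MESO's CUDA implementation.
import numpy as np
from collections import defaultdict
from itertools import product

def build_pairs(positions, box, rcut):
    """Return index pairs (i, j), i < j, closer than rcut (no PBC)."""
    ncell = max(1, int(box // rcut))
    cell_size = box / ncell
    cells = defaultdict(list)
    for i, r in enumerate(positions):
        # assign each particle to its cell
        cells[tuple((r // cell_size).astype(int))].append(i)
    pairs = set()
    for cell, members in cells.items():
        # only search this cell and its 26 neighbours
        for offset in product((-1, 0, 1), repeat=3):
            other = tuple(c + o for c, o in zip(cell, offset))
            for i in members:
                for j in cells.get(other, ()):
                    if i < j and np.linalg.norm(positions[i] - positions[j]) < rcut:
                        pairs.add((i, j))
    return sorted(pairs)

positions = np.array([[0.1, 0.1, 0.1],
                      [0.2, 0.1, 0.1],
                      [2.0, 2.0, 2.0]])
pairs = build_pairs(positions, box=3.0, rcut=0.5)
```

This reduces the pair search from O(N^2) to roughly O(N) for homogeneous systems, which is what makes billion-particle runs like the one described above feasible at all.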
6 software modules delivered in the area of Quantum Dynamics
In this report for Deliverable 3.5 of E-CAM [1], 6 software modules in quantum dynamics are presented.
All modules stem from the activities initiated during the State-of-the-Art Workshop held in Lyon (France) in June 2019 and the Extended Software Development Workshop in Quantum Dynamics, held at Durham University (UK) in July 2019. The modules originate from the input of E-CAM’s academic user base. They have been developed by members of the project (S. Bonella – EPFL), established collaborators (G. Worth – University College London, S. Gomez – University of Vienna, C. Sanz – University of Madrid, D. Lauvergnat – University of Paris-Sud) and new contributors to the E-CAM repository (F. Agostini – University of Paris-Sud, B. Curchod – University of Durham, A. Schild – ETH Zurich, S. Hupper and T. Plé – Sorbonne University, G. Christopoulou – University College London). The presence of new contributors indicates the interest of the community in our efforts. Furthermore, the contributors to modules in WP3 continue to be at different stages of their careers (in particular, T. Plé and G. Christopoulou are PhD students), highlighting the training value of our activities.
Following the order of presentation, the 6 modules are named: CLstunfti, PIM_QTB, PerGauss, Direct Dynamics Database, Exact Factorization Analysis Code (EFAC), and GuessSOC. In this report, a short description is given for each module, followed by a link to the respective Merge-Request document on the GitLab service of E-CAM. These merge requests contain detailed information about the code development, testing and documentation of the modules.
[1] “D3.5.: Quantum dynamics e-cam modules IV,” Dec. 2019. [Online]. Available: https://doi.org/10.5281/zenodo.3598325
Full report available here.
PANNA: Properties from Artificial Neural Network Architectures
PANNA is a package for training and validating neural networks to represent atomic potentials. It implements configurable all-to-all connected deep neural network architectures which allow for the exploration of training dynamics. Currently it includes tools to enable original[1] and modified[2] Behler-Parrinello input feature vectors, both for molecules and crystals, but the network can also be used in an input-agnostic fashion to enable further experimentation. PANNA is written in Python and relies on TensorFlow as its underlying engine.
A common way to use PANNA in its current implementation is to train a neural network to estimate the total energy of a molecule or crystal, as a sum of atomic contributions, by learning from reference total-energy calculations (usually ab initio) for similar structures.
The neural network models in the literature often start from a description of the system of interest in terms of local feature vectors for each atom in the configuration. PANNA provides tools to calculate two versions of the Behler-Parrinello local descriptors, but allows the use of any species-resolved, fixed-size array that describes the input data.
PANNA allows the construction of neural network architectures with different sizes for each of the atomic species in the training set. Currently the allowed architecture is a deep neural network of fully connected layers, starting from the input feature vector and going through one or more hidden layers. The user can choose to train or freeze any layer, and can also transfer network parameters between species upon restart.
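The architecture described above can be illustrated with a toy numpy sketch: one small fully connected network per chemical species maps each atom's descriptor (G-vector) to an atomic energy, and the total energy is the sum over atoms. This is only a schematic analogue of what PANNA trains in TensorFlow; the layer sizes, weights, and species below are illustrative, not PANNA's defaults.

```python
# Toy sketch of a per-species summed-atomic-energy network
# (Behler-Parrinello style); illustrative, not PANNA code.
import numpy as np

def atomic_network(g_vector, weights):
    """Feed one atom's descriptor through a small MLP -> atomic energy."""
    h = g_vector
    for W, b in weights[:-1]:
        h = np.tanh(h @ W + b)          # hidden layers
    W, b = weights[-1]
    return (h @ W + b).item()           # linear output: atomic energy

def total_energy(atoms, networks):
    """Sum atomic contributions; each species has its own network."""
    return sum(atomic_network(g, networks[species]) for species, g in atoms)

# Two species, descriptor size 4, one hidden layer of 8 units
rng = np.random.default_rng(0)
def make_net():
    return [(rng.normal(size=(4, 8)), np.zeros(8)),
            (rng.normal(size=(8, 1)), np.zeros(1))]

networks = {"H": make_net(), "O": make_net()}
water = [("O", rng.normal(size=4)),
         ("H", rng.normal(size=4)),
         ("H", rng.normal(size=4))]
energy = total_energy(water, networks)
```

Because the total energy is a sum of per-atom terms, the same trained networks can be applied to configurations with any number of atoms, which is what makes this family of models transferable across system sizes.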
In summary, PANNA is an easy-to-use interface for obtaining neural network models of atomistic potentials, leveraging the highly optimized TensorFlow infrastructure to provide efficient, parallelized, GPU-accelerated training.
It provides:
- an input creation tool (atomistic calculation result -> G-vector )
- an input packaging tool for quick processing of TensorFlow ( G-vector -> TFData bundle)
- a network training tool
- a network validation tool
- a LAMMPS plugin
- a bundle of sample data for testing[3]
See the full documentation of PANNA at https://gitlab.com/PANNAdevs/panna/blob/master/doc/PANNA_documentation.md
GitLab repository for PANNA: https://gitlab.com/PANNAdevs/panna
See manuscript at https://arxiv.org/abs/1907.03055