Implementation of neural network potentials for coarse-grained models

Neural network potentials (NNPs) [1] have demonstrated the effectiveness of machine-learning tools in atomistic simulations. The approach, based on artificial neural networks trained to accurately reproduce ab initio potential energy surfaces, offers two major advantages. First, the computational effort to calculate energies and forces is drastically reduced compared to the underlying reference method, which makes it possible to sample large system sizes and long time scales in molecular dynamics simulations. Second, unlike empirical potentials, the neural network at the heart of the method is not limited by an approximate functional form but can flexibly adjust to the reference potential energy surface. Recently, it has been proposed to extend the machine-learning potential approach to the construction of coarse-grained models [2]. In this pilot project we apply this methodology to neural network potentials and develop software based on the existing package n2p2. To demonstrate the implementation, we intend to construct a coarse-grained model for dendrimer-like DNA molecules [3].

[1] Behler, J.; Parrinello, M. Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces. Phys. Rev. Lett. 2007, 98 (14), 146401.

[2] Zhang, L.; Han, J.; Wang, H.; Car, R.; E, W. DeePCG: Constructing Coarse-Grained Models via Deep Neural Networks. J. Chem. Phys. 2018, 149 (3), 034101.

[3] Jochum, C.; Adžić, N.; Stiakakis, E.; Derrien, T. L.; Luo, D.; Kahl, G.; Likos, C. N. Structure and Stimuli-Responsiveness of All-DNA Dendrimers: Theory and Experiment. Nanoscale 2019, 11 (4), 1604–1617.

List of Tasks

1. Implement Python tools to generate coarse-grained data sets from fully atomistic ones.
2. Devise and implement a procedure to estimate the effectiveness of the coarse-grained description with atomic environment descriptors.
3. Train and evaluate NNP-CG models for simple (water) and complex (DNA dendrimer) systems.
4. (optional) Improve CG models by inclusion of additional degrees of freedom (e.g. orientation, more particle types).
5. Allow large-scale MD simulations on GPUs via LAMMPS and Kokkos.
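
As an illustration of the first task, the following minimal sketch maps one molecule of a fully atomistic configuration onto a single coarse-grained bead. It assumes a common mapping choice that the task list does not specify: the bead is placed at the molecule's center of mass and carries the sum of its atomic forces. The `Atom` class, the `coarse_grain` function and the sample water coordinates are hypothetical and not part of n2p2. The sketch also assumes that the molecule is not split across periodic boundaries.

```python
from dataclasses import dataclass

# Atomic masses in atomic mass units; only the elements needed here.
MASSES = {"O": 15.999, "H": 1.008}


@dataclass
class Atom:
    element: str
    position: tuple  # (x, y, z)
    force: tuple     # (fx, fy, fz)


def coarse_grain(molecule):
    """Map one molecule to a single CG bead.

    Returns (position, force): the bead sits at the center of mass and
    carries the sum of all atomic forces of the molecule.
    """
    total_mass = sum(MASSES[a.element] for a in molecule)
    position = tuple(
        sum(MASSES[a.element] * a.position[k] for a in molecule) / total_mass
        for k in range(3)
    )
    force = tuple(sum(a.force[k] for a in molecule) for k in range(3))
    return position, force


# Hypothetical single water molecule (arbitrary but consistent units).
water = [
    Atom("O", (0.0, 0.0, 0.0), (0.3, 0.0, -0.1)),
    Atom("H", (0.96, 0.0, 0.0), (-0.1, 0.2, 0.0)),
    Atom("H", (-0.24, 0.93, 0.0), (-0.1, -0.1, 0.2)),
]
bead_position, bead_force = coarse_grain(water)
print(bead_position, bead_force)
```

Applying such a mapping to every molecule of every configuration turns an atomistic training set into a coarse-grained one with one position and one force per bead.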

List of Modules

Descriptor analysis

Status: Ready (to be merged into E-CAM documentation)

Expected delivery date: September 2020

Description: The overall goal of the analysis is to show qualitatively whether there is a correlation between the raw atomic environment descriptors (and their derivatives) and the atomic forces. If little or no correlation can be found, we can assume that the descriptors do not encode enough information to construct a (free) energy landscape. On the other hand, if "similar" descriptors correspond to "similar" forces, there is a good chance that a machine learning algorithm can detect this link and that a machine learning potential can be fitted. To find a possible correlation between descriptors and forces, the following approach is used: First, a clustering algorithm (k-means or HDBSCAN) searches for groups in the high-dimensional descriptor space of all atoms. Then, for every detected cluster, the statistical distribution of the corresponding atomic forces is compared to the statistics of all remaining atomic forces. A hypothesis test (Welch's t-test) is applied to decide whether the link between descriptors and forces is statistically significant. The percentage of clusters that show a clear link then serves as an indicator of a good descriptor-force correlation.
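
The procedure described above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the module's implementation: all function names are hypothetical, a plain Lloyd k-means stands in for the clustering step, force magnitudes are used as the compared quantity, and the p-value of Welch's t-test is approximated with a normal distribution, which is only reasonable for large samples. A real analysis would more likely rely on established library routines, for example `scipy.stats.ttest_ind` with `equal_var=False`. The data are synthetic.

```python
import math
import random
from statistics import NormalDist, mean, variance


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Move every center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster runs empty
                centers[c] = tuple(mean(x) for x in zip(*members))
    return labels


def welch_p_value(a, b):
    """Two-sided p-value for Welch's t statistic (normal approximation)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 1.0 if mean(a) == mean(b) else 0.0
    t = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def linked_cluster_fraction(descriptors, forces, k, alpha=0.05):
    """Fraction of clusters whose forces differ from all remaining forces."""
    labels = kmeans(descriptors, k)
    linked = tested = 0
    for c in range(k):
        inside = [f for f, lab in zip(forces, labels) if lab == c]
        outside = [f for f, lab in zip(forces, labels) if lab != c]
        if len(inside) < 2 or len(outside) < 2:
            continue  # not enough data to estimate a variance
        tested += 1
        if welch_p_value(inside, outside) < alpha:
            linked += 1
    return linked / tested if tested else 0.0


# Synthetic data: two descriptor groups with different force magnitudes.
rng = random.Random(1)
descriptors, forces = [], []
for center, force_level in [((0.0, 0.0), 1.0), ((5.0, 5.0), 2.0)]:
    for _ in range(100):
        descriptors.append(tuple(rng.gauss(c, 0.1) for c in center))
        forces.append(rng.gauss(force_level, 0.1))

fraction = linked_cluster_fraction(descriptors, forces, k=2)
print(f"clusters with a significant descriptor-force link: {fraction:.0%}")
```

A high fraction suggests that the descriptors carry information about the forces. A fraction near the significance level `alpha` is what one would expect from chance alone.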

Symmetry Function Memory Footprint Reduction

Status: Ready

Expected delivery date: July 2020

Description: This module improves memory management in n2p2. More specifically, it implements a new strategy to store symmetry function derivatives, which drastically reduces the memory footprint during training. The idea is to exploit the fact that, in a multi-element system, the symmetry function derivatives are always zero for specific combinations of neighboring atoms. By automatically taking these element combination relations into account, a significant portion of the memory usage can be avoided. Depending on the symmetry function setup, savings of about 30 to 50% can be achieved for typical systems. This improvement is particularly important for the generation of NNP-CG models because the required training data sets can be very large.
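
The counting argument behind this saving can be illustrated with a small sketch. It does not reproduce n2p2's internal data structures. The setup (numbers of radial and angular symmetry functions, neighbor counts) and all function names are hypothetical. The only assumption taken from the description is that the derivative of a symmetry function with respect to a neighbor vanishes whenever the neighbor's element does not appear among the elements that the symmetry function is defined for.

```python
from itertools import combinations_with_replacement


def symmetry_function_setup(elements, n_radial, n_angular):
    """Return one frozenset of neighbor elements per symmetry function."""
    setup = []
    for element in elements:
        # Radial functions depend on neighbors of a single element.
        setup += [frozenset([element])] * n_radial
    for pair in combinations_with_replacement(elements, 2):
        # Angular functions depend on pairs of neighbor elements.
        setup += [frozenset(pair)] * n_angular
    return setup


def derivative_entries(setup, neighbor_elements, skip_zeros):
    """Count stored derivative vectors for one central atom."""
    if not skip_zeros:
        # Naive storage: every symmetry function for every neighbor.
        return len(setup) * len(neighbor_elements)
    # Reduced storage: only combinations that can be nonzero.
    return sum(
        sum(1 for sf in setup if element in sf)
        for element in neighbor_elements
    )


# Hypothetical water-like environment: 40 H and 20 O neighbors.
setup = symmetry_function_setup(["H", "O"], n_radial=8, n_angular=6)
neighbors = ["H"] * 40 + ["O"] * 20

naive = derivative_entries(setup, neighbors, skip_zeros=False)
reduced = derivative_entries(setup, neighbors, skip_zeros=True)
print(f"naive: {naive}, reduced: {reduced}, saved: {1 - reduced / naive:.0%}")
```

In this made-up setup each neighbor contributes to 20 of the 34 symmetry functions, so roughly two fifths of the naive storage is never needed. The actual saving depends on the symmetry function setup and the composition of the system.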

Improved link to HPC software (in particular LAMMPS)

Status: Ready

Expected delivery date: January 2021

Description: This module improves the connection of n2p2 to HPC software, in particular to LAMMPS, by creating a pull request to the official LAMMPS repository. Because of its flexible atom model, LAMMPS is well suited to carrying out simulations based on coarse-grained models. Furthermore, this module covers improvements to the n2p2 build process and the testing of a user-contributed interface to CabanaMD. This experimental software can be considered a precursor of an upcoming GPU implementation of NNPs in LAMMPS via the performance portability library Kokkos.

Polynomial symmetry functions

Status: Ready

Expected delivery date: December 2020

Description: The substantial code changes covered by this module introduce a new set of atomic environment descriptors for high-dimensional neural network potentials (HDNNPs) in n2p2. Polynomial symmetry functions are designed to closely mimic the behavior of traditional Behler-Parrinello symmetry functions at a significantly reduced computational cost. These alternative descriptors may be particularly suited to generating NNP-based CG models because they allow more fine-grained control of the functional form in the angular domain. Many example applications and benchmark results are presented in a recent publication.
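
To make the cost argument concrete, the sketch below contrasts a Gaussian-type radial term with a cosine cutoff, as commonly used in Behler-Parrinello symmetry functions, with a generic compactly supported polynomial bump. The polynomial and all function and parameter names are illustrative assumptions and do not necessarily match the functional forms implemented in n2p2. The point is only that the polynomial variant needs no exponential or trigonometric evaluations and is exactly zero outside a chosen interval. The same compact-support idea can be applied to the angular part.

```python
import math


def gaussian_radial(r, eta, r_shift, r_cut):
    """Gaussian radial term times a cosine cutoff function."""
    if r >= r_cut:
        return 0.0
    cutoff = 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)
    return math.exp(-eta * (r - r_shift) ** 2) * cutoff


def smooth_step_down(x):
    """Polynomial falling from 1 at x = 0 to 0 at x = 1."""
    return ((15.0 - 6.0 * x) * x - 10.0) * x**3 + 1.0


def polynomial_radial(r, r_low, r_high):
    """Polynomial bump that is nonzero only for r_low < r < r_high."""
    center = 0.5 * (r_low + r_high)
    half_width = 0.5 * (r_high - r_low)
    x = abs(r - center) / half_width
    return smooth_step_down(x) if x < 1.0 else 0.0


def radial_descriptor(distances, term):
    """Sum a radial term over all neighbor distances of one atom."""
    return sum(term(r) for r in distances)


# Hypothetical neighbor distances of one atom (arbitrary length unit).
distances = [1.0, 1.6, 2.9]
g_gauss = radial_descriptor(
    distances, lambda r: gaussian_radial(r, eta=1.0, r_shift=1.5, r_cut=4.0)
)
g_poly = radial_descriptor(
    distances, lambda r: polynomial_radial(r, r_low=0.5, r_high=2.5)
)
print(g_gauss, g_poly)
```

Because the bump is confined to an explicit interval, its position and width can be set directly, whereas the shape of the Gaussian variant results from the interplay of width, shift and cutoff radius.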

WP1 Classical MD

Dr. Andreas Singraber
