- Jony Castagna (Science and Technology Facilities Council (STFC))
- Donal MacKernan (University College Dublin)
This is an NVidia Deep Learning Institute (DLI) course.
This workshop introduces the fundamental concepts of Deep Learning: how it works and how it is applied to computer vision tasks. It covers basic knowledge of neural network layers, convolutional neural networks, data augmentation and other commonly used techniques.
Prerequisites and Registration
A working knowledge of Python, including NumPy, is a required prerequisite for this course.
Registration is obligatory as the number of places is limited to 20. Register at https://www.cecam.org/workshop-details/1023
- Rene Halver
- Stephan Schulz
- Godehard Sutmann
Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Interdisciplinary Centre for Advanced Materials Simulation (ICAMS), University of Bochum, Germany
Scalability of parallel applications depends on a number of characteristics, among them efficient communication, equal distribution of work and efficient data layout. Especially for methods based on domain decomposition, which is standard for, e.g., molecular dynamics, dissipative particle dynamics or particle-in-cell methods, unequal load is to be expected where particles are not distributed homogeneously, where interaction calculations differ in cost, or where heterogeneous architectures are used, to name a few cases. In these scenarios the code has to decide how to redistribute the work among processes according to a work-sharing protocol, or how to dynamically adjust the computational domains, in order to balance the workload.
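The idea of adjusting computational domains can be sketched in a few lines. The example below is a minimal illustration only, assuming a one-dimensional decomposition and using the particle count as the measure of work; the function names `balanced_boundaries` and `counts_per_domain` are hypothetical and are not part of the ALL library. Domain boundaries are placed at quantiles of the particle positions, so that each process owns roughly the same number of particles.

```python
from bisect import bisect_right
import random


def balanced_boundaries(positions, n_domains, lo, hi):
    """Return n_domains + 1 boundaries in [lo, hi] giving near-equal particle counts."""
    xs = sorted(positions)
    n = len(xs)
    if n < n_domains:
        raise ValueError("need at least one particle per domain")
    bounds = [lo]
    for k in range(1, n_domains):
        i = k * n // n_domains  # index of the first particle owned by domain k
        # Place the cut midway between the two neighbouring particles.
        bounds.append(0.5 * (xs[i - 1] + xs[i]))
    bounds.append(hi)
    return bounds


def counts_per_domain(positions, bounds):
    """Count the particles falling into each interval [bounds[d], bounds[d + 1])."""
    counts = [0] * (len(bounds) - 1)
    for x in positions:
        d = min(bisect_right(bounds, x) - 1, len(counts) - 1)
        counts[d] += 1
    return counts


random.seed(0)
# Inhomogeneous system (illustrative data): 900 particles clustered in
# [0.1, 0.3] plus 100 particles spread over the whole box [0, 1).
particles = [random.uniform(0.1, 0.3) for _ in range(900)]
particles += [random.random() for _ in range(100)]

uniform_bounds = [i / 4 for i in range(5)]
balanced_bounds = balanced_boundaries(particles, 4, 0.0, 1.0)

uniform_counts = counts_per_domain(particles, uniform_bounds)
balanced_counts = counts_per_domain(particles, balanced_bounds)
print("uniform :", uniform_counts)
print("balanced:", balanced_counts)
```

With equal-width domains most particles land in a single domain, whereas the quantile-based boundaries give each domain about a quarter of the particles. A production scheme would also have to account for communication cost and for how often boundaries are moved, which this sketch ignores.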
The seminar will provide an overview of the motivation and ideas behind various methods and implementations at the level of tensor product decomposition, staggered grids, non-homogeneous mesh decomposition and a recently developed phase field approach. The implementation of several methods in the load balancing library ALL, which has been developed in the Centre of Excellence E-CAM, is presented. A use case is shown for the Materials Point Method (MPM), an Euler-Lagrange method for materials simulations on the macroscopic level that solves continuous materials equations.
The seminar is organised in three main parts:
- Overview of Load Balancing
- The ALL Load Balancing Library
- Balancing the Materials Point Method with ALL
The event will start at 14:00 CET on the 11th December 2020, and is expected to last 2h. It will run online via Zoom for registered participants, and it will be live streamed via YouTube at https://youtu.be/-LCDEnYoFiQ.
- Jony Castagna
- Michael Seaton
- Silvia Chiacchiera
- Leon Petit
STFC Daresbury Laboratory, United Kingdom
Mesoscale simulations have recently grown in importance due to their capacity to capture molecular and atomistic effects without having to solve for the prohibitively large number of particles needed in Molecular Dynamics (MD) simulations. Different approaches, emerging from a coarse approximation to a group of atoms and molecules, allow reproducing both the main chemical and physical properties and continuum behaviour such as the hydrodynamics of fluid flows.
One of the most common techniques is Dissipative Particle Dynamics (DPD): an approximate, coarse-grained, mesoscale simulation method for investigating phenomena between the atomic and the continuum scales, such as flows through complex geometries, microfluidics, phase behaviour and polymer processing. It is an off-lattice, discrete particle method similar to MD, but with a soft potential for the conservative force, a random force to simulate the Brownian motion of the particles, and a drag force that balances the random force while conserving the total momentum of the system.
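The three force contributions described above can be written down compactly for a single pair of particles. The sketch below assumes the commonly used Groot-Warren form, with a linear weight function and the random and drag amplitudes linked by the fluctuation-dissipation relation sigma^2 = 2 * gamma * kT. The function name `dpd_pair_force` and the default parameter values are illustrative assumptions, not taken from DL_MESO.

```python
import math
import random


def dpd_pair_force(ri, rj, vi, vj, a=25.0, gamma=4.5, kT=1.0, rc=1.0,
                   dt=0.01, rng=random):
    """Force on particle i due to particle j (assumed Groot-Warren form)."""
    rij = [p - q for p, q in zip(ri, rj)]
    r = math.sqrt(sum(c * c for c in rij))
    if r >= rc or r == 0.0:
        return [0.0, 0.0, 0.0]          # no interaction beyond the cutoff
    e = [c / r for c in rij]            # unit vector pointing from j to i
    vij = [p - q for p, q in zip(vi, vj)]
    w = 1.0 - r / rc                    # random-force weight; drag uses w**2
    sigma = math.sqrt(2.0 * gamma * kT)  # fluctuation-dissipation relation
    f_cons = a * w                                            # soft repulsion
    f_drag = -gamma * w * w * sum(x * y for x, y in zip(e, vij))
    f_rand = sigma * w * rng.gauss(0.0, 1.0) / math.sqrt(dt)  # Brownian kick
    total = f_cons + f_drag + f_rand
    return [total * c for c in e]


random.seed(1)
f_ij = dpd_pair_force([0.5, 0.0, 0.0], [0.0, 0.0, 0.0],
                      [0.1, 0.0, 0.0], [0.0, 0.0, 0.0])
# Momentum conservation: particle j receives the opposite force, reusing the
# same random number that was drawn for this pair.
f_ji = [-c for c in f_ij]
print(f_ij, f_ji)
```

Because every contribution acts along the line joining the two particles and is applied with opposite signs to each of them, the pair exchanges momentum without changing the total.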
However, real applications usually involve a large number of particles and, despite the coarse-grained approximation compared to MD, High Performance Computing (HPC) is often required to simulate systems of industrial and scientific interest. At the same time, today’s hardware is quickly moving towards hybrid CPU-GPU architectures. In fact, five of the top ten supercomputers combine CPUs with NVidia GPU accelerators, which allows them to achieve hundreds of PetaFlops of performance. This type of architecture is also one of the main paths toward Exascale.
Few software packages, such as DL_MESO, userMESO and LAMMPS, can currently run large DPD simulations. In particular, DL_MESO has recently been ported to multi-GPU architectures and runs efficiently on up to 4096 GPUs. This allows very large systems with billions of particles to be investigated with affordable computational effort. However, additional effort is required to enable the current version to cover more complex physics, such as long-range forces, and to achieve higher parallel computing efficiency.
The purpose of this Extended Software Development Workshop (ESDW) is to introduce students to the parallel programming of hybrid CPU-GPU systems. The intention is not only to port mesoscale solvers to GPUs, but also to expose the community to this new programming paradigm, which they can benefit from in their own fields of research. See more details about the topics covered by this ESDW under the tab “Program”.
- Seaton M.A. et al. “DL_MESO: highly scalable mesoscale simulations” Molecular Simulation (39) 2013
- Castagna J. et al. “Towards Extreme Scale using Multiple GPGPUs in Dissipative Particle Dynamics Simulations”, The Royal Society poster session on “Numerical algorithm for high performance computational science” (2019)
- Castagna J. et al. “Towards extreme scale dissipative particle dynamics simulations using multiple GPGPUs”, Computer Physics Communications (2020) 107159, https://doi.org/10.1016/j.cpc.2020.107159
- Alan O’Cais (Jülich Supercomputing Centre)
- David Swenson (École normale supérieure de Lyon)
Online event. More information and registration available through the CECAM website for the event at https://www.cecam.org/workshop-details/1022
Dask is a powerful Python tool for task-based computing. The Dask library was originally developed to provide parallel and out-of-core versions of common data analysis routines from data analysis packages such as NumPy and Pandas. However, the flexibility and usefulness of the underlying scheduler has led to extensions that enable users to write custom task-based algorithms, and to execute those algorithms on high-performance computing (HPC) resources.
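To make the notion of a custom task-based algorithm concrete, the sketch below expresses a small computation as a dictionary that maps names to either plain values or `(function, argument, ...)` tuples, in the spirit of the dictionary-based task graphs described in the Dask documentation. The evaluator `evaluate` is a toy written for this illustration using only the standard library. It is not Dask's scheduler and does not use the Dask API.

```python
from operator import add, mul


def evaluate(graph, key, cache=None):
    """Evaluate `key` in a dict-based task graph, computing each task once."""
    cache = {} if cache is None else cache
    if key in cache:
        return cache[key]
    task = graph[key]
    if isinstance(task, tuple) and task and callable(task[0]):
        func, *args = task
        # String arguments that name another entry of the graph are
        # dependencies and are evaluated first; anything else is a literal.
        resolved = [
            evaluate(graph, arg, cache)
            if isinstance(arg, str) and arg in graph else arg
            for arg in args
        ]
        value = func(*resolved)
    else:
        value = task  # plain data stored directly in the graph
    cache[key] = value
    return value


# Hypothetical graph: result = (x + y) * (x + y) + 1
graph = {
    "x": 2,
    "y": 3,
    "sum": (add, "x", "y"),
    "prod": (mul, "sum", "sum"),
    "result": (add, "prod", 1),
}

result = evaluate(graph, "result")
print(result)
```

This toy runs tasks one after another. The point of a real scheduler is that tasks with no mutual dependencies can be dispatched to different workers, which is what makes the model attractive on HPC resources.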
This workshop will be a series of virtual seminars/tutorials on tools in the Dask HPC ecosystem. The event will run online via Zoom for registered participants (“participate” tab) and it will be live streamed via YouTube at https://youtube.com/playlist?list=PLmhmpa4C4MzZ2_AUSg7Wod62uVwZdw4Rl.
- 21 January 2021, 3pm CET (2pm UTC): Dask – a flexible library for parallel computing in Python
YouTube link: https://youtu.be/Tl8rO-baKuY
- 4 February 2021, 3pm CET (2pm UTC): Dask-Jobqueue – a library that integrates Dask with standard HPC queuing systems, such as SLURM or PBS
YouTube link: https://youtu.be/iNxhHXzmJ1w
- 11 February 2021, 3pm CET (2pm UTC): Jobqueue-Features – a library that enables functionality aimed at enhancing scalability
YouTube link: https://youtu.be/FpMua8iJeTk
Registration is not required to attend, but registered participants will receive email reminders before each seminar.
- Donal MacKernan
University College Dublin, Ireland
- Brian Glennon
University College Dublin & SSPC, Ireland
- Erik Santiso
North Carolina State University, USA
- Fernando Luís Barroso da Silva
University of São Paulo, Brazil
This event will have a first part held online, followed by a face-to-face follow-up meeting at University College Dublin 3-7 months later, when health conditions permit.
The first part of the workshop will take place on Feb 25 (Thurs), Mar 2 (Tues), Mar 23 (Tues), and Mar 25 (Thurs) 2021, starting at 3 PM UTC (Dublin/London) and running for approximately 3 hours each day.
Each afternoon session will be a mixture of overview talks, shorter specialised talks, and discussion. The event will run online via Zoom for registered participants.
Day 1 (3 hours)
- Open systems and rare event methods
- Industry Challenges
Day 2 (3 hours)
- State of the Art
- Practical Solutions
Day 3 (3 hours)
- State of the Art & New Approaches
- Simulation engines and sampling software libraries and Scaling Considerations on Massively Parallel Machines
Day 4 (3 hours)
- Plan for next (Face to Face) meeting and work to be done during the intermission
- Outline of workshop highlights so far
The thermodynamic constraints which best reflect the conditions of many experiments and industrial processing correspond to fixed chemical potentials, pressure and temperature where particle number can fluctuate (from a statistical perspective this is known as the “Grand Canonical Ensemble”), rather than fixed particle number, pressure and temperature, yet most simulation methods in the condensed phase enforce the latter. For instance, many activated processes of relevance to the chemical and pharmaceutical industries occur at constant concentration, e.g. crystallization usually happens at constant supersaturation (Liu et al. 2018, Perego et al. 2015); catalysis usually involves porous materials, where, if diffusion is fast compared to reaction, the appropriate ensemble is the grand canonical. Chemical reactions in solution usually happen at a given concentration of reactants, and many biological processes, to be understood properly, need to be modeled at constant pH (CpH) (Barroso da Silva and Dias, 2017; Barroso Da Silva and Jönsson, 2009; de Vos et al., 2010; Jönsson et al., 2007; Kirkwood and Shumaker, 1952). The binding free energy of proteins onto nano surfaces, such as titanium dioxide, can depend on the local charge on the surface due to the binding, for example, of hydroxyl ions, which in turn depends on the relative concentrations of hydroxyl ions in water and cannot be fully described in systems where particle number cannot fluctuate. Similarly, the binding of antibodies to antigens or nanocarriers (in the context of drug delivery systems) strongly depends on electrostatic effects (Gunner and Baker, 2016; Han et al., 2010; Ivanov et al., 2017; Li et al., 2015; Poveda-Cuevas et al., 2018). These sorts of effects are important for nano-toxicology, food processing, immuno-diagnostics, and drug delivery.
Their study through simulation is further complicated when large free energy barriers exist between key metastable states corresponding, for example, to bound and unbound configurations of a ligand at a binding site; a crystal phase and an amorphous phase; a folded and an unfolded protein; or different charge configurations of titratable sites of pH-sensitive proteins in solution (Barroso da Silva and MacKernan, 2017; Barroso da Silva et al., 2019, p. 6; Barroso Da Silva and Jönsson, 2009; Jönsson et al., 2007).
In recent years there have been many developments on methods to study rare events, but these methods usually rely on biasing molecular dynamics with fixed particle number, and are difficult to adapt to ensembles where numbers of particles fluctuate.
Recently, some emerging approaches have tackled the question of modeling rare events at constant chemical potential, for example using the String Method in Collective Variables in the osmotic ensemble to model crystal nucleation at constant supersaturation, but these approaches are still in their infancy. There is a clear need to further develop these approaches and come up with new ideas to study rare events in open systems. In a similar way, the modeling of pH-related processes has been attracting scientific interest, and a diversity of CpH methods and protocols is now available, from DFT molecular dynamics to coarse-grained Monte Carlo simulations (Baptista et al., 2002; Bennett et al., 2013; Srivastava et al., 2017). Indeed a key difficulty for many CpH methods of macromolecules with explicit solvents is the presence of very high free energy barriers existing between different charge configurations…
Given the importance of open systems including CpH to industrial processing, and at the same time, the fundamental questions these pose to rare-event methods, the current proposal envisages a combined E-CAM industry scoping and research workshop.
The objectives of this meeting are to: (i) provide industry participants with a summary of state-of-the-art simulation and rare-event methods at fixed chemical potential; (ii) provide academic research scientists with a perspective on the key challenges that industry faces in this context; (iii) allow for a fundamental review of the statistical foundations of rare-event methods in the context of fixed chemical potentials; (iv) determine the means by which corresponding simulations in the condensed phase can be practically implemented using or adapting popular community simulation engines such as LAMMPS, Gromacs or NAMD, and free energy software such as PLUMED. The possibility of implementing such methods for ab-initio molecular dynamics will also be assessed.
Download event flyer: https://www.e-cam2020.eu/wp-content/uploads/2021/02/Flyer_E-CAM_STFC_event.pdf
1. State of the Art
Innovative and effective interactions with industrialists are one of the pillars of the E-CAM project. In the original E-CAM proposal, two main vehicles to promote these interactions were indicated: collaborative pilot projects matching E-CAM funded human resources with investigative research and software developments directly connected with an industrial partner’s need, and scoping workshops. The latter were to combine presentation of simulation and modelling in areas directly connected to E-CAM’s broad expertise, with open discussion sessions and workgroups involving E-CAM’s participants and industrial partners to identify new collaborative activities and directions for software development of industrial interest. In order to increase industrial involvement in these workshops, industrial researchers have also been involved as co-organisers in meetings with a particular focus of industrial interest.
To further expand the portfolio of activities targeted at industrialists, E-CAM has established a series of new events targeted at training interested industrial researchers on the simulation and modelling techniques implemented in specific codes and in the direct use of this software for their industrial applications. Preliminary discussions within the consortium focussed interest on codes that have played a flagship role in the project and that already have notable industrial interest or are perceived to have significant potential in this domain. Codes exploited by small software or service vendor companies in simulation and modelling are also of particular interest, in view of the additional bonus to foster collaborations with these SMEs, another major target of E-CAM’s industrial strategy.
This application details the first proposal for new E-CAM industrial training events, focussing on the area of meso- and multiscale simulations (Workpackage 4) and on the flagship code DL_MESO.
2. Event Description
In this workshop we will introduce DL_MESO: a software package for mesoscale simulations based on the Dissipative Particle Dynamics (DPD) and Lattice Boltzmann Equation methodologies. The intention is to gradually present the usage of the software, starting with tutorials based on theoretical background and following up with hands-on sessions. We will focus on the DPD methodology, exploring the different capabilities of the DPD code in DL_MESO (DL_MESO_DPD) in order of growing complexity via practical examples that reflect daily industrial challenges: moving from simple soft repulsive (Groot-Warren) interactions to systems with electrostatic potentials. Particular attention will be paid to the problem of parametrization and how to obtain the best results, as well as interpreting simulation outputs.
Following the current growing usage of General-Purpose Graphic Processing Units (hereafter GPUs) as computing accelerators, we will introduce the GPU version of DL_MESO to speed up your applications. This is a rewritten version of the DPD code in the CUDA language to enable the best possible performance on NVidia GPU cards. However, users will not need to code or modify any sections of DL_MESO_DPD as this GPU version is fully transparent and compatible with the master version, which is designed for use with standard computing hardware.
The participants will be able to run their simulations on the Hartree Centre supercomputer GPU nodes and considerably reduce the computing time as well as increasing the problem system size. This will allow participants to move towards real industrial applications, where the number of particles and computational costs are usually prohibitive on a common laptop.
The one-day GPU section will introduce the NVidia GPU hardware and the different market options, with the pros and cons of the different products, which will enable users to make the best choice for their industrial scenario and find the ideal trade-off between cost and productivity. Moreover, it will focus on the setup of the GPU software environment needed to run the DL_MESO_DPD solver on accelerators, as well as on the current limitations of the GPU version.
3. Industrial use cases for DL_MESO DPD
The Dissipative Particle Dynamics (DPD) code in DL_MESO (DL_MESO_DPD) has been used for a wide range of problems of both scientific and industrial interest: to date, more than 120 journal articles have cited the article describing DL_MESO. Within UKRI STFC itself, DL_MESO_DPD was the “simulation engine” for the Computer Aided Formulation (CAF) project. This was a £1 million Technology Strategy Board project involving three industrial partners – Unilever, Syngenta and Infineum – to develop DPD parameterisation strategies and simulation protocols to predict important properties of newly-devised surfactant-based formulations, e.g. alkyl sulphates used in detergents. The direct outputs from this project included additional functionalities being implemented in DL_MESO_DPD, a DPD parameterisation scheme and a corresponding set of interaction parameters based on matching water/octanol partition coefficients, a method to calculate critical micelle concentrations from DPD simulations and a new particle simulation analysis toolkit, UMMAP.
Further projects based on the work completed for the CAF project have subsequently been carried out by and/or with the STFC Hartree Centre and IBM Research Europe, all using DL_MESO_DPD, UMMAP and other in-house tools. These projects include devising more efficient parameterisation techniques using machine learning, studying other types of surfactants (e.g. alkyl ethoxylates, poly(ethylene oxide) alkyl ethers) and their adsorption onto chemically heterogeneous surfaces, characterising worm-like and branched micelles and devising a DPD model for alkanes that can incorporate solidification effects (i.e. wax formation). An STFC spinout venture company, Formeric, has also been formed to help industrial users to study their own formulated projects, primarily by developing a software platform to make it easier for them to access DPD simulations and modelling tools.
| Day 1 | Day 2 part (1) | Day 2 part (2) | Day 3 | Day 4 (optional) |
| Introduction to DPD and DL_MESO | DPD parametrisation strategies | Electrostatics and surfaces | Accelerating your simulation with DL_MESO on GPU | Set up your own simulations |
Day 1, Monday 1st March
Introduction to DPD and DL_MESO
09:00 – 11:00 Background and theory
11:00 – 11:30 Break
11:30 – 12:30 Applications
12:30 – 13:30 Break
13:30 – 15:30 Introduction to DL_MESO and DL_MESO_DPD
15:30 – 16:00 Break
16:00 – 17:00 Hands-on session: access/compile DL_MESO_DPD and try running a few test cases
Day 2, Tuesday 2nd March
Part (1): DPD parametrisation strategies
09:00 – 09:30 Background and theory
09:30 – 10:30 Interaction parameters
10:30 – 11:00 Break
11:00 – 12:00 Matching to experimentally-determined properties
12:00 – 12:45 Hands-on session
Part (2): Electrostatics and surfaces
14:00 – 14:45 Strategies to include charges with DPD particles
14:45 – 15:45 Incorporating charge polarisation effects
15:45 – 16:15 Break
16:15 – 17:15 Surfaces, frozen particle walls and moving boundaries
17:15 – 18:00 Hands-on session
Day 3, Wednesday 3rd March
Accelerating your simulation with DL_MESO on GPU
09:00 – 10:00 Introduction to the GPU version of DL_MESO_DPD
10:00 – 10:30 Break
10:30 – 12:30 Hands-on session: Compile DL_MESO_DPD with CUDA language
12:30 – 13:30 Break
13:30 – 15:30 Hands-on session: try out larger-scale simulations (e.g. parameterisation using partition coefficients)
Day 4, Thursday 4th March
Setting up your own simulations
09:00 – 12:30 Hands-on: getting started on parametrising and running DPD simulations of participants’ own systems
All listed times are in GMT
5. Organizers' biographies
Dr Jony Castagna studied at the University of Calabria “Unical” (Italy) and obtained his PhD on “Direct Numerical Simulation of Turbulent Flows around Complex Geometries” in 2010 (London). After a post-doctoral position at the University of Southampton, he worked for a CFD company, porting its main solver PROMPT to GPU architectures. In 2016 he joined the STFC Hartree Centre at Daresbury Laboratory and is now part of the High Performance Software Engineering group. He ported DL_MESO to multi-GPU architectures under the E-CAM project, as well as several other scientific applications in collaboration with major industrial partners. Jony has been an NVidia Ambassador for the Deep Learning Institute since 2018 and actively gives courses on CUDA, OpenACC and Introduction to Deep Learning. His main research interests are turbulent flow simulations, HPC for hybrid CPU-GPU programming and neural networks for CFD.
Dr Michael Seaton studied Chemical Engineering at the University of Manchester (previously UMIST), obtaining his EngD in 2008 on modelling acoustic fields through heterogeneous media using the mesoscopic lattice Boltzmann equation (LBE) technique. He joined the Scientific Computing Department at UKRI STFC in 2009 and has since led the DL_MESO project as the principal author and maintainer of its general-purpose mesoscale modelling codes, providing code and simulation support for the UK Collaborative Computing Project CCP5 and the EPSRC High-End Computing consortium UKCOMES. Michael has contributed to projects of industrial and technical interest, including the Innovate UK project on Computer Aided Formulation based on property prediction using Dissipative Particle Dynamics (DPD), the Horizon 2020 E-CAM WP4 pilot project on polarizable mesoscopic water models, and code porting efforts to Intel Xeon Phi co-processors for the Intel Parallel Computing Centre at STFC Hartree Centre. He currently leads metadata and ontology development efforts for the Horizon 2020 Virtual Materials Marketplace (VIMMP) project. Michael has extensive experience in development and optimization of software for high-performance computing (HPC), with interests and expertise in mathematical algorithms and applications of mesoscale modelling techniques.
- MA Seaton, RL Anderson, S Metz and W Smith, DL_MESO: highly scalable mesoscale simulations, Mol Simul 39 (10), 796–821 (2013).
- R Anderson, “Accelerating Formulated Product Design by Computer Aided Approaches”, STFC SCD website (2017): https://www.scd.stfc.ac.uk/Pages/Accelerating-Formulated-Product-Design-by-Computer-Aided-Approaches.aspx
- RL Anderson, DJ Bray, A Del Regno, MA Seaton, AS Ferrante and PB Warren, Micelle formation in alkyl sulfate surfactants using dissipative particle dynamics, J Chem Theory Comput 14 (5), 2633–2643 (2018).
- RL Anderson, DJ Bray, AS Ferrante, MG Noro, IP Stott and PB Warren, Dissipative particle dynamics: systematic parametrization using water-octanol partition coefficients, J Chem Phys 147, 094503 (2017).
- MA Johnston, WC Swope, KE Jordan, PB Warren, MG Noro, DJ Bray and RL Anderson, Toward a standard protocol for micelle simulation, J Phys Chem B 120 (26), 6337–6351 (2016).
- DJ Bray, A Del Regno and RL Anderson, UMMAP: a statistical analysis software package for molecular modelling, Mol Simul 46 (4), 308–322 (2020).
- JL McDonagh, A Shkurti, DJ Bray, RL Anderson and EO Pyzer-Knapp, Utilizing machine learning for efficient parameterization of coarse grained molecular force fields, J Chem Inf Model 59 (10), 4278–4288 (2019).
- E Lavagnini, JL Cook, PB Warren, MJ Williamson and CA Hunter, A surface site interaction point method for dissipative particle dynamics parametrization: application to alkyl ethoxylate surfactant self-assembly, J Phys Chem B 124 (24), 5047–5055 (2020).
- MA Johnston, AI Duff, RL Anderson and WC Swope, Model for the simulation of the CnEm nonionic surfactant family derived from recent experimental results, J Phys Chem B 124 (43), 9701–9721 (2020).
- J Klebes, S Finnigan, DJ Bray, RL Anderson, WC Swope, MA Johnston and B O Conchuir, The roles of chemical heterogeneity in surfactant adsorption at solid-liquid interfaces, J Chem Theory Comput 16 (11), 7135–7147 (2020).
- B O Conchuir, K Gardner, KE Jordan, DJ Bray, RL Anderson, MA Johnston, WC Swope, A Harrison, DR Sheehy and TJ Peters, Efficient algorithm for the topological characterization of worm-like and branched micelle structures from simulations, J Chem Theory Comput 16 (7), 4588–4598 (2020).
- DJ Bray, RL Anderson, PB Warren and K Lewtas, Wax formation in linear and branched alkanes with dissipative particle dynamics, J Chem Theory Comput 16 (11), 7109–7122 (2020).
- “Formeric: Accessible Computer Aided Formulation”, website: https://formeric.co.uk
- Andrea Cavalli (Italian Institute of Technology)
- Sergio Decherchi (Italian Institute of Technology)
- Marco Ferrarotti (Istituto Italiano di Tecnologia)
- Walter Rocchia (Istituto Italiano di Tecnologia)
Methods for simulating complex phenomena are increasingly becoming an accepted means of pursuing scientific discovery. Simulation methods play a key role in the molecular sciences in particular. They make it possible to perform virtual experiments and to estimate observables of interest for a wide range of complex phenomena such as protein conformational changes, reactions and protein-ligand binding processes. Means to achieve these goals are continuum modeling (e.g. the Poisson-Boltzmann equation (1)), meso-scale methods (2), or more accurate fully atomistic simulations, either classical (3) or at a quantum level of theory (4). All these techniques share the requirement of solving complex equations that call for adequate computing resources.
Modern computing units are inherently parallel machines in which multiple multi-core CPUs are often paired with one or more accelerators such as GPUs or, more recently, FPGA devices. In this context, High Performance Computing (HPC) is the computer science discipline that specifically addresses the task of optimizing the performance of software through code refactoring and single- or multi-thread and multi-process optimization. Although several excellent codes already exist, the need to properly accelerate simulation codes is still compelling, as several engines do not exploit the actual capabilities of current architectures.
The aim of this E-CAM Extended Software Development Workshop (ESDW) is to introduce the participants to HPC through lectures on computer architectures and applications, and through hands-on sessions in which participants will plan a suitable optimization strategy for one of the selected codes and start optimizing its performance.
This ESDW will thus focus on the technological aspects of HPC optimization/parallelization procedures. It will take place in Genoa, the city where the Italian Institute of Technology is located. The venue will be the Tower Genova Airport Hotel & Conference Center, and the workshop will last one week (5 days).
The primary goal of the workshop will be to show participants the main challenges in code optimization and parallelization, and the correct balance between code readability, long-term maintenance and performance. This will include technical lessons in which parallelization paradigms are explained in detail. A second goal will be to enable participants to find computational bottlenecks within software and to set up an optimization/parallelization strategy, including some initial optimization of the selected codes. This activity will be interleaved with talks by invited speakers presenting examples of HPC-oriented applications.
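Finding a computational bottleneck usually starts with a profile of the running code. The sketch below is one possible first step for a Python code, using the standard-library `cProfile` and `pstats` modules; the functions `slow_pair_sum`, `fast_setup` and `workload` are hypothetical stand-ins for a real application and are not taken from any of the workshop codes.

```python
import cProfile
import io
import pstats


def slow_pair_sum(n):
    """Deliberately quadratic loop standing in for an expensive kernel."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += (i - j) ** 2
    return total


def fast_setup(n):
    """Cheap preparation step, included for contrast in the profile."""
    return list(range(n))


def workload():
    fast_setup(1000)
    return slow_pair_sum(200)


# Collect timing data for everything called inside workload().
profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Sort by cumulative time so the most expensive call chains come first.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
print(report)
```

In the resulting report the quadratic loop dominates the cumulative time, which marks it as the first candidate for refactoring or parallelization. Compiled C/C++/Fortran codes would need a native profiler instead, but the workflow of measuring before optimizing is the same.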
The lessons will cover:
- Modern computing machinery architectures
- Code refactoring and single thread optimizations
- Shared memory architectures and parallelization
- GPU oriented parallelization
We will select two or three codes from those proposed by the applicants during the registration procedure. Applicants may apply with or without proposing a code. Code selection will be based on code quality, breadth of interest for the E-CAM community, and the commitment of the proposers to carry on a long-term optimization project beyond the E-CAM workshop timeframe (the IIT HPC group will support this activity).
We will issue a first call both for applicants bringing codes and for code-agnostic applicants who will participate irrespective of the selected codes. Then, once the two or three codes are selected, a second call will be issued for applicants only. Applicants proposing a code are expected to know the software internals and its usage in detail and to have prepared proper testing/benchmarking input files; codes can be in C/C++/Fortran, or even in Python for an initial porting. Ideally, the code should be serial so that a full optimization strategy can be planned. All attendees are expected to bring their laptops to remotely access the IIT cluster (64 nodes, equipped with GPUs) during the hands-on sessions.
(1) Decherchi, S.; Colmenares, J.; Catalano, C. E.; Spagnuolo, M.; Alexov, E.; Rocchia, W. Between Algorithm and Model: Different Molecular Surface Definitions for the Poisson-Boltzmann Based Electrostatic Characterization of Biomolecules in Solution. Commun. Comput. Phys. 2013, 13 (1), 61–89. https://doi.org/10.4208/cicp.050711.111111s.
(2) Succi, S.; Amati, G.; Bonaccorso, F.; Lauricella, M.; Bernaschi, M.; Montessori, A.; Tiribocchi, A. Towards Exascale Design of Soft Mesoscale Materials. Journal of Computational Science 2020, 101175. https://doi.org/10.1016/j.jocs.2020.101175.
(3) Dror, R. O.; Dirks, R. M.; Grossman, J. P.; Xu, H.; Shaw, D. E. Biomolecular Simulation: A Computational Microscope for Molecular Biology. Annu. Rev. Biophys. 2012, 41 (1), 429–452. https://doi.org/10.1146/annurev-biophys-042910-155245.
(4) Car, R.; Parrinello, M. Unified Approach for Molecular Dynamics and Density-Functional Theory. Phys. Rev. Lett. 1985, 55 (22), 2471–2474. https://doi.org/10.1103/PhysRevLett.55.2471.
- Nick R. Papior
Technical University of Denmark, Denmark
- Micael Oliveira
Max Planck Institute for the Structure and Dynamics of Matter, Hamburg, Germany
- Yann Pouillon
Universidad de Cantabria, Spain
- Volker Blum
Duke University, Durham, NC, USA
- Fabiano Corsetti
Synopsys QuantumWise, Denmark
- Emilio Artacho
University of Basque Country, United Kingdom
The landscape of Electronic Structure Calculations is evolving rapidly. On the one hand, the adoption of common libraries greatly accelerates the availability of new theoretical developments and can have a significant impact on multiple scientific communities at once [LibXC, PETSc]. On the other hand, electronic-structure codes are increasingly used as “force drivers” within broader calculations [Flos, IPi], a use case for which they were not initially designed. Recent modelling approaches designed to address limitations with system sizes, while preserving consistency with what is currently available, have also become relevant players in the field. For instance, Second-Principles Density Functional Theory [SPDFT], a systematic approximation built on top of the First-Principles DFT approach, provides a similar level of accuracy to the latter and makes it possible to run calculations on more than 100,000 atoms [ScaleUp, Multibinit]. At a broader level, the European Materials Modelling Council (EMMC) has been organizing various events to establish guidelines and roadmaps around the collaboration of academia and industry, to meet prominent challenges in the modelling of realistic systems and the economic sustainability of such endeavours, as well as proposing new career paths for people with hybrid scientific/software engineer profiles [EMMC1, EMMC2].
All these trends push the development of electronic-structure software further towards the provision of standards, libraries, APIs, and flexible software components. At a social level, they are also bringing different communities together and reinforcing existing collaborations within the communities themselves. Ongoing efforts include increased coordination of developments, enhanced integration of libraries into main codes, and consistent distribution of the software modules. They have been made possible in part by the successful adaptation of Lean, Agile and DevOps approaches to the context of scientific software development and the construction of highly automated infrastructures [EtsfCI, OctopusCI, SiestaPro]. A key enabler throughout this process has been the will to get rid of the former silo mentality, both at a scientific level (one research group, one code) and at a business-model level (libre software vs. open-source vs. proprietary), allowing collaborations between communities and making new public-private partnerships possible.
In this context, an essential component of the Electronic Structure Library [esl, els-gitlab] is the ESL Bundle, a consistent set of libraries broadly used within the Electronic Structure Community that can be installed together. This bundle solves various installation issues for end users and enables a smoother integration of the shipped libraries into external codes. In order to maintain the compatibility of the bundle with the main electronic-structure codes in the long run, its development has been accompanied by the creation of the ESL Steering Committee, which includes representatives of both the individual ESL components and the codes using them. As a consequence, the visibility of the ESL is expanding and the developers are exposed to an increasing amount of feedback, as well as requests from third-party applications. Since many of these developers contribute to more than one software package, this constitutes an additional source of pressure, on top of research publications and fundraising duties, that is not trivial to manage.
Establishing an infrastructure that allows code developers to act efficiently upon the feedback received while still guaranteeing the long-term usability of the ESL components, both individually and as a bundle, has become a necessary step. This requires efficient coordination between various elements:
- Set up a common and consistent code development infrastructure and training in terms of compilation, installation, testing and documentation, that can be used seamlessly beyond the electronic structure community, and learn from solutions adopted by other communities.
- Agree on metadata and metrics that are relevant for users of ESL components as well as third-party software, not necessarily related to electronic structure in a direct way.
- Create long-lasting synergies between stakeholders of all communities involved and make it attractive for industry to contribute.
Since 2014, the ESL has been paving the way towards ever broader collaborations, with a wiki, a data-exchange standard, the refactoring of code of global interest into integrated modules, and regularly organised workshops, within a wider movement led by the European eXtreme Data and Computing Initiative [exdci].