An article about E-CAM has just been released in the Autumn edition of the EU Research Magazine, Europe's leading magazine for research dissemination.
The piece consists of an interview with Prof. Ignacio Pagonabarraga, E-CAM technical manager; Dr. Sara Bonella, leader of our work-package focused on quantum dynamics and also of the work-package that deals with interactions with industry; Dr. Donal MacKernan, leader of our dissemination work-package; and Dr. Jony Castagna, programmer in E-CAM.
The interview describes E-CAM’s work in
(1) developing software targeted at the needs of both academic and industrial end-users, with applications from drug development to the design of new materials;
(2) tuning those codes to run on HPC machines, through application co-design and the provision of HPC oriented libraries and services;
(3) training scientists from industry and academia; and
(4) supporting industrial end-users in their use of simulation and modelling, via workshops and direct discussions with experts in the CECAM community.
The need to find renewable and environmentally friendly alternatives to traditional fossil fuels is nowadays a global quest. Solar energy is a promising candidate, and organic solar cells (OSCs) have attracted considerable attention. In this collaboration with Merck, E-CAM scientists used electronic structure calculations to study how a key quantity – the HOMO-LUMO band gap – changes with the relative disposition of the donor-acceptor molecule pair.
Traditionally, high-throughput computing (HTC) workloads are looked down upon in the HPC space; however, there are scientific use cases in which coordinated HTC workflows require extreme-scale resources. For such cases, where there may be thousands of tasks each requiring peta-scale computing, E-CAM has extended the data-analytics framework Dask with a capable and efficient library to handle such workloads.
Introduction
The initial motivation for E-CAM's High Throughput Library, the jobqueue_features library [1], comes from the ensemble-type calculations that are required in many scientific fields, and in particular in the materials science domain. A concrete example is the study of molecular dynamics with atomistic detail, where timesteps on the order of a femtosecond must be used. Many problems in biological chemistry and materials science involve events that only occur spontaneously after a millisecond or longer (for example, biomolecular conformational changes). That means that around 10¹² time steps would be needed to see a single millisecond-scale event. This is the problem of “rare events” in theoretical and computational chemistry.
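As a sanity check on that estimate, the arithmetic can be written out explicitly (an illustrative snippet, not part of the library):

```python
# Back-of-the-envelope check of the step count quoted above:
# a millisecond-scale event simulated with femtosecond timesteps.
timestep = 1e-15        # seconds per MD step (one femtosecond)
event_timescale = 1e-3  # seconds (one millisecond)

steps_needed = event_timescale / timestep
print(f"steps needed: {steps_needed:.0e}")  # steps needed: 1e+12
```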
Modern supercomputers are beginning to make it possible to obtain trajectories long enough to observe some of these processes, but to fully characterize a transition with proper statistics, many examples are needed. In such cases the same peta-scale application must be run many thousands of times with varying inputs. For this use case, we were conceptually attracted to the Dask philosophy [2]: Dask is a specification that encodes task schedules with minimal incidental complexity using terms common to all Python projects, namely dicts, tuples, and callables.
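The graph specification referred to above can be illustrated with a tiny pure-Python evaluator (our own sketch for illustration, with no Dask dependency): a task graph is just a dict whose values are either literal data or tuples of a callable followed by its arguments.

```python
def evaluate(graph, key):
    """Recursively evaluate one key of a Dask-style task graph.

    Values are either literal data or tuples (callable, *args),
    where an argument may itself be a key in the graph.
    """
    value = graph[key]
    if isinstance(value, tuple) and value and callable(value[0]):
        func, *args = value
        resolved = [evaluate(graph, a) if (isinstance(a, str) and a in graph) else a
                    for a in args]
        return func(*resolved)
    return value

def inc(x):
    return x + 1

def add(x, y):
    return x + y

# c depends on b, which depends on a: a minimal two-step schedule
graph = {"a": 1, "b": (inc, "a"), "c": (add, "b", 10)}
print(evaluate(graph, "c"))  # 12
```

Dask's own schedulers consume exactly this kind of dict, which is what makes the framework easy to build on.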
However, Dask and its extensions do not currently support task-level parallelization (in particular, multi-node tasks). We have been able to leverage the Dask extension dask_jobqueue [3] and build upon its functionality to include support for MPI-enabled task workloads on HPC systems. The resulting approach, described in the rest of this piece, allows for multi-level parallelization (at the task level via MPI, and at the framework level via Dask) while leveraging all of the pre-existing effort within the Dask framework, such as scheduling, resilience, data management and resource scaling.
E-CAM's HTC library was created in collaboration with a PRACE team in Wrocław, and is the subject of an associated white paper [4]. This effort is under continuous improvement and development. A series of dedicated webinars will take place in the autumn of 2020, which will be an opportunity for people to learn how to use Dask and dask_jobqueue (to submit Dask workloads to a resource scheduler like SLURM), and to implement our library jobqueue_features in their codes. Announcements and more information will soon be available at https://www.e-cam2020.eu/calendar/.
Methodology
The jobqueue_features library [1] is an extension of dask_jobqueue [3], which in turn utilizes the Dask [2] data-analytics framework. dask_jobqueue is targeted at deploying Dask on several job-queuing systems, such as SLURM or PBS, through a Python programming interface. The main enhancement over basic dask_jobqueue functionality is a heavily extended configuration implementation that handles MPI runtimes and different resource specifications. This allows the end-user to conveniently create parallelized tasks without extensive knowledge of the implementation details (e.g., the resource manager or MPI runtime). The library is primarily accessed through a set of Python decorators: on_cluster, task and mpi_task. The on_cluster decorator gets or creates clusters, which in turn submit worker resource-allocation requests to the scheduler to execute tasks. The mpi_task decorator derives from task and enhances it with MPI-specific settings (e.g., the MPI runtime and related settings).
In Fig. 1 we show a minimal, but complete, example which uses the mpi_task and on_cluster decorators for a LAMMPS execution. The configuration, communication and serialization are isolated and hidden from user code.
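The original figure is not reproduced here; the sketch below indicates what such an example looks like, based on the jobqueue_features documentation. The exact class and keyword names (CustomSLURMCluster, ntasks_per_node, etc.) are assumptions that should be checked against the library [1], and the code requires a SLURM system with an MPI-enabled LAMMPS executable (lmp), so it is not runnable as-is:

```python
# Hedged sketch of a jobqueue_features workflow (assumed API; verify
# against the library documentation before use).
from jobqueue_features.clusters import CustomSLURMCluster
from jobqueue_features.decorators import on_cluster, mpi_task
from jobqueue_features.mpi_wrapper import mpi_wrap

# A cluster template: each worker allocation spans 2 nodes with
# 12 MPI tasks per node (assumed parameter names).
lammps_cluster = CustomSLURMCluster(
    name="lammps_cluster", mpi_mode=True, nodes=2, ntasks_per_node=12,
)

@on_cluster(cluster=lammps_cluster, cluster_id="lammps_cluster")
@mpi_task(cluster_id="lammps_cluster")
def lammps_task(script="in.lj"):
    # mpi_wrap launches the executable under the configured MPI runtime
    return mpi_wrap(executable="lmp", exec_args="-in {}".format(script))

def my_lammps_job():
    t1 = lammps_task()   # returns a Dask future immediately
    return t1.result()   # block for the result (return t1 instead to overlap work)
```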
Any call to my_lammps_job results in the lammps_task function being executed remotely by a lammps_cluster worker allocated by the resource manager with 2 nodes and 12 MPI tasks per node. The code can be executed interactively in a Jupyter notebook. To overlap calculations, one would return the t1 future rather than the actual result.
Findings
The library can effectively handle simultaneous workloads on the GPU, KNL and CPU partitions of the JURECA supercomputer [5]. The caveat with respect to the hardware environment is that the network must support TCP (usually via IPoIB) or UCX connections between the scheduler and the workers (which process and execute the queued tasks).
With respect to the software stack, an issue is highlighted by the KNL booster of JURECA: the booster has a different micro-architecture, and the software stack must be changed completely to support it. The design of the software-stack implementation on JURECA simplifies this, but ensuring that tasks run in the correct software environment is one of the more difficult things to get right in the library. As a result, the configuration of the clusters (which define the template used to submit workers to the appropriate queue of the resource manager) can be quite non-trivial. However, the cluster configurations can be gathered in a single file, which will need to be tuned for the available resources. The tasks themselves require no tuning.
We see ∼90% throughput efficiency for trivial tasks; for tasks that execute for any reasonable length of time, the throughput efficiency would be much higher.
Conclusions
The library is flexible, scalable, efficient and adaptive. It is capable of simultaneously utilising CPUs, KNL and GPUs (or any other hardware) and dynamically adjusting its use of these resources based on the resource requirements of the scheduled task workload. The ultimate scalability and hardware capabilities of the solution are dictated by the characteristics of the tasks themselves. For the use case described here, these are the hardware and scalability capabilities of LAMMPS, with a further multiplicative factor from the library for the number of tasks running simultaneously. There is, unsurprisingly, room for further improvement and development, in particular related to error handling and limitations related to the Python GIL.
In a recent paper[1], researchers from the Centres of Excellence E-CAM[2] and MaX[3], and the centre for Computational Design and Discovery of Novel Materials NCCR MARVEL[4], have proposed a new procedure for automatically generating Maximally-Localised Wannier functions (MLWFs) for high-throughput frameworks. The methodology and associated software can be used for hitherto difficult cases of entangled bands, and allow the electronic properties of a wide variety of materials – including insulators, semiconductors and metals – to be obtained starting only from the specification of the initial crystal structure. Industrial applications that this work will facilitate include the development of novel superconductors, multiferroics and topological insulators, as well as more traditional electronic applications.
Challenge/context
Predicting the properties of complex materials generally entails the use of methods that facilitate coarse-grained perspectives more suitable for large-scale modelling, and ultimately device design and manufacture. When a quantum-level description of a modular-like system is required, this can often be facilitated by expressing the Hamiltonian in terms of a localised, real-space basis set, enabling it to be partitioned without ambiguity into sub-matrices that correspond to the individual subsystems. Maximally-localised Wannier functions (MLWFs) are particularly suitable in this context. Until now, however, generating MLWFs has been difficult to exploit in high-throughput design of materials, because users must specify a set of initial guesses for the MLWFs – typically trial functions localised in real space – based on their experience and chemical intuition.
Solution
In a recent article[1], E-CAM[2] scientist Valerio Vitale and co-authors from the partner H2020 Centre of Excellence MaX[3] and the Swiss-based NCCR MARVEL[4] look afresh at this problem in the context of an algorithm by Damle et al.[5], known as the selected columns of the density matrix (SCDM) method, which automatically provides initial guesses for the MLWF search by computing a set of localized orbitals associated with the Kohn–Sham subspace for insulating systems. The method has shown great promise in avoiding the need for user intervention in obtaining MLWFs, and is robust, being based on standard linear-algebra routines rather than on iterative minimisation. In particular, Vitale et al. developed a fully automated protocol based on the SCDM algorithm in which the three remaining free parameters (two from the SCDM method, plus the choice of the target dimensionality for the disentangled subspace) are determined automatically, making it parameter-free even in the case of entangled bands. The work systematically compares the accuracy and ease of use of the standard methods to generate localised basis sets: (a) MLWFs; (b) MLWFs combined with SCDM; and (c) SCDM alone; and applies this multifaceted perspective to hundreds of materials, including insulators, semiconductors and metals.
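The core idea of SCDM – selecting well-conditioned columns of the density matrix via a pivoted-QR-style criterion, which correspond to real-space points where the orbitals are well localised – can be conveyed with a toy sketch. This is our own illustration of the pivot-selection step, not the authors' code; the function name and the greedy pure-NumPy implementation are ours:

```python
import numpy as np

def scdm_columns(psi, n_wann):
    """Toy SCDM-style column selection (illustrative only).

    psi: (n_grid, n_bands) array of Kohn-Sham orbitals sampled on a
    real-space grid. Greedily picks n_wann grid points (columns of
    psi^H) with the largest residual norm, deflating after each pick,
    i.e. the pivoting rule of column-pivoted QR.
    """
    A = psi.conj().T.copy()  # (n_bands, n_grid): columns index grid points
    cols = []
    for _ in range(n_wann):
        norms = np.linalg.norm(A, axis=0)
        j = int(np.argmax(norms))          # pivot: largest remaining column
        cols.append(j)
        q = A[:, j] / norms[j]
        A = A - np.outer(q, q.conj() @ A)  # deflate the selected direction
    return cols

# Toy demo: two orbitals perfectly localised at grid points 1 and 3
psi = np.zeros((5, 2))
psi[1, 0] = 1.0
psi[3, 1] = 1.0
print(sorted(scdm_columns(psi, 2)))  # [1, 3]
```

The production implementation instead uses library column-pivoted QR on the orbital matrix; the sketch only shows why the selected columns land on points where orbitals are localised.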
Benefit
This is significant because it greatly expands the scope of materials for which MLWFs can be generated in high-throughput studies, and has the potential to accelerate the design and discovery of materials with tailored properties using first-principles high-throughput (HT) calculations, and to facilitate advanced industrial applications.
Background information
This module is a collaboration between the E-CAM and MaX HPC centres of excellence, and the NCCR MARVEL.
In SCDM Wannier Functions, E-CAM has implemented the SCDM algorithm in the pw2wannier90 interface code between the Quantum ESPRESSO software and the Wannier90 code. This was done in the context of an E-CAM pilot project at the University of Cambridge. Researchers then used this implementation as the basis for a complete computational workflow for obtaining MLWFs and electronic properties based on Wannier interpolation of the Brillouin zone, starting only from the specification of the initial crystal structure. The workflow was implemented within the AiiDA materials informatics platform (from the NCCR MARVEL and the MaX CoE), and used to perform an HT study on a dataset of 200 materials.
Source Code
See the Materials Cloud Archive entry. A downloadable virtual machine is provided that allows users to reproduce the results of the associated paper and to run new calculations for different materials, including all first-principles and atomistic simulations and the computational workflows.
Particularly active in applying atomistic and coarse-grained simulations to study the interaction of nano-objects and surfactants with lipid bilayers for industrial applications (e.g. soaps, detergents, etc.), Massimo Noro has made considerable contributions to the development and application of the Dissipative Particle Dynamics (DPD) simulation technique to study soft condensed matter systems.
He is the former science leader of the High Performance Computing division at Unilever and the current Director of Business Development at the Science and Technology Facilities Council (STFC), with a focus on the Daresbury Campus (see short bio below). Massimo is also a member of E-CAM's Executive Board. In this interview, he talks about his journey from academic research, to work at Unilever and now at STFC, and shares his insights on the use of simulation and modelling in industry and the role of STFC and research in this regard.
Watch Massimo Noro’s reply to three key questions of this interview:
Tell us about your journey from academic research, to work at Unilever and now at STFC
What are the key ingredients for a successful relationship between STFC and industry?
What do you think are the most important HPC needs for industry?
The full video interview is available here, with the following outline:
What is the importance of diversity in the workplace?
Massimo Noro
Massimo Noro is the Director of Business Development at the Science & Technology Facilities Council (STFC), with a focus on the Daresbury Campus. His role is to ensure the continued growth and success of the Daresbury Laboratory at the Sci-Tech Daresbury Campus.
Massimo joined STFC in February 2018, following a successful industrial R&D career at Unilever with a proven track record as a program and people leader in a corporate environment – Unilever is a large multinational and a market leader in home care, personal care, refreshments and food products. He gained considerable experience in managing high-budget projects and in leading teams across sites and across complex organisations. Massimo leads on strategic partnerships with industry and local government; he manages a wide team to deliver innovation, to develop strong pipelines of commercial engagements and to provide a range of offerings for business incubation.
Donal MacKernan, University College Dublin & E-CAM
An E-CAM transverse action is the development of a protein-based sensor (patent pending, filed by UCD[1,2]) with applications in medical diagnostics, scientific visualisation and therapeutics. At the heart of the sensor is a novel protein-based molecular switch which allows extremely sensitive real-time measurement of molecular targets, and can turn protein functions and other processes on or off accordingly (see Figure 1). For a description of the sensor, see this piece.
One application of the protein-based sensor is to detect influenza, by modifying the sensor to measure up-regulated epidermal growth factor receptor (EGFR) in living cells. The appeal of using it for the flu is that it is cheap, easy to use in the field by non-specialists, and accurate – that is, with very low false negatives and positives compared to existing field tests. UCD's patent-pending sensors have these attributes built into their ‘all-n-one’ design, through a novel type of molecular switch that performed strongly in the laboratory proof-of-concept phase. A funded research project to continue this development at UCD is almost certain, and likely to start within weeks.
And the answer to the current frequently asked question – “can we modify this sensor to quickly detect COVID-19?” – is yes, provided we know the amino acid sequences of antibody–epitope pairs specific to this coronavirus.
Figure 1. Schematic illustration of a widely used sensor (left, from Komatsu et al.[3]) and the “all-n-one” UCD sensor (right) in the “OFF” and “ON” states, corresponding to the absence and presence of the target biomarker respectively. The “all-n-one” design substitutes the Komatsu flexible linker with a hinge protein carrying charged residues q1, q2, …, which are symmetrically placed on either side of the centre so as to ensure that, in the absence of the target, Coulomb repulsion forces the hinge open. Their location and number can be adjusted to suit each application. The spheres B and B’ denote the sensing modules, which tend to bind to each other when a target biomarker or analyte is present. The spheres A and A’ denote the reporting modules, which emit a recognisable (typically optical) signal when they are close to or in contact with each other, i.e. in the presence of a target biomarker or analyte.
[1] EP3265812A2, 2018-01-10, UNIV. COLLEGE DUBLIN NAT. UNIV. IRELAND. Inventors: Donal MacKernan and Shorujya Sanyal. Earliest priority: 2015-03-04, Earliest publication: 2016-09-09. https://worldwide.espacenet.com/patent/search?q=pn%3DEP3265812A2
[2] WO2018047110A1, 2018-03-15, UNIV. COLLEGE DUBLIN NAT. UNIV. IRELAND. Inventor: Donal MacKernan. Earliest priority: 2016-09-08, Earliest publication: 2018-03-15. https://worldwide.espacenet.com/patent/search?q=pn%3DWO2018047110A1
[3] Komatsu N., Aoki K., Yamada M., Yukinaga H., Fujita Y., Kamioka Y., Matsuda M., Development of an optimized backbone of FRET biosensors for kinases and GTPases. Mol. Biol. Cell. 2011 Dec; 22(23): 4647-56.
GC-AdResS is a technique that speeds up computations, without loss of accuracy for key system properties, by dividing the simulation box into two or more regions with different levels of resolution: for instance, a high-resolution region where the molecules of the system are treated at an atomistic level of detail, regions where molecules are treated at a coarse-grained level, and transition regions where a weighted average of the two resolutions is used. The goal of the E-CAM GC-AdResS pilot project was to eliminate the need for a transition region, so as to significantly improve performance and allow much greater flexibility. For example, the low-resolution region can be a particle reservoir (ranging in detail from coarse-grained to ideal-gas particles) coupled directly to a high-resolution atomistic region, with no transition region as was hitherto required. The only requirements are that the two regions can exchange particles, and that a corresponding “thermodynamic” force is computed self-consistently, which turns out to be very simple to implement.
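The difference between the classic weighted-average coupling and the abrupt coupling described above can be sketched with a one-dimensional resolution weight λ(x), where λ = 1 means fully atomistic and λ = 0 fully coarse-grained. This is our own illustration (the cos² switching form is one common choice in the AdResS literature, and the function names are ours):

```python
import math

def weight_smooth(x, x_at, d_tr):
    """Classic AdResS-style weight: cos^2 switching across a
    transition region of width d_tr starting at x_at."""
    if x < x_at:
        return 1.0   # atomistic region
    if x > x_at + d_tr:
        return 0.0   # coarse-grained region
    # weighted average of the two resolutions in between
    return math.cos(math.pi * (x - x_at) / (2.0 * d_tr)) ** 2

def weight_abrupt(x, x_at):
    """Abrupt coupling: atomistic or coarse-grained, no transition
    region; the thermodynamic force handles the interface instead."""
    return 1.0 if x < x_at else 0.0
```

In the abrupt scheme, the burden shifts from interpolating forces across a buffer to computing the self-consistent thermodynamic force at the interface, which is what removes the transition region entirely.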
In the margins of a recent multiscale simulation workshop, a discussion began between a prominent pharmaceutical-industry scientist, E-CAM and EMMC regarding the unfolding Fourth Industrial Revolution and the role of particle-based simulation and statistical methods in it. The impact of simulation is predicted to become very significant. This discussion is intended to make the general public aware of how Industry 4.0 is taking hold in companies, and how academic research will support that transformation.
Authors: Prof. Pietro Asinari (EMMC and Politecnico di Torino, denoted below as PA), Dr. Donal MacKernan (E-CAM and University College Dublin, denoted below as DM), and a prominent pharmaceutical industry scientist (name withheld at the author's request as the view expressed is a personal one, denoted below as IS)