Does our simulation community need EXASCALE?

By Prof. David Ceperley, University of Illinois Urbana-Champaign

The computer simulation of electrons, atoms, molecules, and their assemblies in soft and hard matter is foundational for many scientific disciplines and important commercially. Exascale computing is coming and our community should take part as are our colleagues in lattice gauge theory, climate modeling, cosmology, genomics and other disciplines.

These resources present a great opportunity. The proposed exascale machines will be massively parallel, with millions of interconnected processors. They will likely require new programming approaches and languages to be used optimally. Their memory will be organized such that one will need to explicitly control how data is stored and accessed, since the interconnect bandwidth and latency will limit the use of global memory. The funders of the new machines will want the machines to be used to their full potential and to generate science commensurate with their costs. Clearly not every simulation algorithm will be useful in this environment.

Applications need to promise more than incremental progress
We have to think about which applications warrant this resource and which are doable in the next decade. Applications need to promise more than incremental progress, e.g. not just another decimal place on the correlation energy of the homogeneous electron gas. Most of our current projects should stay on their current platforms. But every so often there is a paradigm shift, where something that was once thought too difficult or impossible becomes routine. Such a shift occurred in 1985 when R. Car and M. Parrinello [1] linked Molecular Dynamics with a DFT evaluation of the Born-Oppenheimer forces. We must look for such opportunities provided by the increased computer power of exascale machines.

Material design by computer
One candidate for an application (really a whole family of applications) is material design by computer. The US Materials Genome Initiative [2] and other similar efforts worldwide have been funded to work towards this goal. Materials design, currently done experimentally, for example with a “shake and bake” procedure, is technologically and commercially very important, but costly. There has been much progress during the past decade, but its promise is still largely in the future. Our community, as represented for example in E-CAM, is focused on accurate predictions from semi-empirical models of physical systems. I believe that materials design will require calculations that can be run without experimental input and with a reasonable probability of a successful prediction (i.e. that a certain structure can be made, that it will be stable, and that it will have the predicted properties). While it is true that computer design has made only a small impact to date, eventually it will have a large one, even though we don’t know when.

Computer design of materials is a good candidate for an exascale application because one can do searches in parallel, with each simulation or electronic structure calculation occupying a small part of the machine. The power of the machine can allow thousands or millions of candidate structures to be examined in parallel. But some things need to happen first.
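
As a rough illustration of this kind of parallel search, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_energy` is a hypothetical stand-in for a real electronic structure calculation, the candidates are just trial lattice constants, and a thread pool stands in for what would, on a real machine, be many independent multi-node jobs.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_energy(lattice_constant):
    """Hypothetical stand-in for one electronic structure calculation.

    A real calculation would return the total energy of a candidate
    structure; here we use a parabola with its minimum at 4.0.
    """
    return (lattice_constant - 4.0) ** 2 - 1.0


def screen(candidates, max_workers=4):
    """Evaluate all candidates independently and return the lowest-energy one."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each candidate is an independent task; no communication is needed.
        energies = list(pool.map(toy_energy, candidates))
    best_energy, best_candidate = min(zip(energies, candidates))
    return best_candidate, best_energy


candidates = [3.6, 3.8, 4.0, 4.2, 4.4]
best_a, best_e = screen(candidates)
print(best_a, best_e)
```

The point of the sketch is only that the tasks never talk to each other, so the search scales with the number of workers until the cost of a single calculation dominates.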

Accuracy of the electronic structure a fundamental consideration
One consideration is the fundamental accuracy of the electronic structure calculation. A key property that we need is the Born-Oppenheimer surface of the ions, which determines the stable crystal structure and its electronic properties, such as the response to electromagnetic fields. To be accurate one needs to resolve the energy differences between different structures and the energy barriers, since without knowing the structure one cannot even begin to describe its properties. Room temperature provides a typical energy scale. However, since 100 K ≈ 0.3 mHa ≈ 8.6 meV, we are interested in very small energy differences relative to typical electronic energies (1 Hartree = 27.2 eV = 315 775 K). Although we require quite accurate energies, current methods are getting close for many physical systems!
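
The energy scales quoted above can be checked with a few lines of Python. This is a small sketch; the function names are my own, and the constants are the CODATA 2018 values of the Boltzmann constant and the Hartree energy.

```python
K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K (CODATA 2018)
HARTREE_EV = 27.211386246       # one Hartree in eV (CODATA 2018)


def kelvin_to_mev(temperature_k):
    """Thermal energy k_B * T in milli-electronvolts."""
    return temperature_k * K_B_EV_PER_K * 1.0e3


def kelvin_to_mha(temperature_k):
    """Thermal energy k_B * T in milli-Hartree."""
    return temperature_k * K_B_EV_PER_K / HARTREE_EV * 1.0e3


def hartree_to_kelvin(energy_ha=1.0):
    """Temperature whose thermal energy k_B * T equals the given energy."""
    return energy_ha * HARTREE_EV / K_B_EV_PER_K


for t in (100.0, 300.0):
    print(f"{t:.0f} K = {kelvin_to_mha(t):.3f} mHa = {kelvin_to_mev(t):.2f} meV")
print(f"1 Ha = {hartree_to_kelvin():.0f} K")
```

At 100 K the thermal energy is roughly 0.32 mHa, or about 8.6 meV, which is some four orders of magnitude below one Hartree.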

Currently, for structure searching one often uses DFT calculations to assess stability. However, there is now a multitude of DFT functionals, and it’s not obvious which one is best. Without empirical information one cannot decide. A recent article [3] suggests that the current semi-empirical approach to improving functionals does not lead to a systematic path toward the exact functional. Today’s best functionals do not typically meet the accuracy criterion without empirical tuning and selection. For example, we cannot even make confident predictions of the ground state structure of solid hydrogen [4], the first element and one of the simplest. DFT is good for interpolating between materials where the accuracy has been confirmed experimentally. However, the space of potential materials is so vast that one cannot rely on a semi-empirical method. It is likely that the best material for a given application will be made of a particular combination of elements that has not been examined with high-quality experiments or high-accuracy electronic structure methods, and so would not be in the database used to construct the functional or model.

Quantum Monte Carlo methods
I am an advocate of Quantum Monte Carlo (QMC) methods [5]. These methods are the generalizations of Molecular Dynamics (MD) and Monte Carlo (MC) to quantum many-body systems and are particularly needed when mean-field based methods fail. For some systems, QMC methods are exact in the sense that classical MD and MC are exact, but to simulate electrons one runs into the fermion sign problem. No algorithm has yet been demonstrated that gives a controlled error in polynomial computer time as the number of electrons goes to infinity. But because the fixed-node or fixed-phase methods give upper bounds for the energy in polynomial time, we have unambiguous internal information about the accuracy of the fixed-node estimate, and we know when we have an improvement. In addition, exact (controlled) estimates can be performed for small electron systems. Path integral methods can treat non-zero temperatures and quantum nuclear effects. QMC is the most general, robust algorithm for solving the equilibrium electronic structure problem and can be shown to reach the requisite accuracy in many cases. In addition, the stochastic nature of its procedure can be incorporated together with a classical MC simulation without seriously impacting the computational effort. This allows one to study a disordered system such as dense liquid hydrogen [6] with a higher and better-controlled accuracy than would be obtained using DFT forces.

I mention QMC here because it is a leading exascale electronic structure algorithm. Diffusion Monte Carlo works by evolving independent walkers using the Hamiltonian. Evaluation of the trial wave function, its gradient and its Laplacian dominates the computer effort. The independence of the walkers’ evolution results in nearly perfect parallelism for tens of thousands of processors. In most applications one does not need millions of walkers, but more parallelism can be achieved by simultaneously looking at different compounds or boundary conditions. Of course, to make the QMC method more applicable there are many technical problems to solve in addition to the sign problem, e.g. the elimination of core electrons in a more accurate way and better scaling to more electrons. Certainly other methods are indispensable, but QMC provides a benchmark of their accuracy if experiment is not available.
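
To make the walker picture concrete, the following is a minimal sketch of diffusion Monte Carlo in its simplest form, without a trial wave function. The model is an assumption chosen for illustration: one particle in a one-dimensional harmonic well, V(x) = x²/2, whose exact ground-state energy is 0.5 in units where ħ = m = ω = 1. The function name, the parameters and the logarithmic population-control rule are illustrative choices and are not taken from any production QMC code. Real electronic structure codes guide the walkers with a trial wave function and impose the fixed-node constraint, both of which are omitted here.

```python
import numpy as np


def simple_dmc(n_target=1000, n_steps=3000, n_equil=500, tau=0.01, seed=1):
    """Toy diffusion Monte Carlo for V(x) = x**2 / 2 (no importance sampling)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, size=n_target)   # initial walker positions
    e_ref = np.mean(0.5 * x**2)                 # initial reference energy
    samples = []
    for step in range(n_steps):
        # Diffusion: every walker takes an independent Gaussian step.
        x = x + np.sqrt(tau) * rng.normal(size=x.size)
        # Branching: walkers are copied or removed according to the potential.
        weights = np.exp(-tau * (0.5 * x**2 - e_ref))
        copies = (weights + rng.random(x.size)).astype(int)
        x = np.repeat(x, copies)
        # Without a trial function, the walkers sample the ground state itself,
        # so the average potential energy estimates the ground-state energy.
        v_mean = np.mean(0.5 * x**2)
        # Population control: nudge e_ref to keep about n_target walkers.
        e_ref = v_mean - np.log(x.size / n_target)
        if step >= n_equil:
            samples.append(v_mean)
    return float(np.mean(samples))


energy = simple_dmc()
print(f"DMC energy estimate: {energy:.3f} (exact value is 0.5)")
```

The estimate should come out close to 0.5, up to statistical noise and a small time-step error. Between branching steps the walkers do not interact, which is why the diffusion and weight evaluation can be spread over many processors; only the population control requires a global sum.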

Difficulty of going from the nano to the mesoscale
A second problem I want to mention is the difficulty of going accurately, robustly and automatically from the nanoscale to the mesoscale. Since the goal is to fabricate materials, simulation techniques must be able to handle the typical complexity of real materials, which include defects, impurities, and dynamical processes. Material designers need to consider the effects of non-zero temperature, entropy, electronic and optical properties, and formation routes. It is not realistic to think that an expert in the theory and practice of electronic structure will always be involved in a materials design project; one needs a “black box” solution to the multiscale problem. The accuracy of an electronic structure calculation needs to be extended to large systems in an automatic way. One approach is to use accurate electronic structure methods that are limited to small systems (say fewer than a thousand electrons) to generate data from which one constructs potential energy surfaces for use in MD [7] or with another model. The MD simulation can then be run for millions or billions of atoms to make estimates of some of the needed properties at a much larger length scale or longer time scale. This needs to be done routinely but tailored to a particular application. An important obstacle is to find a basis set appropriate to describe complex molecular interactions that is universal, accurate and reasonably compact. Whether this can be achieved in general while still maintaining the required accuracy is an open question.
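
The fitting step can be illustrated with a deliberately simple sketch. All of its ingredients are assumptions: the "reference data" are synthetic dimer energies from a Lennard-Jones formula standing in for accurate electronic structure results, and the basis is just two inverse powers of the separation. This is not the method of Ref. [7]; it only shows the pattern of generating accurate energies for small systems, fitting a compact model, and then evaluating that model cheaply.

```python
import numpy as np


def reference_energy(r):
    """Synthetic stand-in for an accurate electronic structure energy.

    Here a Lennard-Jones dimer with epsilon = sigma = 1 (an assumption).
    """
    return 4.0 * (r**-12 - r**-6)


# "Training" data: energies of a small system at a few separations.
r_train = np.linspace(0.95, 2.5, 40)
e_train = reference_energy(r_train)

# Linear least-squares fit in a two-function basis {r^-12, r^-6}.
design = np.column_stack([r_train**-12, r_train**-6])
coeffs, *_ = np.linalg.lstsq(design, e_train, rcond=None)


def fitted_energy(r):
    """Cheap model potential that could be handed to an MD code."""
    return coeffs[0] * r**-12 + coeffs[1] * r**-6


print("fitted coefficients:", coeffs)
print("energy at the minimum:", fitted_energy(2.0 ** (1.0 / 6.0)))
```

Because the synthetic data lie exactly in the span of the basis, the fit recovers them essentially perfectly. For real materials the hard part is precisely the one raised above: finding a basis that is universal, accurate and still compact.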

An organizational problem of material design is that research and development cuts across different research communities: it needs a larger, longer-scale effort such as could be provided by E-CAM. Porting and optimization of a single code to the new machines could take several person-years of work; global cooperation will be advantageous.

Changes in computer architecture are annoying and upsetting. In my career I have seen more than a dozen such shifts, each requiring a big investment to stay current. However, I am confident that the simulation community will stay involved with high performance computing and, as a consequence, our community will reap the intellectual, scientific and financial rewards.

References
[1] R. Car and M. Parrinello, Phys. Rev. Lett. 55, 2471 (1985).
[2] https://www.whitehouse.gov/mgi
[3] M. G. Medvedev et al. Science 355, 49 (2017); S. Hammes-Schiffer, Science
[4] M. A. Morales, Phys. Rev. B 87, 184107 (2013).
[5] R. M. Martin, L. Reining and D. M. Ceperley, Interacting Electrons: Theory and Computational Approaches, Cambridge University Press (2016); L. K. Wagner and D. M. Ceperley, Rep. Prog. Phys. 79, 094501 (2016).
[6] C. Pierleoni et al., Proc. Natl. Acad. Sci. USA 113, 4953–4957 (2016).
[7] M. J. Gillan et al., J. Chem. Phys. 139, 114101 (2013).