Extended Software Development Workshop: Scaling Electronic Structure Applications

If you are interested in attending this event, please visit the CECAM website here.

Workshop Description

The evolutionary pressure on electronic structure software development is increasing greatly, due to the emergence of new paradigms, new kinds of users, new processes, and new tools. The large feature-rich codes that were once developed within one field are now undergoing heavy restructuring to reach much broader communities, including companies and non-scientific users[1]. More and more use cases and workflows are performed by highly automated frameworks instead of humans: high-throughput calculations and computational materials design[2], large data repositories[3], and multiscale/multi-paradigm modeling[4], for instance. At the same time, High-Performance Computing Centers are paving the way to exascale, with a cascade of effects on how to operate, from computer architectures[5] to application design[6]. The disruptive paradigm of quantum computing is also putting a big question mark on the relevance of all the ongoing efforts[7].

All these trends are highly challenging for the electronic structure community. Computer architectures have become rapidly moving targets, forcing a global paradigm shift[8]. As a result, long-ignored and well-established software good practices that were summarised in the Agile Manifesto[9] nearly 20 years ago are now adopted at an accelerating pace by more and more software projects[10]. With time, this kind of migration is becoming a question of survival, the key for a successful transformation being to allow and preserve an enhanced collaboration between the increasing number of disciplines involved. Significant efforts of integration from code developers are also necessary, since both hardware and software paradigms have to change at once[11].

Two major issues also come from the community itself. Hybrid developer profiles, with people fluent in both computational and scientific matters, are still difficult to find and retain. In the long run, the numerous ongoing training initiatives will gradually improve the situation, while in the short run, the issue is becoming more salient and painful, because the context evolves faster than ever. Good practices have usually been the first element sacrificed in the “publish or perish” race. New features have usually been bound to the duration of a post-doc contract and left undocumented and poorly tested, favoring the unsustainable “reinventing the wheel” syndrome.

Addressing these issues requires coordinated efforts at multiple levels:
– from a methodological perspective, mainly through the creation of open standards and the use of co-design, both for programming and for data[12];
– regarding documentation, with a significant leap in content policies, helped by tools like Doxygen and Sphinx, as well as publication platforms like ReadTheDocs[13];
– for testing, by introducing test-driven development concepts and systematically publishing test suites together with software[14];
– considering deployment, by creating synergies with popular software distribution systems[15];
– socially, by disseminating the relevant knowledge and training the community, through the release of demonstrators and giving all stakeholders the opportunity to meet regularly[16].

This is what the Electronic Structure Library (ESL)[17] has been doing since 2014, by maintaining a wiki, defining a data-exchange standard, refactoring code of global interest into integrated modules, and regularly organising workshops, within a wider movement led by the European eXtreme Data and Computing Initiative (EXDCI)[18].

Since 2014, the Electronic Structure Library has been steadily growing and developing to cover most fundamental tasks required by electronic structure codes. In February 2018 an extended software development workshop will be held at CECAM-HQ with the purpose of building demonstrator codes providing powerful, non-trivial examples of how the ESL libraries can be used. These demonstrators will also provide a platform to test the performance and usability of the libraries in an environment as close as possible to real-life situations. This marks a milestone and enables the next step in the ESL development: going from a collection of libraries with a clear set of features and stable interfaces to a bundle of highly efficient, scalable and integrated implementations of those libraries.

Many libraries developed within the ESL perform low-level tasks or very specific steps of more complex algorithms and are not capable, by themselves, of reaching exascale performance. Nevertheless, if they are to be used as efficient components of exascale codes, they must provide some level of parallelism and be as efficient as possible on a wide variety of architectures. During this workshop, we propose to perform advanced performance and scalability profiling of the ESL libraries. With that knowledge in hand, it will be possible to select and implement the best strategies for parallelizing and optimizing the libraries. Assistance from HPC experts will be essential and is a unique opportunity to foster collaborations with other Centres of Excellence, like PoP (https://pop-coe.eu/) and MaX (http://www.max-centre.eu/).

Based on the successful experience of the previous ESL workshops, we propose to divide the workshop into two parts. The first two days will be dedicated to initial discussions between the participants and other invited stakeholders, and to presentations on state-of-the-art methodological and software developments, performance analysis, and scalability of applications. The remainder of the workshop will consist of a 12-day coding effort by a smaller team of experienced developers. Both the discussion and the software development will take advantage of the ESL infrastructure (wiki, gitlab, etc.) that was set up during the previous ESL workshops.

[1] See http://www.nanogune.eu/es/projects/spanish-initiative-electronic-simulations-thousands-atoms-codigo-abierto-con-garantia-y and
[2] See http://pymatgen.org/ and http://www.aiida.net/ for example.
[3] http://nomad-repository.eu/
[4] https://abidev2017.abinit.org/images/talks/abidev2017_Ghosez.pdf
[5] http://www.deep-project.eu/
[6] https://code.grnet.gr/projects/prace-npt/wiki/StarSs
[7] https://www.newscientist.com/article/2138373-google-on-track-for-quantum-computer-breakthrough-by-end-of-2017/
[8] https://arxiv.org/pdf/1405.4464.pdf (sustainable software engineering)
[9] http://agilemanifesto.org/
[10] Several long-running projects routinely use modern bug trackers and continuous integration, e.g.: http://gitlab.abinit.org/, https://gitlab.com/octopus-code/octopus, http://qe-forge.org/, https://launchpad.net/siesta
[11] Transition of HPC Towards Exascale Computing, Volume 24 of Advances in Parallel Computing, E.H. D’Hollander, IOS Press, 2013, ISBN: 9781614993247
[12] See https://en.wikipedia.org/wiki/Open_standard and https://en.wikipedia.org/wiki/Participatory_design
[13] See http://www.doxygen.org/, http://www.sphinx-doc.org/, and http://readthedocs.org/
[14] See https://en.wikipedia.org/wiki/Test-driven_development and http://agiledata.org/essays/tdd.html
[15] See e.g. http://www.etp4hpc.eu/en/esds.html
[16] See e.g. https://easybuilders.github.io/easybuild/, https://github.com/LLNL/spack, https://github.com/snapcore/snapcraft, and https://www.macports.org/ports.php?by=category&substr=science
[17] http://esl.cecam.org/
[18] https://exdci.eu/newsroom/press-releases/exdci-towards-common-hpc-strategy-europe


State-of-the-Art Workshop: Improving the accuracy of ab-initio predictions for materials

If you are interested in attending this event, please visit the CECAM website here.

Workshop Description

Ab-initio simulation methods are the major tool for research in condensed matter physics, materials science, and quantum and molecular chemistry. They can be classified in terms of their accuracy and efficiency, but typically more accurate means less efficient and vice versa. The accuracy depends mainly on how accurately one can solve the electronic problem. The most accurate algorithms are the wave-function based methods, such as Full CI, Coupled Cluster (CC), and Quantum Monte Carlo (QMC), followed by the Density Functional Theory (DFT)-based methods and finally more approximate methods such as Tight-Binding. Another important consideration is how the accuracy of a given method scales with the size of the system under consideration. Among the wave-function based methods, the accuracy of traditional quantum chemistry methods can be systematically improved, but their scaling with system size limits their applicability to small molecules. QMC methods, on the other hand, have a much more tractable scaling. In spite of the “fermion sign problem” and the commonly used fixed-node approximation, they offer a way of systematically improving the accuracy, because the energies are variational upper bounds. Recently there has been much progress in the use of pseudopotentials and in the systematic improvement of nodal surfaces using backflow and multiple determinants [1, 2, 3].
Conversely, DFT-based methods rely on a plethora of different self-consistent mean-field approximations, each one tuned to best represent a class of systems but with limited transferability. Despite progress in developing more general functionals [4, 5, 6], DFT is missing an “internal” accuracy scale; its accuracy is generally established against more fundamental theories (like CC or QMC) or against experiments. DFT methods are very popular because of their favorable scaling with system size, the same as for QMC but with a smaller prefactor.
In a number of recent applications [7, 8] it was found that the inclusion of nuclear quantum effects (NQE) considerably worsens the agreement between DFT predictions and experiments. This is ascribed to the inaccuracies of DFT. It illustrates the importance of improving DFT functionals not with experimental data alone, but also with calculations using more fundamental methods. There has been a recent effort to establish the accuracy of DFT approximations by benchmarking against QMC calculations, not only for equilibrium geometries but also for thermal configurations. This benchmarking can be customized for individual molecules at a given temperature, pressure, and geometry [9, 10, 11, 12].
Another important aspect concerns finite-size effects in modelling extended systems. Although corrections can be developed for homogeneous systems, more complex situations with several characteristic length scales require system sizes that cannot be tackled by ab-initio methods. In these applications one needs to use an effective interaction energy. A recent development is the use of Machine Learning (ML) techniques to obtain energy functions with ab-initio accuracy [13, 14, 15]. The assessment of their transferability and accuracy is still unsolved to some extent, but progress is rapid. A related development is the use of ML methods to bypass the Kohn-Sham paradigm of DFT and directly address the potential-density map [16, 17, 18].

The following is a list of topics that will be discussed during the meeting:
• Benchmarking existing DFT functionals with QMC. DFT has the potential to be accurate, but the main problem with its predictive power is that its accuracy can be system dependent. QMC was instrumental in developing the first exchange-correlation approximations (e.g. LDA), and we envisage that it can play a substantial role in helping the discovery and tuning of new functionals. In particular, the tuning of dispersion interactions appears to be a crucial element that is still not fully controlled in modern DFT approximations, although it plays a crucial role in many systems, such as hydrogen and hydrogen-based materials like water.
• ML approaches with QMC accuracy. Machine Learning (ML) has attracted significant interest recently, mainly because of its potential to study real-life systems and to explore the phase space at a scale that is not accessible to ab-initio methods. However, the quality of the training set is crucial for ML methods. It is often possible to train an ML potential on small systems, where accurate energies and forces can be obtained by quantum chemistry methods, but training sets including larger systems are also needed. QMC has the potential to provide them, especially going forward with exascale computing.
• Opportunities for new exascale applications of QMC to impact simulations of larger systems and longer time scales. QMC is capable of exploiting parallelism very efficiently, and is probably one of the few methods already capable of running at the exascale level. ML methods on large data sets are also inherently parallel and directly usable on exascale machines.
• We will address the problem of applying and testing force fields derived for small systems on systems of much larger size.
• We will discuss the use of ML methods to derive new classes of wave functions for QMC calculations of complex systems.

[1] J. Kolorenc and L. Mitas, Rep. Prog. Phys. 74, 1 (2010).
[2] L. K. Wagner and D. M. Ceperley, Rep. Prog. Phys. 79, 094501 (2016).
[3] M. Taddei, M. Ruggeri, S. Moroni, and M. Holzmann, Phys. Rev. B 91, 115106 (2015).
[4] J. Heyd, G. Scuseria, and M. Ernzerhof, The Journal of Chemical Physics 118, 8207 (2003).
[5] K. Lee, É. Murray, L. Kong, B. Lundqvist, and D. Langreth, Physical Review B 82, 81101 (2010).
[6] K. Berland et al., Reports on Progress in Physics 78, 66501 (2015).
[7] M. A. Morales, J. McMahon, C. Pierleoni, and D. M. Ceperley, Physical Review Letters 110, 65702 (2013).
[8] M. Rossi, G. P, and M. Ceriotti, Physical Review Letters 117, 115702 (2016).
[9] R. C. Clay et al., Physical Review B 89, 184106 (2014).
[10] M. A. Morales et al., Journal of Chemical Theory and Computation 10, 2355 (2014).
[11] R. C. Clay, M. Holzmann, D. M. Ceperley, and M. A. Morales, Physical Review B 93, 035121 (2016).
[12] M. J. Gillan, F. Manby, M. Towler, and D. Alfè, The Journal of Chemical Physics 136, 244105 (2012).
[13] K. V. J. Jose, N. Artrith, and J. Behler, Journal of Chemical Physics 136, 194111 (2012).
[14] J. Behler, The Journal of Chemical Physics 145, 170901 (2016).
[15] V. Botu, R. Batra, J. Chapman, and R. Ramprasad, The Journal of Physical Chemistry C 121, 511 (2016).
[16] J. C. Snyder, M. Rupp, K. Hansen, K.-R. Müller, and K. Burke, Physical Review Letters 108, 253002 (2012).
[17] L. Li, T. E. Baker, S. R. White, and K. Burke, Phys. Rev. B 94, 245129 (2016).
[18] F. Brockherde et al., arXiv:1609.02815v3 (2017).


A Conversation on Neural Networks, from Polymorph Recognition to Acceleration of Quantum Simulations


With Prof. Christoph Dellago (CD), University of Vienna, and Dr. Donal Mackernan (DM), University College Dublin.



Recently there has been a dramatic increase in the use of machine learning in physics and chemistry, including its use to accelerate simulations of systems at an ab-initio level of accuracy, as well as for pattern recognition. It is now clear that these developments will significantly increase the impact of simulations on large scale systems requiring a quantum level of treatment, both for ground and excited states. These developments also lend themselves to simulations on massively parallel computing platforms, in many cases using classical simulation engines for quantum systems.




LibOMM: Orbital Minimization Method Library


The library LibOMM solves the Kohn-Sham equation as a generalized eigenvalue problem for a fixed Hamiltonian. It implements the orbital minimization method (OMM), which works within a density matrix formalism. The basic strategy of the OMM is to find the set of Wannier functions (WFs) describing the occupied subspace by direct unconstrained minimization of an appropriately-constructed functional. The density matrix can then be calculated from the WFs. The solver is usually employed within an outer self-consistency (SCF) cycle. Therefore, the WFs resulting from one SCF iteration can be saved and then re-used as the initial guess for the next iteration.
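
The strategy described above can be illustrated with a minimal NumPy sketch. It is not the LibOMM API: the function names (omm_functional, omm_solve, density_matrix, chain_hamiltonian) are hypothetical, the basis is assumed orthonormal (so the generalized eigenvalue problem reduces to a standard one), spin factors are omitted, and a plain fixed-step gradient descent stands in for whatever minimizer the library actually uses. The functional used here is the commonly quoted form E = Tr[(2I - S) H_W], with S and H_W the overlap and Hamiltonian matrices in the basis of the trial orbitals, applied to a Hamiltonian shifted to be negative definite.

```python
import numpy as np


def omm_functional(C, H):
    """Return E = Tr[(2I - S) Hw] and dE/dC for trial orbitals stored as columns of C."""
    S = C.T @ C                      # overlap matrix of the trial orbitals
    Hw = C.T @ H @ C                 # Hamiltonian projected onto the trial orbitals
    identity = np.eye(C.shape[1])
    energy = np.trace((2.0 * identity - S) @ Hw)
    gradient = 4.0 * H @ C - 2.0 * H @ C @ S - 2.0 * C @ Hw
    return energy, gradient


def omm_solve(H, n_occ, C0=None, n_iter=4000, seed=0):
    """Unconstrained minimization by fixed-step gradient descent (illustrative only)."""
    n_basis = H.shape[0]
    bound = np.linalg.norm(H, ord=np.inf)    # upper bound on the eigenvalue magnitudes
    eta = bound + 0.5                        # shift that makes H - eta*I negative definite
    H_shifted = H - eta * np.eye(n_basis)
    step = 0.05 / (bound + eta)              # conservative step for this toy problem
    if C0 is None:
        rng = np.random.default_rng(seed)
        C = 0.1 * rng.standard_normal((n_basis, n_occ))
    else:
        C = C0.copy()                        # warm start, e.g. from a previous SCF step
    for _ in range(n_iter):
        _, gradient = omm_functional(C, H_shifted)
        C -= step * gradient
    energy, _ = omm_functional(C, H)         # evaluate with the unshifted Hamiltonian
    return C, energy


def density_matrix(C):
    """Density matrix built from the (possibly non-orthogonal) orbitals."""
    S = C.T @ C
    return C @ (2.0 * np.eye(C.shape[1]) - S) @ C.T


def chain_hamiltonian(n_sites):
    """Toy tight-binding chain: zero on-site energy, nearest-neighbour hopping -1."""
    hop = -np.ones(n_sites - 1)
    return np.diag(hop, 1) + np.diag(hop, -1)


H = chain_hamiltonian(8)
C, e_omm = omm_solve(H, n_occ=3)
e_exact = np.linalg.eigvalsh(H)[:3].sum()
print(f"OMM energy:        {e_omm:.8f}")
print(f"Sum of 3 lowest:   {e_exact:.8f}")
```

Passing the converged orbitals back in through C0 (for example omm_solve(H, 3, C0=C, n_iter=10)) mimics the re-use of the WFs as the initial guess for the next SCF iteration: the minimization then starts at, or very near, the minimum.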

More information can be found in the module’s documentation here, and the source code is available from the E-CAM Gitlab here. The algorithms and implementation of the library are described in https://arxiv.org/abs/1312.1549v1.

This module is an effort of the Electronic Structure Library Project (ESL), and it was initiated during an E-CAM Extended Software Development Workshop in Zaragoza in June 2016. This and other codes, which revolve around the broad theme of solvers, were recently reported in Deliverable D2.1: Electronic structure E-CAM modules I, available for download and consultation here.

Practical application and exploitation of the module

libOMM is one of the libraries supported and enhanced by the Electronic Structure Infrastructure ELSI [1], which in turn is interfaced with the DGDFT, FHI-aims, NWChem, and SIESTA codes.

[1] The electronic structure infrastructure ELSI provides and enhances scalable, open-source software library solutions for electronic structure calculations in materials science, condensed matter physics, chemistry, molecular biochemistry, and many other fields [https://arxiv.org/abs/1705.11191v1].


Solvers for quantum atomic radial equations

SQARE (solvers for quantum atomic radial equations) is a library of utilities intended for dealing with functions discretized on radial meshes, wave-equations with spherical symmetry and their corresponding quantum states. The utilities are segregated into three levels: radial grids and functions, ODE solvers, and states.
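
To give a flavour of the kind of problem these utilities address, here is a minimal sketch of a radial wave equation discretized on a mesh. It does not use SQARE or reproduce its interfaces: the function name radial_ground_state, the uniform mesh, and the simple three-point finite-difference scheme are all assumptions made for illustration. It solves the radial Schrödinger equation -u''/2 + [l(l+1)/(2r^2) + V(r)] u = E u in Hartree atomic units, with u(r) = r R(r), for the hydrogen atom.

```python
import numpy as np


def radial_ground_state(potential, l=0, r_max=25.0, n_points=1000):
    """Lowest state of the radial equation on a uniform mesh (illustrative only).

    Boundary conditions u(0) = u(r_max) = 0 are imposed by leaving both
    end points out of the mesh.
    """
    h = r_max / (n_points + 1)                 # mesh spacing
    r = h * np.arange(1, n_points + 1)         # interior mesh points
    # Three-point finite difference for -u''/2, plus centrifugal and external terms.
    diagonal = 1.0 / h**2 + l * (l + 1) / (2.0 * r**2) + potential(r)
    off_diagonal = np.full(n_points - 1, -0.5 / h**2)
    hamiltonian = (np.diag(diagonal)
                   + np.diag(off_diagonal, 1)
                   + np.diag(off_diagonal, -1))
    energies, states = np.linalg.eigh(hamiltonian)
    u = states[:, 0] / np.sqrt(h)              # normalize so that sum(u**2) * h = 1
    return energies[0], r, u


def coulomb(r):
    """Bare Coulomb potential of a hydrogen nucleus, in Hartree atomic units."""
    return -1.0 / r


energy, r, u = radial_ground_state(coulomb)
print(f"Hydrogen 1s energy on the mesh: {energy:.5f} Ha (analytic value: -0.5 Ha)")
```

The three levels mentioned above map loosely onto this sketch: the arrays r and u play the role of a radial grid and a function on it, the eigenvalue solve stands in for an ODE solver, and the returned energy and u describe a state. Production codes typically prefer non-uniform (for example logarithmic) meshes, which resolve the region near the nucleus with far fewer points.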

For more information, see the documentation of the modules SQARE radial grids and functions, SQARE ODE, and SQARE states.


Scoping Workshop: From the Atom to the Molecule

If you are interested in attending this workshop, please visit the CECAM website below.


Extended Software Development Workshop: Wannier90

The aim of the workshop is to share recent developments related to the generation and use of maximally-localised Wannier functions and to either implement these developments in, or interface them to, the Wannier90 code. It will also be an opportunity to improve and update existing interfaces to other codes and to write new ones. The format will be deliberately open, with the majority of the time allocated for coding and discussion.


State of the art workshop: Electronic Structure

This is the third state of the art workshop for 2016. It is organised by the CECAM-UK-HARTREE node and will focus on electronic structure. Scoping workshops provide a forum to survey new methods and developments in simulation. These workshops inform the software that will be developed for the E-CAM library.


Electronic Structure Library Coding Workshop

This is the first E-CAM Extended Software Development Workshop, taking place in Zaragoza, Spain. The Electronic Structure Library is a new project to build a community-maintained library of software for electronic structure simulations. The goal is to create an extended library that everyone can employ to build their own packages and projects.