Another successful online training event !

 

Our last Extended Software Development Workshop (ESDW) took place on the 18th-22nd January[1], and given its length (5 days) and it’s nature (theory and hands-on training sessions) it was a real success! “The workshop went very well, participants seem to have enjoyed and they lasted until the end !”, said organiser Jony Castagna, computational scientist and E-CAM programmer at UKRI STFC Daresbury Laboratory. The event, organised at the CECAM-UK-DARESBURY Node[2], focused on HPC for mesoscale simulation, and aimed at introducing participants to Dissipative Particle Dynamics (DPD) and the mesoscale simulation package DL_MESO [3] (DL_MESO_DPD). DL_MESO is developed at UKRI STFC Daresbury by Michael Seaton, computational chemist at Daresbury and also an organiser of this event.

Another component of this workshop was parallel programming of hybrid CPU-GPU systems. In particular, DL_MESO has recently been ported to multi-GPU architectures[4] and runs efficiently up to 4096 GPUs, an effort supported by E-CAM (thank you Jony!). Part of this workshop was dedicated to theory lectures and hands-on sessions on GPU architectures and OpenACC (NVidia DLI course) given by Jony, which is an NVidia DLI Certified Instructor. He said “The intention is not only to port mesoscale solvers on GPUs, but also to expose the community to this new programming paradigm, which they can benefit from in their own fields of research”.

All sessions in this ESDW were followed by discussions and hands-on exercises. Organisers were supported by another STFC colleague and former E-CAM post-doc Silvia Chiacchiera. One of the participants wrote “Thank you so much for your effort. This workshop will cause a significant shift in my thinking and approach”.

21 people registered for to the event; but by the third day there were only 9… from which 5 lasted until the last session! A picture taken from the last session talks by itself 🙂

Do you want to join our next training event ? Check out our programme :

Full calendar at https://www.e-cam2020.eu/calendar/.

 

References

[1] https://www.cecam.org/workshop-details/8

[2] https://www.cecam.org/cecam-uk-daresbury

[3] Seaton M.A. et al. “DL_MESO: highly scalable mesoscale simulations”, Molecular Simulation 2013, 39 http://www.cse.clrc.ac.uk/ccg/software/DL_MESO/

[4] J. Castagna, X. Guo, M. Seaton and A. O’Cais, “Towards extreme scale dissipative particle dynamics simulations using multiple GPGPUs”,
Computer Physics Communications, 2020, 107159
DOI: 10.1016/j.cpc.2020.107159

Share

LearnHPC: dynamic creation of HPC infrastructure for educational purposes

 

Abstract

In a newly successful PRACE-ICEI proposal, E-CAM, FocusCoE, HPC Carpentry and EESSI join forces to bring HPC resources to the classroom in a simple, secure and scalable way. Our plan is to reproduce the model developed by the Canadian open-source software project Magic Castle. The proposed solution creates virtual HPC infrastructure(s) in a public cloud, in this case on the Fenix Research Infrastructure, and generates temporary event-specific HPC clusters for training purposes, including a complete scientific software stack. The scientific software stack is fully optimised for the available hardware and will be provided by the European Environment for Scientific Software Installations (EESSI). 

Description 

EU-wide requirements for HPC training are exploding as the adoption of HPC in the wider scientific community gathers pace. However, the number of topics that can be thoroughly addressed without providing access to actual HPC resources is very limited, even at the introductory level. In cases where such access is available, security concerns and the overhead of the process of provisioning accounts make the scalability of this approach questionable.

EU-wide access to HPC resources on the scale required to meet the training needs of all countries is an objective that we attempt to address with this project. The proposed solution essentially provisions virtual HPC system(s) in a public cloud, in this case on the Fenix Research Infrastructure. The infrastructure will dynamically create temporary event-specific HPC clusters for training purposes, including a scientific software stack. The scientific software stack will be provided by the European Environment for Scientific Software Installations (EESSI) which uses a software distribution system developed at CERN, CernVM-FS, and makes a research-grade scalable software stack available for a wide set of HPC systems, as well as servers, desktops and laptops (including MacOS and Windows!). 

The concept is built upon the solution of Compute Canada, Magic Castle, which aims to recreate the Compute Canada user experience in public clouds (there is even a presentation where the main developer creates a cluster just by talking to his phone!). Magic Castle uses the open-source software Terraform and HashiCorp Language (HCL) to define the virtual machines, volumes, and networks that are required to replicate a virtual HPC infrastructure. 

In addition to providing a dynamically provisioned HPC resource, the project will also offer a scientific software stack provided by EESSI. This model is also based on a Compute Canada approach and enables replication of the EESSI software environment outside of any directly related physical infrastructure. 

Our adaption of Magic Castle aims to recreate the EESSI HPC user experience, for training purposes, on the Fenix Research Infrastructure.  After deployment, the user is provided with a complete HPC cluster software environment including a Slurm scheduler, a Globus Endpoint, JupyterHub, LDAP, DNS, and a wide selection of research software applications compiled by experts with EasyBuild.

The architecture of the solution is best represented by the graphic below (taken from the Compute Canada documentation at https://github.com/ComputeCanada/magic_castle/tree/master/docs):

Cloud Cluster Architecture Overview ©Magic Castle (https://github.com/ComputeCanada/magic_castle)

With the resources made available to the project, we plan to run 6 HPC training events from January to July 2021. These training events are connected to the Centres of Excellence E-CAM and FocusCoE and with HPC Carpentry.

Share

The launch of the E-CAM Online Training Portal

 

We are pleased to announce that our E-CAM training portal is now online. Access instructions here.

The goals and expected impacts for our online training infrastructure are to:

  •   Collect the content captured at our Extended Software Development Workshops (ESDWs), allowing participants to re-visit lectures or demonstrations in their own time, both during and after the meeting. Such material can also be used by people who did not have the opportunity to attend the ESDW in person (particularly interested industries);
  •   Generate online training modules for each ESDW, which will be a set of preparatory materials shared with the participants of the event and that will allow everyone to acquire the same basic knowledge before the meeting;
  •   Be a repository for the data associated to our events, such as captured lectures, lecture materials, reading materials, tutorial content and software requirements;
  •   Build tutorials on programming best practices to develop software for extreme-scale hardware, that we can propose to the extended E-CAM community;
  •   Associate with other groups and projects with similar training scope, to cover for different and broader training material.

 

Information on the access to the portal, terminology and instructions for ESDW participants is at this link. The content of the training portal  is freely available upon registration, but we also keep a selection of publicly available lectures accessible directly from the E-CAM website.

 

Share