High Throughput Computing Workshop

E-CAM is organising a one week (16-20 July 2018) Extended Software Development Workshop in Turin, Italy that will focus on intelligent high throughput computing (HTC) as a technique that crosses many domains within the molecular simulation community in general and the E-CAM community in particular. The workshop will be a hybrid learning/coding event targeted at scientists with particular problems to solve. There will be 3 days of tutorial content presenting 3 different task management frameworks and 2 days code development time with the framework developers to help you integrate them into your application.

HTC is a computing paradigm focused on the execution of many loosely coupled tasks. It is a useful and general approach to parallelizing (nearly) embarrassingly parallel problems. Distributed computing middleware, such as Dask.distributed or COMP Superscalar (COMPSs), can include tools to facilitate HTC, although there may be challenges extending such approaches to the exascale.

Across scientific fields, HTC is becoming a necessary approach in order to fully utilize next-generation computer hardware. As an example, consider molecular dynamics: Excellent work over the years has developed software that can simulate a single trajectory very efficiently using massive parallelization. Unfortunately, for a fixed number of atoms, the extent of possible parallelization is limited. However, many methods, including semi-classical approaches to quantum dynamics, and some approaches to rare events, require running thousands of independent molecular dynamics trajectories. Intelligent HTC, which can treat each trajectory as a task and manage data dependencies between tasks, provides a way to run these simulations on hardware up to the exascale, thus opening the possibility of studying previously intractable systems.

In practice, many scientific programmers are not aware of the range of middleware to facilitate parallel programming. When HTC-like approaches are implemented as part of a scientific software project, they are often done manually, or through custom scripts to manage SSH, or by running separate jobs and manually collating the results. Using the intelligent high-level approaches enabled by distributed computing middleware will simplify and speed up development.

Furthermore, middleware frameworks can meet the needs of many different computing infrastructures. For example, in addition to working within a single job on a cluster, Dask.distributed and COMPSs include support for working through a cluster’s queueing system or working on a distributed grid. Moreover, architecting a software package such that it can take advantage of one HTC library will make it easy to use other HTC middleware. Having all of these possibilities immediately available will enable developers to quickly create software that can meet the needs of many users.

For more information and to register to attend this event, please click on this link.