From Grid5000

Transparent, safe and efficient large scale computing (Computing)

Leaders: Stéphane Genaud (ICPS) and Fabrice Huet (OASIS)

With the emergence of new technologies, real-life problem-solving applications require more and more computing resources in many application domains, such as high-energy physics, bioinformatics, space and earth observation, life sciences and health, nanotechnology, and so on. To meet requirements on this scale, more and more resources are aggregated into large-scale distributed computing systems made of hundreds or thousands of resources belonging to one or several administrative domains. These resources are not only heterogeneous in themselves (CPU, memory, OS, available software/services, etc.) but also heterogeneous in their connectivity (network properties, credentials, etc.). Moreover, they are volatile, either because they belong to shared environments or because they fail. Users will benefit from such a dramatic increase in potential computational capabilities only if applications can efficiently and safely make use of such hostile environments.

Efficiency is closely related to the scalability of an application, i.e. its ability to reduce execution time as the number of processors used increases (strong scaling) or its ability to process larger datasets in constant time using more processors (weak scaling). The main obstacles to scalability are the latency of the network that interconnects the different hardware components of the target infrastructure, and the heterogeneity of the computing resources and network links. Further, volatility needs to be tackled to provide an acceptable environment. These issues should be addressed at several levels: at the application level, algorithms should favor asynchrony and exploit the hierarchy of resources using constructs provided at the programming model level, while the middleware level should hide technical details and glitches from applications so as to provide transparent usage.
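
The two notions of scaling mentioned above can be illustrated with a minimal sketch. The function names and the timings used here are hypothetical and chosen only for illustration; they are not measurements from Grid'5000 or from any application studied by the working group.

```python
def strong_scaling(t_one, t_p, p):
    """Speedup and parallel efficiency for a FIXED problem size.

    t_one: run time on one processor, t_p: run time on p processors.
    Ideal strong scaling gives speedup == p, i.e. efficiency == 1.
    """
    speedup = t_one / t_p
    return speedup, speedup / p


def weak_scaling_efficiency(t_one, t_p):
    """Efficiency when the problem size grows in proportion to p.

    t_one: run time for one unit of work on one processor,
    t_p: run time for p units of work on p processors.
    Ideal weak scaling keeps the run time constant, i.e. efficiency == 1.
    """
    return t_one / t_p


# Hypothetical timings in seconds (assumptions, not real data).
speedup, efficiency = strong_scaling(t_one=100.0, t_p=16.0, p=8)
weak_efficiency = weak_scaling_efficiency(t_one=100.0, t_p=125.0)
print(speedup, efficiency, weak_efficiency)
```

With these made-up numbers, 8 processors give a speedup below 8 and a weak-scaling efficiency below 1, which is the kind of gap that latency, heterogeneity and volatility tend to cause.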

This working group aims to study approaches that make it possible to overcome the limitations that prevent applications from scaling efficiently and safely. These limitations can have various origins: hardware, algorithms, or programming models. From the application and experimentation points of view, the focus will be on, but not limited to, large-scale combinatorial optimization, stochastic control for energy management, molecular biology, hydrology and electromagnetism.