Hemera:WG:Scheduling

From Grid5000

Efficient exploitation of highly heterogeneous and hierarchical large-scale systems (Scheduling)

Leaders: Olivier Beaumont (CEPAGE), Frédéric Vivien (GRAAL)

Current computing platforms tend to be large-scale hierarchical aggregations of local systems with very different characteristics. Most existing work deals with trivially parallel problems made of a large number of independent tasks. Many applications, however, are aggregations of interdependent subtasks that exchange intermediate results. Moreover, most algorithmic solutions rely on some centralized mechanism and are thus not scalable; they also assume precise and comprehensive knowledge of platforms and applications, whose characteristics are supposed to be static. The challenge is then to use such platforms efficiently to run very demanding computing applications. To meet this challenge, several problems must be addressed:

  • Because of the nature of the platforms, proposed solutions should not be centralized but rather distributed, or even rely on a peer-to-peer architecture. In practice, our a priori knowledge of applications and platforms is fragmentary and approximate, and many characteristics evolve dynamically. Using these platforms efficiently therefore requires robust algorithmic solutions, that is, solutions able to cope with uncertainty and dynamicity. Developing such solutions can, for instance, involve techniques such as sensitivity analysis and stochastic scheduling. New paradigms such as game theory should also be explored.
  • As proposed algorithmic solutions must be practical, they must be based on realistic platform models, and especially realistic models of interconnection networks. The relevant models must be identified, and then the algorithmic implications studied.
  • As data-induced communications can greatly impact application running times, special care must be taken with data placement so as to minimize data movement, or at least contention.
  • To handle non-trivially parallel applications efficiently, strategies must be designed to schedule task graphs and workflows.
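
To make the last point concrete, here is a minimal sketch of scheduling a task graph on heterogeneous processors. It uses a generic greedy list-scheduling heuristic (each task goes to the processor on which it finishes earliest), which is a textbook technique and not a method attributed to this working group. The function name `list_schedule`, the input layout, and the example values are all hypothetical, and communication costs are ignored, which is precisely the kind of simplification the points above warn against.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def list_schedule(deps, work, speeds):
    """Greedy earliest-finish-time list scheduling (illustrative sketch).

    deps:   {task: set of predecessor tasks}
    work:   {task: amount of work, in arbitrary units}
    speeds: list of processor speeds (work units per time unit)

    Assumption: communication costs between tasks are ignored.
    Returns {task: (processor index, start time, finish time)}.
    """
    proc_free = [0.0] * len(speeds)  # time at which each processor becomes idle
    schedule = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(deps).static_order():
        # A task is ready once all of its predecessors have finished.
        ready = max((schedule[p][2] for p in deps.get(task, ())), default=0.0)
        best = None
        for i, speed in enumerate(speeds):
            start = max(ready, proc_free[i])
            finish = start + work[task] / speed
            if best is None or finish < best[2]:
                best = (i, start, finish)
        schedule[task] = best
        proc_free[best[0]] = best[2]
    return schedule


# Hypothetical diamond-shaped task graph: a -> {b, c} -> d.
deps = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
work = {"a": 4, "b": 4, "c": 2, "d": 2}
speeds = [2.0, 1.0]  # one fast processor, one slow processor

sched = list_schedule(deps, work, speeds)
for task, (proc, start, finish) in sorted(sched.items()):
    print(f"{task}: processor {proc}, start {start}, finish {finish}")
```

The heuristic is centralized and assumes exact knowledge of task sizes and processor speeds, so it only illustrates the baseline that distributed and robust approaches aim to improve on.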