Hemera:WG:Methodology

From Grid5000
Revision as of 19:05, 2 March 2011 by Cperez

Completing challenging experiments on Grid’5000 (Methodology)

Leaders: Lucas Nussbaum (ALGORILLE), Olivier Richard (MESCAL)

Experimental platforms like Grid’5000 or PlanetLab provide invaluable help to the scientific community by making it possible to run very large-scale experiments in a controlled environment. However, while performing relatively simple experiments is generally easy, it has been shown that the complexity of completing more challenging experiments (involving a large number of nodes, changes to the environment to introduce heterogeneity or faults, or instrumentation of the platform to extract data during the experiment) is often underestimated.

This working group will explore several complementary approaches that form the basic building blocks for the next level of experimentation on large-scale experimental platforms. This encompasses several aspects.

First, designing a complex experiment raises strong methodological issues: the experimental scenario, as well as the experimental conditions and metrics, need to be carefully chosen. At this stage, it is also important to consider experimentation at real scale as one element in a set of different evaluation methods, and the possibility of combining it with other complementary approaches such as modelling and simulation.

After the various parameters of the experiment have been chosen, the experiment needs to be translated into a series of actions on the platform. Reserving and controlling thousands of resources requires the use of specialized tools, such as scalable parallel launchers, data distribution solutions, and domain-specific languages able to express the numerous steps that compose experiments in an efficient way.
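As a rough illustration of what "a series of actions on the platform" can look like, the sketch below expresses an experiment as an ordered list of steps, each applied to all nodes in parallel. It is not a Grid’5000 tool: the names `run_step`, `run_experiment`, `deploy` and `start_app` are hypothetical, and the actions are local placeholders standing in for real remote commands. One assumption is made explicit: a node that fails a step is dropped from the following steps, which is only one of several possible policies.

```python
from concurrent.futures import ThreadPoolExecutor


def run_step(action, nodes, max_workers=32):
    """Apply `action` to every node in parallel.

    Returns a dict mapping each node to its result, or to the exception
    it raised. Failures are recorded rather than propagated because, at
    scale, some nodes are expected to fail.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {node: pool.submit(action, node) for node in nodes}
        for node, future in futures.items():
            try:
                results[node] = future.result()
            except Exception as exc:
                results[node] = exc
    return results


def run_experiment(steps, nodes):
    """Run named steps in order; nodes that fail a step skip later steps."""
    alive = list(nodes)
    log = []
    for name, action in steps:
        results = run_step(action, alive)
        failed = [n for n in alive if isinstance(results[n], Exception)]
        alive = [n for n in alive if n not in failed]
        log.append((name, len(alive), failed))
    return alive, log


# Placeholder actions (hypothetical); real ones would run remote commands.
def deploy(node):
    if node.endswith("-3"):
        raise RuntimeError("deployment failed")
    return "deployed"


def start_app(node):
    return "started"


nodes = [f"node-{i}" for i in range(1, 6)]
alive, log = run_experiment([("deploy", deploy), ("start", start_app)], nodes)
```

A thread pool is enough for a sketch, but it does not address the scalability concerns raised above: controlling thousands of nodes typically calls for hierarchical or tree-based launchers rather than a single fan-out from one machine.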

While Grid’5000 already provides some heterogeneity, it is often necessary to introduce more heterogeneity and/or changing experimental conditions during the experiment. For example, validating the ability of an application to cope with degraded network conditions requires the ability to control the experimental environment in ways that are not available for standard Grid’5000 experiments. Techniques such as emulation or fault injection provide some answers to that problem, but there is still some work to be done before they are easily usable on Grid’5000.
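To make the idea of fault injection concrete, here is a minimal application-level sketch. It assumes a hypothetical `send` function and wraps it so that calls fail with a configurable probability, which lets one check how a retry policy behaves under degraded conditions. It only illustrates the principle; it is not how emulation is performed on Grid’5000, where degraded conditions would normally be introduced at the network or system level.

```python
import random


class FaultInjector:
    """Wrap a callable so that each call fails with probability `loss_rate`."""

    def __init__(self, func, loss_rate, seed=0):
        if not 0.0 <= loss_rate <= 1.0:
            raise ValueError("loss_rate must be between 0 and 1")
        self.func = func
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)  # seeded so that runs are reproducible
        self.injected = 0  # number of faults injected so far

    def __call__(self, *args, **kwargs):
        if self.rng.random() < self.loss_rate:
            self.injected += 1
            raise ConnectionError("injected fault")
        return self.func(*args, **kwargs)


def call_with_retries(func, attempts, *args):
    """Call `func` up to `attempts` times, re-raising the last failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return func(*args)
        except ConnectionError as exc:
            last_error = exc
    raise last_error


# Hypothetical operation under test.
def send(message):
    return f"ack:{message}"


reliable = FaultInjector(send, loss_rate=0.0)
broken = FaultInjector(send, loss_rate=1.0)
```

Seeding the random generator keeps the injected fault sequence identical from one run to the next, which matters when an experiment has to be repeated under the same conditions.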

Finally, the extraction of results from experiments in Grid’5000 is a problem in itself: it requires tools that are both scalable (possibly collecting data from thousands of nodes), accurate, and non-intrusive (to limit the observer effect). While monitoring systems like Ganglia already provide some functionality, they have numerous limitations that prevent them from being usable in the context of Grid’5000 experiments.