Grid'5000 user report for Stephane Genaud

User information

Stephane Genaud (users, user, account-manager, nancy, ml-users user)


  • P2P-MPI (Middleware) [achieved]

    P2P-MPI is both a middleware framework and a communication library for message-passing parallel programs. The middleware follows peer-to-peer principles: each machine started with P2P-MPI becomes a peer able to share its CPU or use those of other peers. This approach makes the system autonomous, thanks to the dynamic discovery of the peers available at any moment to run a parallel program (the discovery of resources is based on JXTA). The selection of the most suitable peers relies on a multi-criteria strategy (network latency, concentration of processes on as few machines as possible, presence of the input data in the peers' caches). The communication library is an implementation of MPJ (the adaptation of MPI to Java) which integrates fault tolerance through replication of computations. The user chooses the replication degree of the execution, and the system transparently maintains the consistency of the execution. In case of failure, the application can continue as long as at least one intact copy of each process remains. P2P-MPI targets desktop grids: it runs with ordinary user privileges, and applications must be written in Java against the MPJ-conformant P2P-MPI message-passing library.
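    The survival rule behind the replication scheme can be sketched abstractly. The following is a minimal illustration in plain Python (not P2P-MPI code; function and host names are hypothetical): execution can proceed as long as every logical rank keeps at least one live replica.

    ```python
    def can_continue(num_ranks, replicas, failed):
        """Return True if every logical rank still has a live replica.

        replicas: dict mapping each logical rank to the set of hosts
                  holding a copy of its process (replication degree = set size).
        failed:   set of hosts that have crashed.
        """
        return all(replicas[rank] - failed for rank in range(num_ranks))

    # A 4-process run with replication degree 2 (hypothetical host names):
    replicas = {0: {"h0", "h4"}, 1: {"h1", "h5"},
                2: {"h2", "h6"}, 3: {"h3", "h7"}}

    print(can_continue(4, replicas, failed={"h1", "h6"}))  # True: one copy of each rank survives
    print(can_continue(4, replicas, failed={"h2", "h6"}))  # False: both copies of rank 2 lost
    ```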

    • In 2008, we successfully tested our co-allocation strategies on more than 600 nodes spanning 6 sites of Grid'5000.
    • In August 2005, we successfully ran P2P-MPI on 128 nodes of Grid'5000. The ray-tracer application of the Java Grande benchmark suite was benchmarked on two configurations: one with all 128 nodes taken at the Orsay site, the other with 64 nodes at Orsay plus 64 nodes at Sophia.

    More information here
  • Seismic tomography: Gimbos & Ray2mesh (Application) [in progress]
    Description: Gimbos and Ray2mesh form a software suite for seismology. It contains a seismic ray tracer, a mesh library dedicated to geophysics, and a mesh refinement tool. The objective of this project is to build a realistic Earth model from seismic tomography by tracing all seismic events recorded since 1965.
    Results: We had already run our software on parts of the data set on parallel computers and on small grids. A larger experiment (tomography based on 1.1 million rays) has been made possible by Grid'5000. Several configurations have been tested to assess the influence of the distance between sites and of the number of processors on application scalability. Up to 458 processors from 5 sites have been involved in a single run. The application showed surprisingly good scalability (see the experiment report published in the IPDPS proceedings, High Performance Computing Workshop).
    More information here
  • Analysis of the AllToAllV in cluster of cluster environments (Networking) [in progress]
    Description: The context of this work is the total exchange of data between processes running in different clusters linked by a wide-area link (backbone). Several sites of the Grid'5000 testbed let us experiment in such an environment. Starting from our work on the AllToAll operation, we have extended the study to the AllToAllV operation. In MPI terminology, AllToAll is a total exchange of equally sized pieces of data between participating processes, whereas in AllToAllV each piece of data may have a different size. Unlike AllToAll, it is difficult to choose a good routing algorithm for AllToAllV (e.g. binomial tree, Bruck's algorithm) depending on the data size, since the processes do not know how much data the others send. Today's MPI implementations (OpenMPI, MPICH-2, GridMPI) use a very simple implementation of AllToAllV, in which every process asynchronously sends its data to, and receives data from, all other processes, then waits for the completion of all communications. Simple as it is, this strategy performs better than any of the optimized routing schemes used in the other collective operations.

    We have tried more sophisticated approaches based on message aggregation, congestion regulation, and load balancing. In this kind of strategy, we select forwarder processes in each cluster, whose task is to aggregate individual messages into a larger one sent over the backbone. The number of forwarders controls how many TCP streams simultaneously compete for the backbone. We can balance the sizes of the aggregated messages at the cost of an extra step at the beginning, in which processes first exchange the amounts of data they will send. However, our proposals have so far only equaled the original AllToAllV implementation, bringing no significant improvement. One key finding of this work is that message aggregation does not improve the overall throughput of the streams over the wide-area link. Work is currently under way to model the behavior of the various routing strategies in the PLogP model.
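    The naive exchange described above, and the preliminary size exchange used by the aggregation variant, can be sketched abstractly. The following is a plain-Python simulation of the data movement (no real MPI API is used; all names are illustrative):

    ```python
    def alltoallv(send):
        """Simulate AllToAllV among n processes.

        send[i][j] is the (variable-length) list process i sends to process j;
        the result recv[i][j] is what process i received from process j.
        In the naive implementations described above, every process posts all
        its sends and receives asynchronously and waits on their completion;
        the resulting data movement is exactly this transposition.
        """
        n = len(send)
        return [[send[j][i] for j in range(n)] for i in range(n)]

    def size_exchange(send):
        """Preliminary step of the aggregation variant: processes exchange
        only the sizes of their messages (an AllToAll on counts), so that
        forwarders can plan balanced aggregated messages."""
        n = len(send)
        return [[len(send[j][i]) for j in range(n)] for i in range(n)]

    # 3 processes with unequal message sizes:
    send = [[[0], [1, 1], [2, 2, 2]],
            [[10], [], [12]],
            [[20, 20], [21], [22]]]
    recv = alltoallv(send)
    print(recv[0])                 # process 0 received: [[0], [10], [20, 20]]
    print(size_exchange(send)[2])  # sizes destined for process 2: [3, 1, 1]
    ```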



    Success stories and benefits from Grid'5000

    • Overall benefits
    • Grid'5000 is a unique tool for testing software at a large scale. Gathering as many nodes from a distributed set of resources would otherwise require fetching varied, ever-changing resources from many institutions or individuals. In contrast, Grid'5000 makes it possible to conduct experiments with identical software installations on a large subset of similar nodes. Consequently, the reproducibility of experiments, previously out of reach, becomes a reality with Grid'5000.

    last update: 2011-06-22 11:20:00