Grid'5000 user report for
- Kaapi - scalability study (Middleware) [in progress]
Description: KAAPI stands for Kernel for Adaptative, Asynchronous Parallel and Interactive programming. It is a C++ library for executing multithreaded computations with data flow synchronization between threads. The library can schedule fine- and medium-grain programs on distributed machines. The data flow graph is dynamic (unfolded at runtime). Target architectures are clusters of SMP machines and heterogeneous grids. KAAPI promotes fine-grain recursive multithreaded programming. Such programs run with high efficiency on various architectures and with varying numbers of processors (the useful number of processors is limited by the average parallelism of the program). Moreover, this efficiency is captured by a predictive cost model that relates the time Tp of an execution on p processors to the time T1 of the execution on 1 processor: Tp ≤ T1 / p + T∞
Results: Grid'5000 was used to evaluate the scalability of the KAAPI implementation. Experiments are in progress. The library has been ported to various platforms (Linux, MacOS X, ia32, ia64, Opteron and PPC). Experiments on various applications are the next step.
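The cost model quoted in the description above, Tp ≤ T1 / p + T∞, can be explored numerically. The following is a minimal sketch, not part of KAAPI: it assumes that T∞ denotes the execution time on an unbounded number of processors (the critical path), as is usual for this kind of model, and the function names and timing values are hypothetical.

```python
def predicted_time_bound(t1: float, t_inf: float, p: int) -> float:
    """Upper bound on Tp from the cost model: T1 / p + T_inf.

    t1    -- time of the execution on 1 processor
    t_inf -- assumed here to be the critical-path time
    p     -- number of processors
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    return t1 / p + t_inf


def speedup_lower_bound(t1: float, t_inf: float, p: int) -> float:
    """Speedup T1 / Tp guaranteed when Tp meets the bound above."""
    return t1 / predicted_time_bound(t1, t_inf, p)


if __name__ == "__main__":
    # Hypothetical program: 1000 s of sequential work, 2 s critical path,
    # hence an average parallelism of T1 / T_inf = 500.
    t1, t_inf = 1000.0, 2.0
    for p in (1, 10, 100, 1000, 10000):
        bound = predicted_time_bound(t1, t_inf, p)
        speedup = speedup_lower_bound(t1, t_inf, p)
        print(f"p={p:6d}  Tp <= {bound:9.3f} s  speedup >= {speedup:8.2f}")
```

With these values the guaranteed speedup stays close to p while p is well below the average parallelism (500) and flattens beyond it, which illustrates why the useful number of processors is limited by the average parallelism of the program.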
- Heterogeneous communication support for the Kaapi middleware (Middleware) [achieved]
Description: KAAPI stands for Kernel for Adaptative, Asynchronous Parallel and Interactive programming. It is a C++ library for executing multithreaded computations with data flow synchronization between threads. The library can schedule fine- and medium-grain programs on distributed machines. The data flow graph is dynamic (unfolded at runtime). We have worked on developing heterogeneous communication support inside Kaapi.
- Fault Tolerance support for Kaapi (Middleware) [achieved]
Description: To provide transparent fault-tolerance support, Kaapi implements a full stack of tools such as fault detectors, checkpoint servers, and checkpoint and recovery protocols. This research mainly focuses on offering original checkpoint and recovery protocols based on the knowledge of the application given by its data flow graph. Two protocols have been designed, TIC and CCK:
  - TIC (Theft-Induced Checkpointing) is a specialization of the classical Communication-Induced Checkpointing (CIC) protocol for work-stealing scheduling.
  - CCK (Coordinated Checkpointing in Kaapi) is an improvement of the classical coordinated checkpoint protocol that allows partial recovery in case of failure. It uses the data flow graph of the application and its dependencies to determine the set of tasks needed to restart the application correctly. It mainly targets iterative simulations based on a domain decomposition method.
The purpose of these experiments is to validate and evaluate these fault tolerance protocols in real conditions.
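The idea of using data flow dependencies to decide which tasks must be restarted can be illustrated with a small graph traversal. This is a deliberately simplified sketch and not the CCK protocol itself: the graph representation, the function name `tasks_to_restart`, and the notion of "available" results (outputs that can still be read from a checkpoint or from a surviving process) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set


def tasks_to_restart(
    deps: Dict[str, Iterable[str]],
    lost: Set[str],
    available: Set[str],
) -> Set[str]:
    """Return the tasks to re-execute after a failure (hypothetical model).

    deps      -- maps each task to the tasks whose outputs it reads
    lost      -- tasks whose results were lost in the failure
    available -- tasks whose results can still be obtained without
                 re-execution (checkpointed or held by a survivor)
    """
    restart: Set[str] = set()
    stack = list(lost)
    while stack:
        task = stack.pop()
        # Stop at tasks already scheduled for restart or whose output
        # is still available: nothing upstream of them is needed.
        if task in restart or task in available:
            continue
        restart.add(task)
        # The inputs of a restarted task must exist, so visit producers.
        stack.extend(deps.get(task, ()))
    return restart


if __name__ == "__main__":
    # Hypothetical diamond-shaped graph: a -> b, a -> c, (b, c) -> d -> e.
    deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"], "e": ["d"]}
    # Task d was lost; results of a and b are still available.
    print(sorted(tasks_to_restart(deps, lost={"d"}, available={"a", "b"})))
```

In this example only `c` and `d` are re-executed: `b` and `a` are not restarted because their results are still available, which is the kind of partial recovery that a dependency-aware protocol aims for, as opposed to rolling back every task.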
- Bob++ - Kaapi (Combinatorial optimization) (Application) [planned]
Description: Within the ACI DOCG project, we have parallelized a branch-and-cut algorithm to solve the quadratic assignment problem. The experiment will consist of reproducing previously published results on very large instances that required about 13 years of sequential CPU time. It depends on the completion of the fault tolerance and middleware scalability experiments.
- Un protocole de sauvegarde / reprise coordonné pour les applications à flot de données reconfigurables  (national)
EntryType: article Author: Besseron, Xavier and Pigeon, Laurent and Gautier, Thierry and Jafar, Samir Journal: Technique et Science Informatiques (TSI) Volume: 27 Publisher: Hermès Url: http://tsi.revuesonline.com/article.jsp?articleId=11706
- Un protocole de sauvegarde / reprise coordonné pour les applications à flot de données reconfigurables  (national)
EntryType: inproceedings Author: Xavier Besseron and Laurent Pigeon and Thierry Gautier and Samir Jafar Booktitle: Rencontres francophones du Parallélisme (RenPar'17) Address: Perpignan, France Month:
- Modèle de Coût Algorithmique Intégrant des Mécanismes de Tolérance aux Pannes et Expérimentations  (national)
EntryType: inproceedings Author: Jafar, Samir and Gautier, Thierry and Roch, Jean-Louis Booktitle: Proceedings des 16emes rencontres francophones du parallélisme (RenPar'16) Address: Le Croisic, France Pages: 125-136 Month: April Day: 6--8
- CCK: An Improved Coordinated Checkpoint/Rollback Protocol for Dataflow Applications in KAAPI  (international)
EntryType: conference Author: X. Besseron and S. Jafar and T. Gautier and J.-L. Roch Booktitle: ICTTA'06 IEEE Conference on Information and Communication Technologies: from Theory to Applications Address: Damascus, Syria Month: Editor: IEEE Pages: 3353--3358 Pdf: http://ieeexplore.ieee.org/iel5/11100/35471/01684955.pdf?tp=&arnumber=1684955&isnumber=35471 Url: http://ictta.enst-bretagne.fr/
- A Checkpoint/Recovery Model for Heterogeneous Dataflow Computations Using Work-Stealing  (international)
EntryType: inproceedings Author: Jafar, Samir and Gautier, Thierry and Krings, Axel W. and Roch, Jean-Louis Address: Lisboa, Portugal Booktitle: EUROPAR'2005 Editor: Springer-Verlag, LNCS Month: August
- Theft-Induced Checkpointing for Reconfigurable Dataflow Applications  (international)
EntryType: inproceedings Author: Jafar, Samir and Krings, Axel W. and Gautier, Thierry and Roch, Jean-Louis Note: This paper received the EIT'05 Best Paper Award Address: Lincoln, Nebraska Booktitle: IEEE Electro/Information Technology Conference (EIT 2005) Editor: IEEE Month: May Day: 22--25