Grid'5000 experiment


KAAPI-Fault tolerance (Middleware)

Conducted by

Liyun Guelton, Serge Guelton


I use Grid'5000 to test the fault tolerance (FT) system of the KAAPI project. The objective is to run KAAPI jobs to completion on many nodes even when some processes are killed or fail during execution; the system is able to restart the failed processes. I also use Grid'5000 to test parallel programs.
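The restart-on-failure idea can be illustrated with a small, generic sketch. This is not KAAPI's fault tolerance mechanism and uses no KAAPI or Grid'5000 API; `run_with_restart` and `flaky_worker` are hypothetical names introduced only for illustration. The sketch assumes a failure is visible as a non-zero exit code of a local child process, and it simply relaunches the process up to a fixed number of times.

```python
import subprocess
import sys


def run_with_restart(cmd_for_attempt, max_restarts=3):
    """Run a command and relaunch it if it fails.

    cmd_for_attempt(attempt) returns the argv list for the given attempt
    number (starting at 1). Returns (last_exit_code, attempts_used).
    """
    attempts = 0
    while True:
        attempts += 1
        proc = subprocess.run(cmd_for_attempt(attempts))
        # Stop on success, or once the restart budget is exhausted.
        if proc.returncode == 0 or attempts > max_restarts:
            return proc.returncode, attempts


def flaky_worker(attempt):
    # Hypothetical worker: exits with an error on the first two attempts
    # to simulate a killed or failed process, then succeeds.
    code = 1 if attempt < 3 else 0
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]


if __name__ == "__main__":
    exit_code, attempts = run_with_restart(flaky_worker)
    print(f"exit code {exit_code} after {attempts} attempt(s)")
```

A real fault-tolerant runtime would also have to detect failures on remote nodes and recover the lost computation state; the sketch covers only the detect-and-relaunch loop.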


in progress


  • Nodes involved: 1000
  • Sites involved: >3
  • Minimum walltime: >1d
  • Batch mode: no
  • Use kadeploy: yes
  • CPU bound: no
  • Memory bound: no
  • Storage bound: no
  • Network bound: no
  • Interlink bound: no

Tools used

taktuk, karun, oarsub, oarsh, oardel...


Not yet

Shared by: Liyun Guelton, Serge Guelton
Last update: 2007-09-27 16:01:03
Experiment #416