Grid'5000 experiment

KAAPI-Fault tolerance (Middleware)

Conducted by

Liyun Guelton, Serge Guelton

Description

I use Grid'5000 to test the fault tolerance (FT) system of the KAAPI project. The objective is to execute KAAPI jobs successfully on many nodes even when some processes are killed or fail during execution; the system is able to restart the failed processes. I also use Grid'5000 to test parallel programs.
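
The restart behaviour described above can be illustrated with a minimal sketch. This is not KAAPI's mechanism, which this page does not detail; it only shows the general idea of relaunching a process that was killed or exited with an error. The function name `run_with_restart` and the parameter `max_restarts` are hypothetical and chosen for illustration.

```python
import subprocess
import sys


def run_with_restart(cmd, max_restarts=3):
    """Run cmd; if it exits abnormally, launch it again.

    Illustrative only: a local supervisor loop, not KAAPI's FT protocol.
    Returns (last_exit_code, restarts_used).
    """
    restarts = 0
    while True:
        # subprocess.call blocks until the child exits and returns its exit
        # code; on POSIX a negative value means it was killed by a signal.
        code = subprocess.call(cmd)
        if code == 0 or restarts >= max_restarts:
            return code, restarts
        restarts += 1


if __name__ == "__main__":
    # Hypothetical usage: supervise a trivial child that succeeds at once.
    code, restarts = run_with_restart([sys.executable, "-c", "pass"])
    print("exit code:", code, "restarts:", restarts)
```

A real fault-tolerant runtime would also need to recover the state of the failed process (for example from a checkpoint) rather than simply starting it again from scratch; the sketch leaves that part out.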

Status

In progress

Resources

  • Nodes involved: 1000
  • Sites involved: >3
  • Minimum walltime: >1d
  • Batch mode: no
  • Use kadeploy: yes
  • CPU bound: no
  • Memory bound: no
  • Storage bound: no
  • Network bound: no
  • Interlink bound: no

Tools used

taktuk, karun, oarsub, oarsh, oardel...

Results

No results yet.

Shared by: Liyun Guelton, Serge Guelton
Last update: 2007-09-27 16:01:03
Experiment #416
