Grid5000:Software: Difference between revisions

From Grid5000
|purpose=OAR is a resource manager (or batch scheduler) for large clusters. It allows cluster users to submit or reserve nodes either in an interactive or a batch mode.
|logo=[[Image:Logo_oar.png|150px|right]]
|contacts=[mailto:olivier_DOT_richard_AT_imag_DOT_fr Olivier Richard], [mailto:pierre_DOT_neyron_AT_imag_DOT_fr Pierre Neyron]
|status=Production/Stable
|homepage=http://oar.imag.fr

Revision as of 11:24, 7 February 2015


Software mainly developed in Grid'5000 and available for its users.

System Software

Grid5000 Team

OAR2

OAR is a resource manager (or batch scheduler) for large clusters. It allows cluster users to submit jobs or reserve nodes in either interactive or batch mode.

Kadeploy 3

Kadeploy is a fast and scalable deployment system for cluster and grid computing. It provides a set of tools for cloning, configuring (post-installation), and managing a set of nodes. It currently deploys Linux, *BSD, Windows, and Solaris successfully on x86 and 64-bit computers.

KaVLAN

KaVLAN is a VLAN manipulation tool for the network isolation of experiments.

Grid5000 Community

TakTuk 3

TakTuk is a tool for deploying parallel remote executions of commands to a potentially large set of remote nodes. It spreads itself using an adaptive algorithm and sets up an interconnection network to transport commands and perform I/O multiplexing/demultiplexing. The TakTuk mechanics adapt dynamically to the environment (machine performance and current load, network contention) by using a reactive work-stealing algorithm that mixes local parallelization and work distribution.
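
The work-stealing idea mentioned above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not TakTuk's actual algorithm: the function name run_with_work_stealing, the round-based scheduling, and the "steal half from the busiest worker" policy are all hypothetical choices made for illustration. Each task stands for one remote host that still has to be reached.

```python
from collections import deque


def run_with_work_stealing(tasks, n_workers):
    """Round-based simulation of work stealing (illustrative only).

    All tasks start on worker 0. In each round, an idle worker steals
    half of the queue of the most loaded worker, then every worker that
    has work executes one task. Returns the tasks executed per worker.
    """
    queues = [deque() for _ in range(n_workers)]
    queues[0].extend(tasks)
    done = [[] for _ in range(n_workers)]

    while any(queues):  # an empty deque is falsy
        for w in range(n_workers):
            if not queues[w]:
                # Idle worker: pick the most loaded worker as the victim.
                victim = max(range(n_workers), key=lambda v: len(queues[v]))
                if len(queues[victim]) > 1:
                    # Steal half of the victim's tasks from the tail,
                    # leaving at least one task for the victim.
                    for _ in range(len(queues[victim]) // 2):
                        queues[w].append(queues[victim].pop())
            if queues[w]:
                # Execute one local task from the head of the queue.
                done[w].append(queues[w].popleft())
    return done


if __name__ == "__main__":
    hosts = [f"node-{i}" for i in range(8)]  # hypothetical host names
    for worker, executed in enumerate(run_with_work_stealing(hosts, 3)):
        print(worker, executed)
```

The point of the policy is that no central coordinator has to decide the distribution in advance: a worker that finishes early simply takes work from one that is still loaded, which is how such a scheme can react to uneven machine performance.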

Katapult 3

Katapult is a small, well-tested script to automatically start experiments using deployments. Most experiments start by deploying the nodes, re-deploying the nodes if too many of them failed, copying the user's SSH key to the node, etc. Katapult automates all those tasks. This tool is available on most clusters under the name katapult3 and is compliant with kadeploy3.
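
The deploy-and-retry loop described above can be sketched as follows. This is a minimal illustration and not Katapult's code: deploy_with_retries, its parameters, and the deploy callback are hypothetical names, and the callback stands in for whatever actually performs a deployment and reports which nodes succeeded.

```python
def deploy_with_retries(nodes, deploy, min_ok, max_attempts=3):
    """Redeploy failed nodes until enough of them are up.

    nodes        -- iterable of node names
    deploy       -- hypothetical callback: takes a list of nodes and
                    returns the nodes that were deployed successfully
    min_ok       -- number of deployed nodes required to proceed
    max_attempts -- give up after this many deployment rounds
    Returns (set of deployed nodes, number of attempts used).
    """
    nodes = set(nodes)
    deployed = set()
    for attempt in range(1, max_attempts + 1):
        # Only the nodes that are not yet deployed are retried.
        pending = sorted(nodes - deployed)
        deployed |= set(deploy(pending))
        if len(deployed) >= min_ok:
            return deployed, attempt
    raise RuntimeError(
        f"only {len(deployed)} of {len(nodes)} nodes deployed "
        f"after {max_attempts} attempts"
    )


if __name__ == "__main__":
    # Fake deployment for demonstration: the first round loses one node.
    rounds = []

    def fake_deploy(pending):
        rounds.append(pending)
        return pending[:-1] if len(rounds) == 1 else pending

    print(deploy_with_retries(["n1", "n2", "n3"], fake_deploy, min_ok=3))
```

Follow-up steps such as copying the user's SSH key would run on the returned set of nodes once the threshold is met.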

Adage

ADAGE is an automatic deployment tool for applications in a grid environment. It targets dynamic applications by providing a large set of services to deal with resources, such as information services, resource reservation/allocation, file transfer, and job launching and monitoring.

ADAGE is internally based on a generic application description model (GADe), so that it can support any kind of programming model as well as applications that combine several programming models.

Experiment Tools

Wrekavoc

The goal of Wrekavoc is to define and control the heterogeneity of a given platform by degrading the CPU, network, or memory capabilities of each node composing this platform. The degradation is done remotely, without restarting the hardware. The control is fine-grained, reproducible, and independent (one may degrade the CPU without modifying the network bandwidth).

CoRDAGe

Co-deployment and Re-deployment of Generic Applications.

NXE

Network eXperiment Engine is a tool written in Python to automate networking experiments in real testbed environments. It makes it simple to script experiments involving hundreds of nodes. The scenarios are described through XML files that provide a simple and hierarchical description of the topology, the general configuration, and the interactions between the end-hosts.


Execo

Execo is an experiment toolkit. It offers a Python API for local or remote, standalone or parallel, process execution. It is especially well suited for quickly scripting workflows of parallel/distributed operations on local or remote hosts: conducting experiments, performing automated tests, etc. It includes an API for dealing with OAR, OARGrid, Kadeploy, and the Grid'5000 API.
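
To give an idea of what "parallel process execution" looks like in a script, here is a minimal sketch that uses only the Python standard library. It deliberately does not use the Execo API: run_all and its parameters are hypothetical names, and the example only starts local processes. Running on remote hosts would require wrapping each command in a remote shell, which is the kind of plumbing a toolkit like Execo is meant to hide.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, max_parallel=4):
    """Run several local commands in parallel.

    commands     -- list of argv lists, one per process
    max_parallel -- maximum number of processes running at once
    Returns a list of (return code, stripped stdout), in input order.
    """
    def run_one(argv):
        proc = subprocess.run(argv, capture_output=True, text=True)
        return proc.returncode, proc.stdout.strip()

    # Threads are sufficient here: each one just waits on a child process.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Use the current Python interpreter so the demo is portable.
    cmds = [[sys.executable, "-c", f"print({i} * 2)"] for i in range(3)]
    for code, out in run_all(cmds):
        print(code, out)
```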

Development Environments / Middleware

Kaapi

KAAPI means Kernel for Adaptative, Asynchronous Parallel and Interactive programming. It is a C++ library for executing multithreaded computations with data flow synchronization between threads. The library is able to schedule fine- and medium-grain programs on distributed machines. The data flow graph is dynamic (unfolded at runtime). Target architectures are clusters of SMP machines.
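
Data flow synchronization means that a task becomes runnable as soon as the values it reads have been produced. The sketch below illustrates that rule in Python; it is not the Kaapi API (Kaapi is a C++ library), and it makes simplifying assumptions: run_dataflow is a hypothetical name, the graph is static rather than unfolded at runtime, and ready tasks are executed one after another instead of being scheduled on several threads.

```python
def run_dataflow(tasks):
    """Execute tasks in an order dictated by their data dependencies.

    tasks -- dict mapping a task name to (function, [input task names]);
             the function receives the results of its inputs as arguments
    Returns a dict mapping each task name to its result.
    """
    results = {}
    pending = dict(tasks)
    while pending:
        # A task is ready once every value it reads has been produced.
        ready = [
            name
            for name, (_, inputs) in pending.items()
            if all(dep in results for dep in inputs)
        ]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        # Tasks in `ready` are independent of each other, so a real
        # runtime could execute them in parallel.
        for name in ready:
            func, inputs = pending.pop(name)
            results[name] = func(*(results[dep] for dep in inputs))
    return results


if __name__ == "__main__":
    graph = {
        "a": (lambda: 2, []),
        "b": (lambda: 3, []),
        "sum": (lambda x, y: x + y, ["a", "b"]),
        "square": (lambda s: s * s, ["sum"]),
    }
    print(run_dataflow(graph))
```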

DIET

DIET means Distributed Interactive Engineering Toolbox. It is a C/C++ grid middleware based on the GridRPC paradigm. DIET provides many mechanisms that simplify the use of a grid: adaptable job scheduling, performance prediction, data management (replication, persistency...), workflow management (both dataflow workflows and workflows with conditional structures and loops), transparent job submission to batch schedulers (OAR, PBS, SGE, LoadLeveler...), and transparent submission to cloud systems (Eucalyptus, Amazon EC2...).

Marcel

Marcel is a POSIX-compliant thread library featuring a programmable scheduler designed for hierarchical multiprocessor architectures.

Mad-MPI

Mad-MPI is an efficient implementation of MPI for fast networks.

MPICH-Madeleine

MPICH-Madeleine is an MPI implementation for clusters and clusters of clusters with heterogeneous networks.

NewMadeleine

The NewMadeleine communication library provides extended capabilities for dynamic communication optimization on top of high-performance networks.


Grid'5000 users, please send an email to web-staff if you would like to see new software added to this page.