Grid5000:Home

Grid'5000

Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.

Key features:

  • provides access to a large amount of resources: 15000 cores and 800 compute nodes grouped in homogeneous clusters, featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path
  • highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
  • advanced monitoring and measurement features for collecting traces of network traffic and power consumption, providing a deep understanding of experiments
  • designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
  • a vibrant community of 500+ users supported by a solid technical team


Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.

Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)


Recently published documents and presentations:

Older documents:


Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).


Current status (at 2021-03-02 17:22): 2 current events, 1 planned (details)


Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2143 overall):

  • Ji Liu, Noel Moreno Lemus, Esther Pacitti, Fábio Porto, Patrick Valduriez. Parallel Computation of PDFs on Big Spatial Data Using Spark. Distributed and Parallel Databases, Springer, 2020, 38, pp.63-100. 10.1007/s10619-019-07260-3. lirmm-02045144 view on HAL pdf
  • Zaineb Chelly Dagdia, Christine Zarges, Gaël Beck, Mustapha Lebbah. A scalable and effective rough set theory-based approach for big data pre-processing. Knowledge and Information Systems (KAIS), Springer, 2020, 10.1007/s10115-020-01467-y. hal-02880626 view on HAL pdf
  • Nouredine Melab, Jan Gmys, Mohand Mezmaz, Daniel Tuyttens. Many-core Branch-and-Bound for GPU accelerators and MIC coprocessors. T. Bartz-Beielstein; B. Filipic; P. Korosec; E-G. Talbi. High-Performance Simulation-Based Optimization, 833, Springer, pp.16, 2019, Studies in Computational Intelligence, ISBN 978-3-030-18763-7. hal-01924766 view on HAL pdf
  • Guillaume Briffoteaux, Romain Ragonnet, Mohand Mezmaz, Nouredine Melab, Daniel Tuyttens. Evolution Control for parallel ANN-assisted simulation-based optimization application to Tuberculosis Transmission Control. Future Generation Computer Systems, Elsevier, 2020, 113, pp.454-467. 10.1016/j.future.2020.07.005. hal-02904840 view on HAL pdf
  • Neil Ayeb, Eric Rutten, Sebastien Bolle, Thierry Coupaye, Marc Douet. Coordinated autonomic loops for target identification, load and error-aware Device Management for the IoT. FedCSIS 2020 - 15th Federated Conference on Computer Science and Information Systems, Sep 2020, Sofia, Bulgaria. pp.1-10. hal-02934785 view on HAL pdf


Latest news

Kwapi service is going to be stopped

Kwapi, the legacy energy consumption monitoring service on Grid'5000, is going to be retired. It will be stopped in two weeks.

We encourage you to adopt its successor: Kwollect

https://www.grid5000.fr/w/Monitoring_Using_Kwollect

-- Grid'5000 Team 08:30, February 10th 2021 (CET)

IBM POWER8 cluster "drac" with P100 GPUs fully available in Grenoble

After a few months of testing, we are happy to announce that the new cluster "drac" is now available in the default queue in Grenoble.

The cluster has 12 "Minsky" nodes from IBM, more precisely "Power Systems S822LC for HPC".

Each node has 2x10 POWER8 CPU cores, 4 Tesla P100 GPUs, and 128 GB of RAM. The GPUs are interconnected pair-wise with an NVLink fabric that is also connected to the CPUs (unlike the DGX-1 gemini nodes, for instance). The nodes are interconnected with a high-speed Infiniband network with 2x100 Gbit/s on each node.

Be aware that this cluster is using an exotic architecture, and as such it may be more difficult than usual to install software.

In particular, if you need to install deep learning frameworks on these nodes, make sure to read our documentation:

Deep_Learning_Frameworks#Deep_learning_on_ppc64_nodes

In addition, when reserving the nodes, you need to explicitly ask for an "exotic" job type. For instance, to obtain a single node:

grenoble> oarsub -I -t exotic -p "cluster='drac'"

Acknowledgment: This cluster was donated by GENCI (https://www.genci.fr/en). It was formerly known as the Ouessant platform of Genci's Cellule de Veille Technologique. Many thanks to them.

More information on the hardware is available at Grenoble:Hardware#drac

-- Grid'5000 Team 17:30, February 9th 2021 (CET)

Troll and Gemini clusters are now exotic resources (change in the way to reserve them)

The Troll cluster in Grenoble and the Gemini cluster in Lyon are now considered exotic resources, and must be reserved using the exotic OAR job type.

When a cluster on Grid'5000 has a hardware specificity that makes it too different from a "standard" configuration, it is reservable only using the exotic OAR job type. There are 2 reasons for this:

  • It ensures that your experiment won't run on potentially incompatible hardware unless you explicitly allow it (for example, you don't want to get an aarch64 cluster if your experiment is built for x86).
  • By not allocating these resources to jobs by default, it makes them more easily available for users who are looking specifically for this kind of hardware.

There is an example of how to use the exotic job type in the Getting Started tutorial: https://www.grid5000.fr/w/Getting_Started#Selecting_specific_resources

You can see whether a cluster is exotic in the reference API or on the Hardware page of the wiki: https://www.grid5000.fr/w/Hardware#Clusters

There are currently 4 clusters that require the exotic job type to be reserved:

  • pyxis because it has a non-x86 CPU architecture (aarch64)
  • drac because it has a non-x86 CPU architecture (ppc64)
  • troll because it has PMEM ( https://www.grid5000.fr/w/PMEM )
  • gemini because it has 8 V100 GPUs per node, and only 2 nodes
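
The reservation rule above can be sketched as a small shell helper. This is only an illustration, not part of OAR or Grid'5000: the function name build_oarsub_cmd and the cluster name "mycluster" are hypothetical, and the list of exotic clusters is copied from this announcement, so it may be out of date. The helper prints the oarsub command for one interactive node instead of running it, and adds -t exotic only when the requested cluster is in the list.

```shell
#!/bin/sh
# Hypothetical helper (not part of OAR or Grid'5000): print the oarsub
# command for one interactive node on a given cluster, adding "-t exotic"
# only for clusters that this announcement lists as exotic.

# Assumption: cluster names copied from the list above; may change over time.
EXOTIC_CLUSTERS="pyxis drac troll gemini"

build_oarsub_cmd() {
    cluster="$1"
    job_type=""
    for name in $EXOTIC_CLUSTERS; do
        if [ "$name" = "$cluster" ]; then
            job_type="-t exotic "
        fi
    done
    # Print rather than execute, so the command can be reviewed first.
    printf "oarsub -I %s-p \"cluster='%s'\"\n" "$job_type" "$cluster"
}

build_oarsub_cmd drac        # a cluster from the exotic list
build_oarsub_cmd mycluster   # placeholder name for a non-exotic cluster
```

For drac, the printed command matches the one given in the "drac" announcement. For any name outside the list, the -t exotic option is simply left out.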

-- Grid'5000 Team 09:40, January 26th 2021 (CET)

New Grid'5000 API documentation and specification

The Grid'5000 API documentation has been updated. Before this update, the documentation contained both the specification and tutorials of the API (with some parts also present in the wiki).

To be more consistent, https://api.grid5000.fr/doc/ now provides only the specification (HTTP paths, parameters, payload, …). All tutorials were updated and moved to the Grid'5000 wiki.

The new API specification can be viewed with two tools: the first lets you read the specification and look up information; the second lets you explore the API through a playground.

Please note that the specification may contain errors. Please report any such errors to the support staff.

-- Grid'5000 Team 14:30, January 11th 2021 (CET)


Read more news

Grid'5000 sites

Current funding

Since June 2008, Inria has been the main contributor to Grid'5000 funding.

INRIA


CNRS


Universities

Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon

Regional councils

Aquitaine
Auvergne-Rhône-Alpes
Bretagne
Champagne-Ardenne
Provence Alpes Côte d'Azur
Hauts de France
Lorraine