Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.
- provides access to a large amount of resources: 15000 cores, 800 compute-nodes grouped in homogeneous clusters, and featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path
- highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
- advanced monitoring and measurement features for collecting network and power consumption traces, providing a deep understanding of experiments
- designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
- a vibrant community of 500+ users supported by a solid technical team
Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.
Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)
Recently published documents and presentations:
Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).
Current status (at 2021-12-08 18:07): 1 current event, 2 planned (details)
Random pick of publications
Five random publications that benefited from Grid'5000 (at least 2177 overall):
- Jean Luca Bez, Alberto Miranda, Ramon Nou, Francieli Zanon Boito, Toni Cortes, et al. Arbitration Policies for On-Demand User-Level I/O Forwarding on HPC Platforms. IPDPS 2021 - 35th IEEE International Parallel and Distributed Processing Symposium, May 2021, Portland, Oregon / Virtual, United States. hal-03149582
- Adrien Faure, Giorgio Lucarelli, Olivier Richard, Denis Trystram. Online Scheduling with Redirection for Parallel Jobs. IPDPSW 2020 - IEEE International Parallel and Distributed Processing Symposium Workshops, May 2020, New Orleans, United States. pp.1-4, 10.1109/IPDPSW50202.2020.00066. hal-02944032
- Sara Dahmani, Vincent Colotte, Slim Ouni. Étude comparative des paramètres d'entrée pour la synthèse expressive audiovisuelle de la parole par DNNs. 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d'Études sur la Parole, Jun 2020, Nancy, France. pp.127-135. hal-02798526v3
- Anne-Cécile Orgerie. From Understanding to Greening the Energy Consumption of Distributed Systems. Networking and Internet Architecture [cs.NI]. Ecole Normale Supérieure de Rennes, 2020. tel-03158492
- Simon Delamare, Lucas Nussbaum. Kwollect: Metrics Collection for Experiments at Scale. CNERT 2021 - Workshop on Computer and Networking Experimental Research using Testbeds, May 2021, Virtual, United States. pp.1-6. hal-03236421
Debian 11 "Bullseye" std environment is now the default environment on nodes
We are pleased to announce that Debian 11 (Bullseye) std environment (debian11-std) is now the default environment on nodes. See `kaenv3 -l debian11%` for an exhaustive list of available variants.
New features and changes in Debian 11 are described at https://www.debian.org/releases/bullseye/amd64/release-notes/ch-whats-new.en.html
In particular, it includes many software updates:
- Cuda 11.2.2 / Nvidia Drivers 460.91.03
- OpenJDK 17
- Python 3.9.2
- python3-numpy 1.19.5
- python3-scipy 1.6.0
- python3-pandas 1.1.5
- Perl 5.32.1
- GCC 9.3 and 10.2
- G++ 10.2
- Libboost 1.74.0
- Ruby 2.7.4
- CMake 3.18.4
- GFortran 10.2.1
- Liblapack 3.9.0
- libatlas 3.10.3
- RDMA 33.2
- OpenMPI 4.1.0
Some additional important changes have to be noted:
- Python 2 is not included, as it has been deprecated since January 2020, and `/usr/bin/python` is symlinked to `/usr/bin/python3`. Note that Python 2 packages are still available in the repositories and that you can install the 'python-is-python2' package to change the symlink.
- The OpenMPI backend used for Infiniband and Omnipath networks is UCX (this is similar to mainstream OpenMPI, but differs from the default behaviour in Debian 11).
- The libfabric package has been recompiled to disable a buggy EFA provider (AWS)¹.
- The Apache MXNet deep learning framework is no longer officially supported by the team, nor is its documentation.
- As Ganglia monitoring has been replaced by Kwollect², Ganglia services have been removed from environments.
- The BeeGFS client is not operational³, which impacts the grcinq and grvingt clusters. As a workaround, users can deplo...
Kadeploy: machine architecture added to environment descriptions
Kadeploy environments now require the architecture information in a new field named “arch” in the environment descriptions.
Current valid architectures on Grid’5000 are: x86_64, ppc64le (drac cluster in Grenoble), aarch64 (pyxis cluster in Lyon).
We have already added the architecture information in every recorded environment (kaenv database). If you deploy from an environment description (-a option of kadeploy), you will need to add this new field yourself. Please contact email@example.com in case of issues.
If multiple environments share the same name but have different architectures, Kadeploy will automatically select the appropriate architecture for a given cluster.
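The selection rule described above can be sketched as follows. This is only an illustrative model, not Kadeploy's actual code: `CLUSTER_ARCH`, `Environment` and `select_environment` are hypothetical names, and the cluster-to-architecture table covers only the clusters named in this announcement, assuming every other cluster is x86_64.

```python
from dataclasses import dataclass

# Architectures of the clusters named in this announcement. Treating every
# other cluster as x86_64 is an assumption of this sketch.
CLUSTER_ARCH = {"drac": "ppc64le", "pyxis": "aarch64"}
DEFAULT_ARCH = "x86_64"


@dataclass(frozen=True)
class Environment:
    """Hypothetical, minimal stand-in for an environment description."""
    name: str
    arch: str


def select_environment(envs, name, cluster):
    """Return the environment called `name` whose arch matches `cluster`."""
    wanted = CLUSTER_ARCH.get(cluster, DEFAULT_ARCH)
    for env in envs:
        if env.name == name and env.arch == wanted:
            return env
    raise LookupError(f"no environment {name!r} for architecture {wanted}")


# Three hypothetical environments that share one name.
ENVS = [
    Environment("debian11-big", "x86_64"),
    Environment("debian11-big", "ppc64le"),
    Environment("debian11-big", "aarch64"),
]

if __name__ == "__main__":
    for cluster in ("drac", "pyxis"):
        env = select_environment(ENVS, "debian11-big", cluster)
        print(cluster, "->", env.arch)
```

The point of the sketch is that the user keeps asking for a single environment name, and the target cluster alone determines which architecture variant is picked.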
Taking advantage of this new feature, the architecture code names will soon be removed from the names of the Grid’5000 reference environments. For instance, debian11-x64-big will become debian11-big: “x64” disappears but “x86_64” is set in the environment architecture tag.
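For scripts that still carry old-style names, the renaming above could be mirrored with a small helper such as the one below. It is a sketch under stated assumptions: `split_arch_code_name` and `ARCH_CODE_NAMES` are hypothetical names, and only the “x64” code name is taken from this announcement.

```python
# Code name -> architecture tag. Only the "x64" entry comes from the
# announcement; any other code name would have to be added by the user.
ARCH_CODE_NAMES = {"x64": "x86_64"}


def split_arch_code_name(env_name):
    """Return (new_name, arch) with the architecture code name removed.

    Names that contain no known code name are returned unchanged,
    with arch set to None.
    """
    parts = env_name.split("-")
    for i, part in enumerate(parts):
        if part in ARCH_CODE_NAMES:
            new_name = "-".join(parts[:i] + parts[i + 1:])
            return new_name, ARCH_CODE_NAMES[part]
    return env_name, None


if __name__ == "__main__":
    print(split_arch_code_name("debian11-x64-big"))
```

Names that are already in the new form pass through untouched, so the helper can be applied to a mixed list of old and new names.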
-- Grid'5000 Team 11:38, November 18th 2021 (CET)
Cluster "neowise" from AMD is available in the default queue
We are pleased to announce that the "neowise" cluster is now available at Lyon in the default queue. It features 10 nodes, each including an AMD EPYC 7642 CPU with 48 cores, 8 AMD Radeon Instinct MI50 GPUs (32 GB), 512 GB of RAM and an Infiniband network.
As a reminder, this cluster is tagged as "exotic", so the `-t exotic` option must be provided to oarsub to select neowise.
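As a minimal sketch, a script could assemble such a reservation command as shown below. The helper `oarsub_exotic_command` is hypothetical; the `-p "cluster='neowise'"` property filter and the `-I` (interactive) flag are standard oarsub options but are not part of this announcement, so treat them as assumptions about how you want to submit the job.

```python
def oarsub_exotic_command(cluster, interactive=True):
    """Build an oarsub argument list that selects an "exotic" cluster.

    `-t exotic` is the option required by the announcement; the property
    filter and `-I` are additions chosen for this sketch.
    """
    args = ["oarsub", "-t", "exotic", "-p", f"cluster='{cluster}'"]
    if interactive:
        args.append("-I")
    return args


if __name__ == "__main__":
    # Print the arguments instead of running them, so the sketch also
    # works on a machine where oarsub is not installed.
    print(oarsub_exotic_command("neowise"))
```

On a Grid'5000 frontend, the returned list could be passed to `subprocess.run` to actually submit the job.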
The cluster has been donated by AMD to Genci and Inria, in particular to support research against COVID-19. We would like to thank AMD for this donation and Genci for their successful collaboration in making this machine available in Grid'5000.
-- Grid'5000 Team 13:20, November 9th 2021 (CET)
Read more news
Since June 2008, Inria has been the main contributor to Grid'5000 funding.
Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon
Provence Alpes Côte d'Azur
Hauts de France