Grid5000:Home

Grid'5000

Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.

Key features:

  • provides access to a large amount of resources: 15000 cores, 800 compute-nodes grouped in homogeneous clusters, and featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path
  • highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
  • advanced monitoring and measurement features for collecting traces of network activity and power consumption, providing a deep understanding of experiments
  • designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
  • a vibrant community of 500+ users supported by a solid technical team


Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.

Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)



Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several universities, as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).


Current status (at 2025-05-22 04:57): 4 current events, 3 planned (details)


Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2758 overall):

  • Alexandre Sabbadin, Abdel Kader Chabi Sika Boni, Hassan Hassan, Khalil Drira. Optimizing network slice placement using Deep Reinforcement Learning (DRL) on a real platform operated by Open Source MANO (OSM). Tunisian-Algerian Conference on Applied Computing (TACC 2023), Nov 2023, Sousse, Tunisia. hal-04265140
  • Nadia Skifa, Fadil Boodoo, Carole Delenne, Renaud Hostache, Morgan Abily. Impact of training dataset size and its hydrometeorological typology on LSTM performance for rainfall-runoff modeling: a case study of the Severn river. SimHydro conference 2023, Société Hydrotechnique de France (SHF); the Association Française de Mécanique (AFM); the Environmental & Water Resources Institute (EWRI); the International Association for Hydro-Environment Engineering and Research (IAHR), Nov 2023, Chatou, France. pp.107-128, 10.1007/978-981-97-4076-5. hal-04375806
  • Mostafa Sadeghi, Romain Serizel. Posterior sampling algorithms for unsupervised speech enhancement with recurrent variational autoencoder. International Conference on Acoustics Speech and Signal Processing (ICASSP), IEEE, Apr 2024, Seoul (Korea), South Korea. 10.48550/arXiv.2309.10439. hal-04210679v2
  • Alaaeddine Chaoub. Deep learning representations for prognostics and health management. Computer Science [cs]. Université de Lorraine, 2024. English. NNT : 2024LORR0057. tel-04687618
  • Daniel Balouek. Performance-cost trade-offs in service orchestration for edge computing. SSDBM 2024 - 36th International Conference on Scientific and Statistical Database Management, Edge Computing; Resource Management; Computing Continuum; Trade-offs; Urgent Computing, Jul 2024, Rennes, France. pp.1-4, 10.1145/3676288.3676307. hal-04775133


Latest news

Cluster "estats" (Jetson nodes in Toulouse) is now kavlan capable

The network topology of the estats Jetson nodes can now be configured, just like for other clusters.

More info in the Network reconfiguration tutorial.

-- Grid'5000 Team 18:25, 21 May 2025 (CEST)

Cluster "chirop" is now in the default queue of Lille with energy monitoring

Dear users,

We are pleased to announce that the Chirop[1] cluster of Lille is now available in the default queue.

This cluster consists of 5 HPE DL360 Gen10+ nodes with:

  • 2 CPU Intel Xeon Platinum 8358 (32 cores per CPU)
  • 512 GiB memory
  • 1*1.92TB SSD NVME + 2*3.84TB SSD
  • 2*25 Gbps Ethernet interfaces

Energy monitoring[2] is also available for this cluster[3], provided by newly installed Wattmetres (similar to those already available at Lyon).

This cluster was funded by CPER CornelIA.

[1] https://www.grid5000.fr/w/Lille:Hardware#chirop

[2] https://www.grid5000.fr/w/Energy_consumption_monitoring_tutorial

[3] https://www.grid5000.fr/w/Monitoring_Using_Kwollect#Metrics_available_in_Grid.275000

-- Grid'5000 Team 16:25, 05 May 2025 (CEST)

Change of default queue based on platform

Until now, Abaca (production) users had to specify `-q production` when reserving Abaca resources with OAR.

This is no longer necessary, as your default queue is now automatically selected based on the platform your default group is associated with, as shown at https://api.grid5000.fr/explorer/selector/ and in the message displayed when connecting to a frontend.

For SLICES-FR users, there is no change since the correct queue was already selected by default.

Additionally, the "production" queue has been renamed to "abaca", although "production" will continue to work for the foreseeable future.

Please note one case where this change may affect your workflow:

When an Abaca user reserves a resource from SLICES-FR (a non-production resource), they must explicitly specify that they want to use the SLICES-FR queue, which is called "default", by adding `-q default` to the OAR command.
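The queue selection rule described above can be sketched in a few lines of Python. This is an illustration only, not part of OAR or of the Grid'5000 tooling: the function name `oarsub_queue_args` and the platform labels "abaca" and "slices-fr" are hypothetical. Only the `-q` option and the queue names "abaca" and "default" come from the announcement. The case of a SLICES-FR user reserving an Abaca resource is not covered by the announcement and is handled here by assumption.

```python
from typing import List

# Queue name for each platform, as given in the announcement.
# The platform keys are hypothetical labels chosen for this sketch.
QUEUE_BY_PLATFORM = {
    "abaca": "abaca",        # formerly "production", which still works
    "slices-fr": "default",
}


def oarsub_queue_args(user_platform: str, resource_platform: str) -> List[str]:
    """Return the extra `-q` arguments to add to the OAR command, if any.

    The user's default queue follows the platform of their default group,
    so an explicit `-q` is only needed when the resource belongs to the
    other platform. The SLICES-FR user reserving an Abaca resource case
    is an assumption, not something the announcement states.
    """
    default_queue = QUEUE_BY_PLATFORM[user_platform]
    target_queue = QUEUE_BY_PLATFORM[resource_platform]
    if target_queue == default_queue:
        return []  # the default queue is already the right one
    return ["-q", target_queue]
```

With this sketch, an Abaca user reserving an Abaca resource gets an empty list, while an Abaca user reserving a SLICES-FR resource gets `["-q", "default"]`.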

-- Abaca Grid'5000 Team 10:10, 31 March 2025 (CEST)

Cluster "musa" with Nvidia H100 GPUs is available in production queue

We are pleased to announce that a new cluster named "musa" is available in the production queue¹ of Abaca.

This cluster has been funded by Inria DSI as a shared computing resource.

It is accessible to all Abaca users. Users affiliated with Inria have access with the same level of priority, regardless of the research center to which they are attached.

This cluster is composed of six HPE Proliant DL385 Gen11 nodes² with 2 AMD EPYC 9254 24-core processors, 512 GiB of RAM, 2 x Nvidia H100 NVL (94 GiB) with NVLink, one 6 TB NVMe SSD and a 25 Gbps Ethernet connection.

Please note that in order to share it efficiently, walltime is limited:

  • 6 hours for the first two nodes
  • 24 hours for the next two
  • 48 hours for the last two

The cluster "musa" is located at Sophia, hosted in the datacenter of Inria Centre at Université Côte d’Azur.

¹: https://api.grid5000.fr/explorer/hardware/sophia/#musa

²: the nodes are named musa-1, musa-2, ..., musa-6
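As a rough illustration of the walltime limits above, the following Python sketch maps a node name to its maximum walltime. It assumes that "the first two nodes" are musa-1 and musa-2, "the next two" musa-3 and musa-4, and "the last two" musa-5 and musa-6; the announcement does not state this mapping explicitly. The names `MUSA_WALLTIME_LIMIT_HOURS` and `musa_max_walltime_hours` are hypothetical.

```python
# Maximum walltime in hours per node index.
# Assumption: the limits follow node numbering (musa-1 .. musa-6).
MUSA_WALLTIME_LIMIT_HOURS = {1: 6, 2: 6, 3: 24, 4: 24, 5: 48, 6: 48}


def musa_max_walltime_hours(node_name: str) -> int:
    """Return the walltime limit (in hours) for a node such as "musa-3"."""
    prefix, _, index = node_name.partition("-")
    if prefix != "musa" or not index.isdigit():
        raise ValueError(f"not a musa node name: {node_name!r}")
    node_index = int(index)
    if node_index not in MUSA_WALLTIME_LIMIT_HOURS:
        raise ValueError(f"unknown musa node: {node_name!r}")
    return MUSA_WALLTIME_LIMIT_HOURS[node_index]
```

Under these assumptions, a job needing more than 24 hours would have to target musa-5 or musa-6.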

-- Grid'5000 Team 13:30, 19 March 2025 (CEST)


Current funding

As of June 2008, Inria is the main contributor to Grid'5000 funding.

INRIA

CNRS

Universities

IMT Atlantique
Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon

Regional councils

Aquitaine
Auvergne-Rhône-Alpes
Bretagne
Champagne-Ardenne
Provence Alpes Côte d'Azur
Hauts de France
Lorraine