Grid5000:Home

Grid'5000

Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.

Key features:

  • provides access to a large amount of resources: 15,000 cores and 800 compute nodes grouped in homogeneous clusters, featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, InfiniBand, Omni-Path
  • highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
  • advanced monitoring and measurement features for trace collection of network and power consumption, providing a deep understanding of experiments
  • designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
  • a vibrant community of 500+ users supported by a solid technical team


Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.
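
To give a first idea of how the testbed is used, here is a minimal sketch of an interactive session (the username, site and walltime are placeholders; the Getting Started tutorial is the authoritative reference):

 # connect to the national access machine, then to a site frontend ('jdoe' is a placeholder username)
 ssh jdoe@access.grid5000.fr
 ssh grenoble
 # reserve one node interactively for one hour
 oarsub -I -l nodes=1,walltime=1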

Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)



Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).


Current status (as of 2022-10-05 18:22): 1 current event, 1 planned (details)


Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2383 overall):

  • Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist. There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. Neural Information Processing Systems (2021), Dec 2021, Virtual, France. hal-03454640 view on HAL pdf
  • Tulika Bose, Irina Illina, Dominique Fohr. Unsupervised Domain Adaptation in Cross-corpora Abusive Language Detection. SocialNLP 2021 - The 9th International Workshop on Natural Language Processing for Social Media, Jun 2021, Virtual, France. hal-03204605 view on HAL pdf
  • Maxime Gobert, Jan Gmys, Nouredine Melab, Daniel Tuyttens. Space Partitioning with multiple models for Parallel Bayesian Optimization. OLA 2021 - Optimization and Learning Algorithm, Jun 2021, Sicilia / Virtual, Italy. hal-03324642 view on HAL pdf
  • Pierre Augier, Carl Friedrich Bolz-Tereick, Serge Guelton, Ashwin Vishnu Mohanan. Reducing the ecological impact of computing through education and Python compilers. Nature Astronomy, Nature Publishing Group, 2021, 5 (4), pp.334-335. 10.1038/s41550-021-01342-y. hal-03432227 view on HAL pdf
  • Shakeel Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni. Robust Stuttering Detection via Multi-task and Adversarial Learning. EUSIPCO 2022 - 30th European Signal Processing Conference, Aug 2022, Belgrade, Serbia. hal-03629785 view on HAL pdf


Latest news

New documentation for the FPGA nodes and wattmeter update in Grenoble

Documentation on how to use the FPGA nodes (servan cluster in Grenoble) is now available on the FPGA page.

Furthermore, the Servan nodes' power supply units are now monitored by the Grenoble wattmeter, and the measurements are available in Kwollect.

Finally, due to unreliable measurements, Dahu is no longer monitored by the wattmeter. Past problems with Yeti and Troll should now be fixed.
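
As an illustration, power measurements can be retrieved from Kwollect through the Grid'5000 REST API; the sketch below assumes the wattmeter metric is named wattmetre_power_watt and uses an arbitrary time window (check the Kwollect documentation for the authoritative metric names and parameters):

 # query power measurements for servan-1 over a 10-minute window (metric name is an assumption)
 curl 'https://api.grid5000.fr/stable/sites/grenoble/metrics?nodes=servan-1&metrics=wattmetre_power_watt&start_time=2022-09-15T14:00:00&end_time=2022-09-15T14:10:00'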

-- Grid'5000 Team 14:20, Sept 15th 2022 (CEST)

New cluster “servan” available in Grenoble, featuring FPGAs

We are pleased to announce a new cluster named “servan” available in Grenoble. It is composed of 2 Dell R7525 nodes, each equipped with two AMD EPYC 7352 24-core CPUs, 128GB of DDR4 RAM, two 1.6TB NVMe SSDs, and 25Gbps Ethernet connectivity.

Additionally, each node features a Xilinx Alveo U200¹ FPGA with two 100Gbps Ethernet ports connected to the Grid'5000 network.

For now, the Grid'5000 technical team provides no support for exploiting these FPGAs, but please let us know if you plan to use them.

This cluster has been funded by CNRS/INS2I to support the Slices project.

Warning: debian11-xen² is currently unavailable on servan due to a problem with the network cards.

¹ https://www.xilinx.com/products/boards-and-kits/alveo/u200.html

² https://intranet.grid5000.fr/bugzilla/show_bug.cgi?id=13923

-- Grid'5000 Team 14:40, June 30th 2022 (CEST)

Shorter command usage for Kadeploy tools

We changed the syntax of the Kadeploy commands. A typical command line for deployment was:

 kadeploy3 -f $OAR_NODE_FILE -e debian11-x64-base -k
  • The -e flag can now be omitted when deploying a recorded environment; the -a flag is still required for anonymous deployments.
  • The public key is copied by default, so -k can be omitted.
  • Nodes to be deployed are taken from $OAR_NODE_FILE if not specified on the command line with -f or -m.

As a result, the same deployment can be achieved with the following shorter command line:

 kadeploy3 debian11-x64-base

Moreover, the architecture is removed from the Grid'5000 reference environment names (e.g. "debian11-x64-base" and "debian11-ppc64-base" both become "debian11-base"). Kadeploy infers the correct environment (CPU architecture) for the cluster being deployed. As a result, the previous kadeploy command line can be shortened further to:

 kadeploy3 debian11-base

The old syntax and environment names including the architecture are still supported for backward compatibility.
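
For reference, here is a minimal end-to-end sketch of a deployment using the new syntax, assuming an interactive deployment job (the reservation parameters are only an illustration):

 # reserve two nodes in deployment mode (interactive job, 1 hour)
 oarsub -I -t deploy -l nodes=2,walltime=1
 # deploy the reference environment; nodes are taken from $OAR_NODE_FILE
 kadeploy3 debian11-base
 # connect as root to the first deployed node
 ssh root@$(head -1 $OAR_NODE_FILE)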

You can find more information on the Getting Started and Advanced Kadeploy pages.

-- Grid'5000 Team 10:50, Wed, May 19th 2022 (CEST)

Cluster "sirius" from Nvidia is available in the default queue

We are pleased to announce that a new cluster named "sirius" is available in Lyon.

This cluster consists of a single Nvidia DGX A100 node with 2 AMD EPYC 7742 CPUs (64 cores per CPU), 8 Nvidia A100 (40 GiB) GPUs, 1TB of DDR4 RAM, 2x1.92TB SSD + 4x3.84TB SSD disks, and an InfiniBand network (coming soon).

Energy monitoring is available for this cluster, provided by the same wattmeter devices as used for the other clusters in Lyon.

This cluster is tagged as "exotic", so the `-t exotic` option must be provided to oarsub to select sirius.
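
For example, a minimal sketch of an interactive reservation of sirius (the walltime is only an illustration):

 # reserve the sirius node interactively; the "exotic" job type is required
 oarsub -I -t exotic -p "cluster='sirius'" -l walltime=2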

This machine has been funded by the LIP laboratory with the support of Inria Grenoble Rhône-Alpes and ENS Lyon.

-- Grid'5000 Team 11:30, Wed, May 11th 2022 (CEST)


Read more news

Grid'5000 sites

Current funding

Since June 2008, Inria has been the main contributor to Grid'5000 funding.

INRIA


CNRS


Universities

Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon

Regional councils

Aquitaine
Auvergne-Rhône-Alpes
Bretagne
Champagne-Ardenne
Provence Alpes Côte d'Azur
Hauts de France
Lorraine