Grid5000:Home

Revision as of 14:30, 12 February 2021 by Bjonglez (talk | contribs) (Fix news link)

Grid'5000

Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing including Cloud, HPC and Big Data and AI.

Key features:

  • provides access to a large amount of resources: 15,000 cores and 800 compute nodes grouped in homogeneous clusters, featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, InfiniBand, Omni-Path
  • highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
  • advanced monitoring and measurement features for collecting traces of network traffic and power consumption, providing a deep understanding of experiments
  • designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
  • a vibrant community of 500+ users supported by a solid technical team


Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.

Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)




Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).


Current status (at 2021-10-23 05:05): 3 current events, 2 planned


Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2178 overall):

  • Duong Nguyen, Matthieu Simonin, Guillaume Hajduch, Rodolphe Vadaine, Cédric Tedeschi, et al. Detection of Abnormal Vessel Behaviors from AIS data using GeoTrackNet: from the Laboratory to the Ocean. MBDW 2020: 2nd Maritime Big Data Workshop, part of MDM 2020: 21st IEEE International Conference on Mobile Data Management, Jun 2020, Versailles, France. pp. 264-268, 10.1109/MDM48529.2020.00061. hal-02523279v2
  • Lars-Dominik Braun, Ludovic Courtès, Pjotr Prins, Simon Tournier, Ricardo Wurmus. Guix-HPC Activity Report 2019–2020. Technical Report, Inria Bordeaux Sud-Ouest; ZPID. 2020. hal-03136575
  • Gabrielle de Micheli, Rémi Piau, Cécile Pierrot. A Tale of Three Signatures: Practical Attack of ECDSA with wNAF. AFRICACRYPT 2020, Jul 2020, Cairo, Egypt. pp. 361-381, 10.1007/978-3-030-51938-4_18. hal-02393302v2
  • José Jurandir Alves Esteves, Amina Boubendir, Fabrice Guillemin, Pierre Sens. Location-based Data Model for Optimized Network Slice Placement. NetSoft 2020: 6th IEEE International Conference on Network Softwarization, Jun 2020, Ghent / Virtual, Belgium. pp. 404-412, 10.1109/NetSoft48620.2020.9165427. hal-02981095
  • Alessio Pagliari, Fabrice Huet, Guillaume Urvoy-Keller. Towards a High-Level Description for Generating Stream Processing Benchmark Applications. BPOD 2019: Third IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications, Dec 2019, Los Angeles, United States. hal-02371215


Latest news

Debian 11 "Bullseye" environments are ready for deployments

We are pleased to announce that Debian 11 (Bullseye) environments are now the default environments supported for deployments in Grid'5000. See `kaenv3 -l debian11%` for an exhaustive list of available variants.
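As a sketch, a typical session could look like the following (these commands only run from a Grid'5000 frontend; the environment name `debian11-min` and the reservation parameters are illustrative, not prescriptive):

```
# List the available Debian 11 environment variants (command from the
# announcement above).
kaenv3 -l debian11%

# Reserve one node in deploy mode for two hours (illustrative parameters),
# then deploy a Debian 11 environment on it and copy your SSH key.
oarsub -I -t deploy -l nodes=1,walltime=2
kadeploy3 -e debian11-min -f "$OAR_NODE_FILE" -k
```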

The default environment available on nodes will remain the same (debian10-std) for some time to come (see below).

New features and changes in Debian 11 are described in the release notes: https://www.debian.org/releases/bullseye/amd64/release-notes/ch-whats-new.en.html

In particular, it includes many software updates:

  • Cuda 11.2.2 / Nvidia Drivers 460.91.03
  • OpenJDK 17
  • Python 3.9.2
    • python3-numpy 1.19.5
    • python3-scipy 1.6.0
    • python3-pandas 1.1.5
  • Perl 5.32.1
  • GCC 9.3 and 10.2
  • G++ 10.2
  • Libboost 1.74.0
  • Ruby 2.7.4
  • CMake 3.18.4
  • GFortran 10.2.1
  • Liblapack 3.9.0
  • libatlas 3.10.3
  • RDMA 33.2
  • OpenMPI 4.1.0

Some additional important changes have to be noted:

  • Python 2 is not included, as it has been deprecated since January 2020, and `/usr/bin/python` is symlinked to `/usr/bin/python3`. Note that Python 2 packages are still available in the repositories and that you can install the 'python-is-python2' package to change the symlink.
  • OpenMPI
    • The OpenMPI backend used for InfiniBand and Omni-Path networks is UCX (this matches mainstream OpenMPI, but differs from the default behaviour in Debian 11).
    • The libfabric package has been recompiled to disable a buggy EFA provider (AWS)¹.
  • As Ganglia monitoring has been replaced by Kwollect², Ganglia services have been removed from environments.
  • The BeeGFS client is not operational³, which impacts the grcinq and grvingt clusters. As a workar...

Reorganisation of Taurus and Orion clusters' hard disks

Dear Grid'5000 users,

Due to an increasing number of malfunctioning disks, the hardware of the Taurus and Orion clusters in Lyon is going to change.

Currently, nodes in these clusters use 2 physical disks to provide a virtual disk space of 558GB (using RAID-0). From Monday, October 18th, only a single disk will be kept: nodes' disk space will be reduced to 279GB, and I/O performance will also be degraded.

However, thanks to this operation, more Taurus and Orion nodes will be available.

The operation is planned for Monday, October 18th; nodes will be inaccessible for the day.

Best regards,

-- Grid'5000 Team 16:45, October 13th 2021 (CEST)

Behavior change about disks identification

Since Linux 5.3[1], SCSI device probing is no longer deterministic, which means that it is no longer possible to rely on a block device name (sda, sdb, and so on) to correctly identify a disk on Grid'5000 nodes. In practice, a given disk that is called sda at one point might be renamed to sdb after a reboot.

As Debian 11 uses Linux 5.10, some changes were required to handle this issue. Grid'5000's internal tooling and the reference repository have been updated to work without relying on block device names. Those modifications are transparent for Grid'5000's users.

An arbitrary id was introduced for each disk in the reference repository, in the form of diskN (disk0 being the primary disk for a node). Those can be found on each site's hardware page and in the description of each node in the reference API. The device key (containing the block device's name) was removed from each node description.

To ease disk manipulation on nodes, symbolic links have been added in the /dev/ directory: each disk id (e.g. /dev/disk2) points to the correct block device. Similarly, links have been added for partitions: for instance, /dev/disk2p1 points to the first partition of disk2. It is recommended to use these paths from now on.
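To illustrate the aliasing mechanism (with made-up stand-in paths under a scratch directory rather than the real /dev entries), alias resolution is plain symlink resolution:

```shell
# Simulate the aliasing scheme in a scratch directory; on a real node the
# links live in /dev/, e.g. /dev/disk2 -> /dev/sdb.
demo=$(mktemp -d)
touch "$demo/sdb" "$demo/sdb1"          # stand-ins for real block devices
ln -s "$demo/sdb"  "$demo/disk2"        # disk alias
ln -s "$demo/sdb1" "$demo/disk2p1"      # partition alias
readlink -f "$demo/disk2"               # prints the resolved device path
```

The same `readlink -f` command works on the real /dev/diskN links to find which block device an alias currently points to.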

Those symbolic links are managed by udev rules deployed with g5k-postinstall using the new option '--disk-aliases'. The links can therefore also be added inside custom environments by passing this option in the kaenv description (at the postinstall level).
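For illustration only, a udev rule of this kind could create such aliases (the match key and the device serial below are invented, not the actual rules shipped by g5k-postinstall):

```
# Hypothetical udev rules: map the disk with this (made-up) serial to
# /dev/disk2, and its first partition to /dev/disk2p1.
KERNEL=="sd?",  ENV{ID_SERIAL}=="WD-EXAMPLE123", SYMLINK+="disk2"
KERNEL=="sd?1", ENV{ID_SERIAL}=="WD-EXAMPLE123", SYMLINK+="disk2p1"
```

Matching on a stable attribute such as the serial number is what makes the alias survive the non-deterministic sda/sdb naming described above.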

The documentation[2] about disk reservation is also up to date on how to identify and use the reserved disk(s).

1: https://lore.kernel.org/lkml/59eedd28-25d4-7899-7c3c-89fe7fdd4b43@acm.org/t/

2: https://www.grid50...

Major update of BIOS and other firmwares

Since this summer, we have performed a campaign of firmware updates (BIOS, network interface cards, RAID adapters…) on the nodes of most Grid'5000 clusters.

These updates may provide better hardware reliability or mitigations for security issues.

However, those changes may have an impact on your experiments (particularly in terms of performance). This is a difficult issue for which there is no good solution, as it is often hard or impossible to downgrade BIOS versions.

Remember that firmware versions are published in the reference API. We recommend that you use this information to track down changes that could affect your experiments.

For instance, in https://api.grid5000.fr/stable/sites/nancy/clusters/gros/nodes/gros-1.json?pretty=1 , see bios.version and firmware_version. You can also browse previous versions using the API ¹, or using the GitHub commit history ².
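As a sketch, the relevant fields can be extracted on the command line (this assumes network access to the API and the jq tool; bios.version comes from the text above, while the network_adapters layout is an assumption about the reference API schema):

```
# Fetch the node description and pull out the firmware versions
# (requires access to the Grid'5000 API and an installed jq).
curl -s 'https://api.grid5000.fr/stable/sites/nancy/clusters/gros/nodes/gros-1.json?pretty=1' \
  | jq '{bios_version: .bios.version,
         nic_firmware: [.network_adapters[]? | .firmware_version]}'
```

Running this before and after an experiment campaign makes it easy to spot a firmware change between two sets of measurements.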

We will continue to update such firmware in the future; updates will always be synchronized across nodes of the same cluster and documented in the reference API.

¹: https://api.grid5000.fr/doc/3.0/reference/spec.html#get-3-0-item-uri-versions

²: https://github.com/grid5000/reference-repository/commits/master

-- Grid'5000 Team 16:55, October 12th 2021 (CEST)




Read more news

Grid'5000 sites

Current funding

Since June 2008, Inria has been the main contributor to Grid'5000 funding.

INRIA


CNRS


Universities

Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon

Regional councils

Aquitaine
Auvergne-Rhône-Alpes
Bretagne
Champagne-Ardenne
Provence-Alpes-Côte d'Azur
Hauts-de-France
Lorraine