Grid5000:Home

Revision as of 14:30, 12 February 2021 by Bjonglez (talk | contribs) (Fix news link)

Grid'5000

Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.

Key features:

  • provides access to a large number of resources: 15000 cores and 800 compute nodes grouped in homogeneous clusters, featuring a range of technologies (PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path)
  • highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
  • advanced monitoring and measurement features for collecting traces of network traffic and power consumption, providing a deep understanding of experiments
  • designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
  • a vibrant community of 500+ users supported by a solid technical team


Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.

Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)


Recently published documents and presentations:

Older documents:


Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).


Current status (at 2021-06-20 03:48): 1 current event, none planned


Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2181 overall):

  • Emile Cadorel, Hélène Coullon, Jean-Marc Menaud. A workflow scheduling deadline-based heuristic for energy optimization in Cloud. GreenCom 2019 : 15th IEEE International Conference on Green Computing and Communications, Jul 2019, Atlanta, United States. pp.1-10, 10.1109/iThings/GreenCom/CPSCom/SmartData.2019.00135. hal-02165835
  • Gustavo Rostirolla. Scheduling in cloud data center powered by renewable energy only with mixed phases-based workload. Databases cs.DB. Université Paul Sabatier - Toulouse III, 2019. English. NNT : 2019TOU30160. tel-02628518
  • Fabrice Boudot, Pierrick Gaudry, Aurore Guillevic, Nadia Heninger, Emmanuel Thomé, et al. Nouveaux records de factorisation et de calcul de logarithme discret. Techniques de l'Ingenieur, Techniques de l'ingénieur, 2020, pp.17. hal-03045666
  • Farah Ait Salaht, Frédéric Desprez, Adrien Lebre. An overview of service placement problem in Fog and Edge Computing. Research Report RR-9295, Univ Lyon, EnsL, UCBL, CNRS, Inria, LIP, LYON, France. 2019, pp.1-43. hal-02313711v2
  • Charles Bouillaguet, Paul Zimmermann. Parallel Structured Gaussian Elimination for the Number Field Sieve. Mathematical Cryptology, Florida Online Journals, In press. hal-02098114v2


Latest news

Debian 11 "Bullseye" preview environments are now available for deployment

Debian 11 stable (Bullseye) will be released in a few weeks. We are pleased to offer a "preview" of kadeploy environments for Debian 11 (currently still Debian testing), that you can deploy already. See the debian11 environments in kaenv3.

New features and changes in Debian 11 are described in: https://www.debian.org/releases/bullseye/amd64/release-notes/ch-whats-new.en.html .

In particular, it includes many software updates:

  • Cuda 11.2 / Nvidia drivers 460.73.01
  • OpenJDK 17
  • Python 3.9.2
    • python3-numpy 1.19.5
    • python3-scipy 1.6.0
    • python3-pandas 1.1.5

(Note that Python 2 is not included, as it has been deprecated since January 2020, and /usr/bin/python is a symlink to /usr/bin/python3.)

  • Perl 5.32.1
  • GCC 9.3 and 10.2
  • G++ 10.2
  • Libboost 1.74.0
  • Ruby 2.7.3
  • CMake 3.18.4
  • GFortran 10.2.1
  • Liblapack 3.9.0
  • libatlas 3.10.3
  • RDMA 33.1
  • OpenMPI 4.1.0
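
Since /usr/bin/python is now a symlink, scripts that still expect Python 2 may behave differently. As a minimal sketch, the helper below (the name `resolve_path` is ours, not part of any Grid'5000 tooling) prints what a possibly symlinked path finally points to, which can be used on a deployed node to check which interpreter `python` actually runs:

```shell
#!/bin/sh
# Print the final target of a path after following every symlink in the chain.
# readlink -f is provided by GNU coreutils, which Debian ships.
resolve_path() {
    readlink -f "$1"
}

# Example use on a deployed node (output depends on the installed image):
#   resolve_path /usr/bin/python
```

On a Debian 11 environment the result should be a python3 binary, but the exact file name depends on the image.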

Known regressions and problems:

  • The std environment is not ready yet, and will not become the default Grid'5000 environment until the official Debian 11 Bullseye release.
  • The CUDA/NVIDIA drivers do not support some older GPUs, either out of the box or at all.
  • BeeGFS is not operational at the moment.

Let us know if you would like us to support tools or software that are not available in the big images.

As a reminder, you can use the following commands to deploy an environment on nodes (https://www.grid5000.fr/w/Getting_Started#Deploying_nodes_with_Kadeploy):

 $ oarsub -t deploy -I
 $ kadeploy3 -e debian11-x64-big...

Kadeploy: use of UUID partition identifiers and faster deployments

Until now, kadeploy identified disk partitions by their block device names (e.g. /dev/sda3) when deploying a system. This no longer works reliably because of disk inversion issues with recent kernels. As a result, we have changed kadeploy to use filesystem UUIDs instead.

This change affects the root partition passed to the kernel command line as well as the generated /etc/fstab file on the system.

If you want to keep identifying the partitions using block device names, you can use the "--bootloader no-uuid" and the "--fstab no-uuid" options of g5k-postinstall, in the postinstalls/script of the description of your environment. Please refer to the "Customizing the postinstalls" section of the "Advanced Kadeploy" page: Advanced_Kadeploy#Using_g5k-postinstall
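
To illustrate the difference between the two naming schemes in /etc/fstab, here is a small sketch. It is not what kadeploy or g5k-postinstall do internally; the function name `fstab_use_uuid` and the sample UUID are hypothetical. It reads fstab text on standard input and rewrites entries that name a given block device so that they use a `UUID=` specifier instead:

```shell
#!/bin/sh
# Rewrite fstab entries whose first field is the given block device name
# so that they refer to the filesystem UUID instead.
# Arguments: $1 = device name (e.g. /dev/sda3), $2 = filesystem UUID.
# Reads fstab text on stdin and writes the result to stdout.
fstab_use_uuid() {
    dev=$1
    uuid=$2
    awk -v dev="$dev" -v uuid="$uuid" \
        '$1 == dev { $1 = "UUID=" uuid } { print }'
}

# Example with a made-up UUID:
#   printf '/dev/sda3 / ext4 defaults 0 1\n' | fstab_use_uuid /dev/sda3 1234-abcd
```

On a real node, the UUID of a partition can be read with `blkid -s UUID -o value /dev/sda3`. A `UUID=` entry keeps pointing at the same filesystem even if the kernel enumerates the disks in a different order.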

As an additional change, Kadeploy now tries to use kexec more often, which should make the first deployment of a job noticeably faster.

-- Grid'5000 Team 17:30, June 9th 2021 (CEST)

New monitoring service with Kwollect is now stable

The new Grid'5000 monitoring service based on Kwollect is now stable. Kwollect now serves requests addressed to the Grid'5000 "Metrology API", i.e., at this URL:

https://api.grid5000.fr/stable/sites/SITE/metrics
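
In the URL above, SITE stands for the name of a Grid'5000 site. As a rough sketch, the helper below (the function name `metrics_url` is ours) fills in the site and optionally appends a query string. The query parameter shown in the comment is an assumption for illustration only; the supported parameters are listed on the Monitoring_Using_Kwollect page.

```shell
#!/bin/sh
# Build the Metrology API URL for a given site.
# Arguments: $1 = site name, $2 = optional query string (without the "?").
metrics_url() {
    site=$1
    query=${2:-}
    url="https://api.grid5000.fr/stable/sites/${site}/metrics"
    if [ -n "$query" ]; then
        url="${url}?${query}"
    fi
    printf '%s\n' "$url"
}

# Example request (the "nodes" parameter name is an assumption, not taken
# from this announcement):
#   curl -s "$(metrics_url nancy 'nodes=graffiti-13')"
```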

The former API based on Ganglia is no longer available, and Ganglia will be removed from Grid'5000 environments starting from the next Debian version (debian11).

Main features of Kwollect are:

  • Focus on "environmental" monitoring, i.e. metrics not available from inside the nodes: electrical consumption, temperature, metrics from network equipment or nodes' BMCs… but Kwollect also monitors nodes' metrics from Prometheus exporters
  • Support for Grid'5000 wattmeters at high frequency
  • On-demand activation of optional metrics
  • Custom metrics can be pushed by users
  • Grafana-based visualization

The usage of this new service is described at: Monitoring_Using_Kwollect

Here are the other main changes since the last announcement in December:

  • Monitoring-related entries in the Reference API have been cleaned up (nodes' "sensors" and "wattmetre" keys removed, "pdu" moved to the top level), and the OAR properties wattmeter=SHARED and wattmeter=MULTIPLE have been removed
  • Visualization now uses Grafana
  • Metrics naming and polling period have been updated
  • Grid'5000 documentation has been updated
  • Bugs and performance fixes

-- Grid'5000 Team 12:00, May 21st 2021 (CEST)

Upgrade of the graffiti-13 node in Nancy

We are happy to announce that the graffiti-13 node on the Nancy site, in the production queue, has been upgraded: its 4 GeForce RTX 2080 Ti 11GB GPUs have been replaced by 4 RTX 6000 24GB GDDR6 GPUs [1].

Reservation example: oarsub -q production -p "host='graffiti-13.nancy.grid5000.fr'" -I

[1] https://www.nvidia.com/fr-fr/design-visualization/quadro/rtx-6000/

-- Grid'5000 Team 17:00, May 6th 2021 (CEST)
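
The reservation example for graffiti-13 selects a node through an OAR property on its full host name. As a small sketch, the helper below (the name `host_property` is ours) builds that property expression from a node name and a site name, assuming the `NODE.SITE.grid5000.fr` naming pattern seen in the example:

```shell
#!/bin/sh
# Build the OAR property expression that selects a single node by host name.
# Arguments: $1 = node name (e.g. graffiti-13), $2 = site name (e.g. nancy).
host_property() {
    node=$1
    site=$2
    printf "host='%s.%s.grid5000.fr'\n" "$node" "$site"
}

# Equivalent to the reservation example for graffiti-13:
#   oarsub -q production -p "$(host_property graffiti-13 nancy)" -I
```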



Grid'5000 sites

Current funding

Since June 2008, Inria has been the main contributor to Grid'5000 funding.

INRIA


CNRS


Universities

Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon

Regional councils

Aquitaine
Auvergne-Rhône-Alpes
Bretagne
Champagne-Ardenne
Provence Alpes Côte d'Azur
Hauts de France
Lorraine