Grid5000:Home

Revision as of 14:30, 12 February 2021 by Bjonglez (talk | contribs) (Fix news link)

Grid'5000

Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.

Key features:

  • provides access to a large amount of resources: 15000 cores, 800 compute-nodes grouped in homogeneous clusters, and featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path
  • highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
  • advanced monitoring and measurement features for collecting network and power consumption traces, providing a deep understanding of experiments
  • designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
  • a vibrant community of 500+ users supported by a solid technical team


Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.

Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)


Recently published documents and presentations:

Older documents:


Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).


Current status (at 2021-08-06 06:29): 2 current events, None planned (details)


Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2182 overall):

  • Paul Zimmermann. Accuracy of Mathematical Functions in Single, Double, Extended Double and Quadruple Precision. 2021. hal-03141101
  • Lakhdar Meftah. Towards Privacy-sensitive Mobile Crowdsourcing. Software Engineering cs.SE. University of Lille, 2019. English. tel-02399716v2
  • Esteban Marquer. LatticeNN -- Deep Learning and Formal Concept Analysis. Artificial Intelligence cs.AI. 2020. hal-02954843
  • Dimitri Saingre, Thomas Ledoux, Jean-Marc Menaud. BCTMark: a framework for benchmarking blockchain technologies. AICCSA 2020 - 17th IEEE/ACS International Conference on Computer Systems and Applications, Nov 2020, Antalya, Turkey. pp.1-8, 10.1109/AICCSA50499.2020.9316536. hal-02923038
  • Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Junichi Yamagishi, et al. Design Choices for X-vector Based Speaker Anonymization. INTERSPEECH 2020, International Speech Communication Association (ISCA), Oct 2020, Shanghai, China. hal-02610447v2


Latest news

Grid'5000 metadata bundler is available in alpha version

Dear users,

When running experiments on Grid'5000, users generate metadata across multiple services.

Extracting and storing this metadata is useful for scientific data management and reproducibility. We are currently developing a tool that collects the metadata concerning an experiment from the different Grid'5000 services and bundles it in a single compressed archive, making it easier to find, study, and store information about your experiments.

The g5k-metadata-bundler is now available in alpha version on all Grid'5000 node frontends.

At this point, the bundles generated by g5k-metadata-bundler contain:

  • the OAR information of a single job,
  • copies of the specifications of the nodes involved,
  • all monitoring information collected by kwollect during the job.

As the software is still in alpha version, the contents and structure of the bundle are subject to change. A list of planned features is available on the dedicated wiki page.

We are also interested in user feedback on the support mailing list.

-- Grid'5000 Team 14:30, July 22nd 2021 (CEST)

New AMD cluster named "neowise" available in Lyon for testing

Dear users,

A new cluster, named "neowise", is available in Lyon for testing.

This machine is a donation from AMD to Genci and Inria to support the French research community (in particular for work related to COVID-19).

The cluster has 10 nodes, each including an AMD EPYC 7642 48-Core processor, 512 GB of RAM, 8 Radeon MI50 GPUs, and an HDR Infiniband network (200 Gb/s). Its full characteristics are described at: https://www.grid5000.fr/w/Lyon:Hardware#neowise

The cluster is still in its testing phase and a few issues are known:

  • A few nodes currently have problems and will be unavailable until fixed.
  • The software stack for AMD GPUs is incomplete. The HIP compiler is included in the default environment, but many libraries and software packages (such as Deep Learning frameworks) are missing. They will be added soon (mostly as "environment modules").
  • Some Grid'5000 "advanced" features are missing. An overview of what is working correctly is available at: https://intranet.grid5000.fr/jenkins-status?config=neowise

The neowise cluster is tagged as "exotic" and is currently available in the "testing" queue. To submit a job, don't forget to add the appropriate options to oarsub. For instance:

$ oarsub -q testing -t exotic -p "cluster = 'neowise'" -I
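
To avoid forgetting the queue, type and cluster options, they can be wrapped in a small shell function. The sketch below only prints the command line so it can be reviewed before being run. The function name is hypothetical, and the "-l host=1,walltime=..." resource request is an addition that is not part of the announcement; adjust or drop it as needed.

```shell
# Hypothetical helper: prints (does not run) an oarsub command line for neowise.
# The -q, -t, -p and -I options are the ones from the announcement above;
# the -l resource request is an assumption added for illustration.
neowise_oarsub_cmd() {
    # $1: walltime in h:mm:ss form (assumed default: one hour)
    walltime="${1:-1:00:00}"
    printf '%s\n' "oarsub -q testing -t exotic -p \"cluster = 'neowise'\" -l host=1,walltime=${walltime} -I"
}

# Show the command for a 30-minute interactive job.
neowise_oarsub_cmd 0:30:00
```

Once the printed command looks right, it can be pasted into a shell on a frontend.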

We would like to thank AMD for this donation and Genci for their successful collaboration in making this machine available in Grid'5000.

-- Grid'5000 Team 14:30, June 24th 2021 (CEST)

Debian 11 "Bullseye" preview environments are now available for deployments

Debian 11 stable (Bullseye) will be released in a few weeks. We are pleased to offer a "preview" of kadeploy environments for Debian 11 (currently still Debian testing), that you can deploy already. See the debian11 environments in kaenv3.

New features and changes in Debian 11 are described in: https://www.debian.org/releases/bullseye/amd64/release-notes/ch-whats-new.en.html

In particular, it includes many software updates:

  • Cuda 11.2 / Nvidia drivers 460.73.01
  • OpenJDK 17
  • Python 3.9.2
    • python3-numpy 1.19.5
    • python3-scipy 1.6.0
    • python3-pandas 1.1.5

(Note that Python 2 is not included, as it has been deprecated since January 2020, and /usr/bin/python is symlinked to /usr/bin/python3.)

  • Perl 5.32.1
  • GCC 9.3 and 10.2
  • G++ 10.2
  • Libboost 1.74.0
  • Ruby 2.7.3
  • CMake 3.18.4
  • GFortran 10.2.1
  • Liblapack 3.9.0
  • libatlas 3.10.3
  • RDMA 33.1
  • OpenMPI 4.1.0
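
Scripts that still call "python" may behave differently now that /usr/bin/python points to Python 3. The following is a minimal sketch for checking where that path resolves on a deployed node. The function name is hypothetical, and it assumes a readlink that supports -f (as GNU coreutils does).

```shell
# Hypothetical helper: succeeds if the given path ultimately resolves
# to a file whose name starts with "python3" (e.g. python3 or python3.9).
points_to_python3() {
    target="$(readlink -f "$1")" || return 1
    case "$(basename "$target")" in
        python3*) return 0 ;;
        *) return 1 ;;
    esac
}

if points_to_python3 /usr/bin/python; then
    echo "/usr/bin/python resolves to Python 3"
else
    echo "/usr/bin/python is missing or is not Python 3"
fi
```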

Known regressions and problems:

  • The std environment is not ready yet, and will not be the default Grid'5000 environment until the official Debian 11 Bullseye release.
  • Cuda/Nvidia drivers do not support some older GPUs, either out of the box or at all.
  • BeeGFS is not operational at the moment.

Let us know if you would like us to support tools or software that are not available in the big images.

As a reminder, you can use the following commands to deploy an environment on nodes (https://www.grid5000.fr/w/Getting_Started#Deploying_nodes_with_Kadeploy):

 $ oarsub -t deploy -I
 $ kadeploy3 -e debian11-x64-big...

Kadeploy: use of UUID partition identifiers and faster deployments

Until now, kadeploy identified disk partitions by their block device names (e.g. /dev/sda3) when deploying a system. This no longer works reliably because of disk inversion issues with recent kernels. As a result, we have changed kadeploy to use filesystem UUIDs instead.

This change affects the root partition passed to the kernel command line as well as the generated /etc/fstab file on the system.

If you want to keep identifying partitions by block device names, you can use the "--bootloader no-uuid" and "--fstab no-uuid" options of g5k-postinstall, in the postinstalls/script field of your environment description. Please refer to the "Customizing the postinstalls" section of the "Advanced Kadeploy" page: Advanced_Kadeploy#Using_g5k-postinstall
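
To see which identification style a deployed system ended up with, its fstab file can be inspected. The following is a minimal sketch, assuming the usual fstab layout in which the first field is either "UUID=..." or a /dev/... path and the second field is the mount point. The function name is hypothetical.

```shell
# Hypothetical helper: for each fstab entry, print the mount point followed by
# how its device is named: "uuid", "device" (block device name) or "other".
fstab_id_style() {
    # $1: path to an fstab-formatted file (defaults to /etc/fstab)
    awk '
        NF == 0 || $1 ~ /^#/ { next }                    # skip blank lines and comments
        $1 ~ /^UUID=/        { print $2, "uuid";   next }
        $1 ~ /^\/dev\//      { print $2, "device"; next }
                             { print $2, "other" }
    ' "${1:-/etc/fstab}"
}
```

Calling fstab_id_style with no argument on a deployed node inspects /etc/fstab. The root partition given to the kernel can be checked separately by looking at the root= parameter in /proc/cmdline.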

As an additional change, Kadeploy now tries to use kexec more often, which should make the first deployment of a job noticeably faster.

-- Grid'5000 Team 17:30, June 9th 2021 (CEST)


Read more news

Grid'5000 sites

Current funding

Since June 2008, Inria has been the main contributor to Grid'5000 funding.

INRIA


CNRS


Universities

Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon

Regional councils

Aquitaine
Auvergne-Rhône-Alpes
Bretagne
Champagne-Ardenne
Provence Alpes Côte d'Azur
Hauts de France
Lorraine