
Grid'5000

Grid'5000 is a large-scale and versatile testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing including Cloud, HPC and Big Data.

Key features:

  • provides access to a large amount of resources: 12,000 cores and 800 compute nodes grouped in homogeneous clusters, featuring various technologies: GPU, SSD, NVMe, 10G Ethernet, Infiniband, Omnipath
  • highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiments at the networking layer
  • advanced monitoring and measurement features for collecting traces of network traffic and power consumption, providing a deep understanding of experiments
  • designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
  • a vibrant community of 500+ users supported by a solid technical team


Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.


Recently published documents and presentations:

Older documents:


Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).


Current status (at 2020-11-30 20:42): 3 current events, 7 planned (details)


Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2057 overall):

  • Bastien Confais, Adrien Lebre, Benoît Parrein. A Fog storage software architecture for the Internet of Things. Advances in Edge Computing: Massive Parallel Processing and Applications, IOS Press, pp.61-105, 2020, Advances in Parallel Computing, 978-1-64368-062-0. 10.3233/APC200004. hal-02496105 view on HAL pdf
  • Nouredine Melab, Jan Gmys, Mohand Mezmaz, Daniel Tuyttens. Many-core Branch-and-Bound for GPU accelerators and MIC coprocessors. T. Bartz-Beielstein; B. Filipic; P. Korosec; E-G. Talbi. High-Performance Simulation-Based Optimization, 833, Springer, pp.16, 2019, Studies in Computational Intelligence, ISBN 978-3-030-18763-7. hal-01924766 view on HAL pdf
  • Baptiste Jonglez, Sinan Birbalta, Martin Heusse. Persistent DNS connections for improved performance. NETWORKING 2019 - IFIP Networking 2019, May 2019, Warsaw, Poland. pp.1. hal-02149975 view on HAL pdf
  • Jean Luca Bez, Francieli Zanon Boito, Ramon Nou, Alberto Miranda, Toni Cortes, et al. Adaptive Request Scheduling for the I/O Forwarding Layer using Reinforcement Learning. Future Generation Computer Systems, Elsevier, 2020, 112, pp.1156-1169. 10.1016/j.future.2020.05.005. hal-01994677v3 view on HAL pdf
  • Abdulqawi Saif, Alexandre Merlin, Olivier Dautricourt, Maël Houbre, Lucas Nussbaum, et al. Emulation of Storage Performance in Testbed Experiments with Distem. CNERT 2019 - IEEE INFOCOM International Workshop on Computer and Networking Experimental Research using Testbeds, Apr 2019, Paris, France. pp.6. hal-02078301 view on HAL pdf


Latest news

New IBM POWER8 cluster "drac" available for beta testing in Grenoble

We are happy to announce that a new cluster "drac" is available in the testing queue in Grenoble.

The cluster has 12 "Minsky" nodes from IBM, more precisely "Power Systems S822LC for HPC".

Each node has 2x10 POWER8 cores, 4 Tesla P100 GPUs, and 128 GB of RAM. The GPUs are directly interconnected with an NVLink fabric. Support for Infiniband 100G and kavlan is planned to be added soon.

This is the first cluster in Grid'5000 using the POWER architecture, and we are also aware of some stability issues related to GPUs, hence the "beta testing" status for now.

Feedback is highly welcome on performance and stability, as well as on our software environment for this new architecture.

Note that, since the CPU architecture is not x86, you need to explicitly ask for an "exotic" job type when reserving the nodes. For instance, to get a single GPU:

grenoble> oarsub -I -q testing -t exotic -l gpu=1
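
To reserve a full node instead (a sketch: the property-based cluster selection below is a common oarsub idiom, adjust it to your needs):

grenoble> oarsub -I -q testing -t exotic -l host=1 -p "cluster='drac'"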

More information on the hardware is available at Grenoble:Hardware

Acknowledgment: This cluster was donated by GENCI, many thanks to them. It was formerly known as the Ouessant platform of GENCI's Cellule de Veille Technologique (technology watch unit).

-- Grid'5000 Team 15:50, November 26th 2020 (CET)

OAR job container feature re-activated

We have the pleasure to announce that the OAR job container feature has been re-activated. It allows executing inner jobs within the boundaries of a container job.

It can be used, for example, by a professor to reserve resources with a container job for a teaching lab, and let students run their own jobs inside that container.
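
As a minimal sketch (the job id 1234, resource amounts, walltime and script name below are made up for illustration), the professor first submits the container job, then each student submits an inner job referencing its id:

frontend> oarsub -t container -l nodes=4,walltime=3:00:00 "sleep 3h"
frontend> oarsub -t inner=1234 -l nodes=1 "./lab_exercise.sh"

Here 1234 stands for the job id returned by the first command.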

More information on job containers is available here

Please note that if a single user needs to run multiple tasks (e.g. SPMD) within a larger resource reservation, it is preferable to use a tool such as GNU Parallel.
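
For instance (a sketch: the script and input file names are made up), within a single-node interactive reservation one could run one task per input file, 8 at a time:

frontend> oarsub -I -l nodes=1,walltime=1:00:00
node> parallel -j 8 ./process_one.sh ::: inputs/*.dat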

-- Grid'5000 Team 15:40, November 19th 2020 (CET)

Grid'5000 global VLANs and stitching now available through the Fed4FIRE federation

We announced in June that Grid'5000 nodes had become available through the Grid'5000 Aggregate Manager to users of the Fed4FIRE testbed federation.

Fed4FIRE is the largest federation worldwide of Next Generation Internet (NGI) testbeds, which provide open, accessible and reliable facilities supporting a wide variety of different research and innovation communities and initiatives in Europe.

We now have the pleasure to announce that the Aggregate Manager also allows the reservation of Grid'5000 global VLANs and inter-testbed stitching, through the federation's jFed-Experiment application as well as through tools designed to access GENI testbeds.

Inter-testbed stitching allows users to link Grid'5000 global VLANs to external VLANs connected to other testbeds in the federation. Using this technology, users can set up experiments spanning wide-area layer 2 networks.

Grid'5000 users wanting to use the Aggregate Manager should link their Fed4FIRE account to their Grid'5000 account using the process described in the wiki.

-- Grid'5000 Team 10:00, November 16th 2020 (CET)

New cluster grappe available in the production queue

We have the pleasure to announce that a new cluster named "grappe" is available at Nancy¹.

We chose that name to celebrate the famous Alsatian wines, as the network part of grappe was mistakenly delivered to Strasbourg. We interpreted that amusing mishap as a sign of the cluster's birthplace.

The cluster features 16 Dell R640 nodes, each with 2 Intel® Xeon® Gold 5218R CPUs (20 cores/CPU), 96 GB of RAM, a 480 GB SSD plus an 8.0 TB reservable HDD, and 25 Gbps Ethernet.

Energy monitoring² is available for the cluster.
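
To try the cluster (a sketch: the walltime is arbitrary), a grappe node can be reserved interactively from the production queue:

nancy> oarsub -I -q production -p "cluster='grappe'" -l nodes=1,walltime=2:00:00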

The cluster has been funded by the CPER IT2MP (Contrat Plan État Région, Innovations Technologiques, Modélisation & Médecine Personnalisée) and FEDER (Fonds européen de développement régional)³.

¹: https://www.grid5000.fr/w/Nancy:Hardware

²: https://www.grid5000.fr/w/Energy_consumption_monitoring_tutorial

³: https://www.loria.fr/fr/la-recherche/axes-scientifiques-transverses/projets-sante-numerique/

-- Grid'5000 Team 14:30, October 16th 2020 (CET)


Read more news

Grid'5000 sites

Current funding

Since June 2008, Inria has been the main contributor to Grid'5000 funding.

INRIA


CNRS


Universities

Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon

Regional councils

Aquitaine
Bretagne
Champagne-Ardenne
Provence Alpes Côte d'Azur
Hauts de France
Lorraine