Grid5000:Home: Difference between revisions

(move old news)
No edit summary
 
(63 intermediate revisions by 9 users not shown)
Line 3: Line 3:
|- valign="top"
|- valign="top"
|bgcolor="#f5fff5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
|bgcolor="#f5fff5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[[Image:Logo.png|left]]
[[Image:g5k-backbone.png|thumbnail|260px|right|Grid'5000]]
'''Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing including Cloud, HPC, Big Data and AI.'''
 
Key features:
* provides '''access to a large amount of resources''': 15000 cores, 800 compute-nodes grouped in homogeneous clusters, and featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path
* '''highly reconfigurable and controllable''': researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
* '''advanced monitoring and measurement features for collecting networking and power consumption traces''', providing a deep understanding of experiments
* '''designed to support Open Science and reproducible research''', with full traceability of infrastructure and software changes on the testbed
* '''a vibrant community''' of 500+ users supported by a solid technical team
 
<br>
<br>
''Grid'5000 is a scientific instrument supporting experiment-driven research in all areas of computer science, including high performance computing, distributed computing, networking and big data.'' <br>
Read more about our [[Team|teams]], our [[Publications|publications]], and the [[Grid5000:UsagePolicy|usage policy]] of the testbed. Then [[Grid5000:Get_an_account|get an account]], and learn how to use the testbed with our [[Getting_Started|Getting Started tutorial]] and the rest of our [[:Category:Portal:User|Users portal]].
[[media:seminaire_intro.pdf|Download a general introduction]], or a [https://www.grid5000.fr/screencast/index.html screencast of recent webUI developments]
 
|}
<b>Grid'5000 is merging with [https://fit-equipex.fr FIT] to build the [http://www.silecs.net/ SILECS Infrastructure for Large-scale Experimental Computer Science]. Read [http://www.silecs.net/wp-content/uploads/2018/04/Desprez-SILECS.pdf an Introduction to SILECS] (April 2018)</b>
{{#status:0|0|0|http://bugzilla.grid5000.fr/status/upcoming.json}}
== Latest updates from Grid'5000 users ==
* '''Publications'''
{{#publications:3||**}}
==Latest news==
=== [http://www.lifl.fr/~melab/HTML/Journee-G5K-Lille.htm Grid'5000 tutorial days in Lille] ===
We are happy to let you know that tutorials around Grid'5000 will be organized in Lille on November 20th, 2014, with a few seats available for people outside Lille. All information is on the [http://www.lifl.fr/~melab/HTML/Journee-G5K-Lille.htm dedicated web page].
=== [[Grid5000:School2014|Grid'5000 spring school]] now finished ===
The Grid'5000 spring school took place between June 16th, 2014 and June 19th, 2014 in Lyon. Three awards were given for presentation or challenge entries (the challenge entries ended as a tie):
{|width="75%" cellspacing="3"
|- valign="top"
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[[Image:Presentation_Gistau_award.png|252px|Best presentation award to Miguel Liroz Gistau]]


Best presentation award to Miguel Liroz Gistau, Reza Akbarinia and Patrick Valduriez
<br>
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
Recently published documents and presentations:
[[Image:Challenge_Buchert_award.png|252px|Best challenge entry to Tomasz Buchert, Emmanuel Jeanvoine and Lucas Nussbaum]]
* [[Media:Grid5000.pdf|Presentation of Grid'5000]] (April 2019)
* [https://www.grid5000.fr/mediawiki/images/Grid5000_science-advisory-board_report_2018.pdf Report from the Grid'5000 Science Advisory Board (2018)]


Best challenge entry to Tomasz Buchert, Emmanuel Jeanvoine and Lucas Nussbaum
Older documents:
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
* [https://www.grid5000.fr/slides/2014-09-24-Cluster2014-KeynoteFD-v2.pdf Slides from Frederic Desprez's keynote at IEEE CLUSTER 2014]
[[Image:Challenge_Pastor_award.png|252px|Best challenge entry to Jonathan Pastor and Laurent Pouilloux]]
* [https://www.grid5000.fr/ScientificCommittee/SAB%20report%20final%20short.pdf Report from the Grid'5000 Science Advisory Board (2014)]


Best challenge entry to Jonathan Pastor and Laurent Pouilloux
<br>
|-
Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL [[Hemera|HEMERA]] (2010-2014).
|}
|}
----
[[Grid5000:News|read more news]]


<br>
<br>
==Grid'5000 at a glance==
{{#status:0|0|0|http://bugzilla.grid5000.fr/status/upcoming.json}}
<!-- [[Image:site_map.png|thumbnail|128px|right|Grid'5000 sites]] -->
<br>
[[Image:renater5-g5k.jpg|thumbnail|200px|right|Grid'5000]]


== Random pick of publications ==
{{#publications:}}


* '''Grid'5000''' is a scientific instrument for the study of large scale parallel and distributed systems. It aims at providing a '''highly reconfigurable, controllable and monitorable experimental platform''' to its users. The initial aim (circa 2003) was to reach 5000 processors in the platform. The target was later reframed as 5000 cores, which was reached during winter 2008-2009.
==Latest news==
* The infrastructure of Grid'5000 is geographically distributed on different sites hosting the instrument, initially 9 sites in France (10 since 2011). Porto Alegre, Brazil is now officially becoming the first site abroad.
<rss max=4 item-max-length="2000">https://www.grid5000.fr/rss/G5KNews.php</rss>
----
[[News|Read more news]]


===Sites:===
=== Grid'5000 sites===
{|width="75%" cellspacing="3"  
{|width="100%" cellspacing="3"  
|- valign="top"
|- valign="top"
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
* [[Bordeaux:Home|Bordeaux]]
* [[Grenoble:Home|Grenoble]]
* [[Grenoble:Home|Grenoble]]
* [[Lille:Home|Lille]]
* [[Lille:Home|Lille]]
Line 57: Line 54:
* [[Nancy:Home|Nancy]]
* [[Nancy:Home|Nancy]]
* [[Nantes:Home|Nantes]]
* [[Nantes:Home|Nantes]]
* [[Reims:Home|Reims]]
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
* [[Rennes:Home|Rennes]]
* [[Rennes:Home|Rennes]]
Line 64: Line 60:
|-
|-
|}
|}
[[Image:Software layers.png|thumbnail|271px|left|Grid'5000 allows experiments in all these software layers]]
* '''Grid'5000''' is a research effort developing a '''large scale nation wide infrastructure for large scale parallel and distributed computing research'''.
* '''19 [[Grid5000:Laboratories|laboratories]]''' are involved in France with the objective of providing the community a testbed allowing experiments in all the software layers between the network protocols up to the applications.
The current plans are to extend from the 9 initial sites each with 100 to a thousand PCs, connected by the [http://www.renater.fr RENATER] Education and Research Network to a bigger platform including a few sites outside France not necessarily connected through a dedicated network connection. Sites in Brazil and Luxembourg should join shortly, and Reims has now joined.
All sites in France are connected to [http://www.renater.fr RENATER] with a 10Gb/s link, except Reims, which is for the time being linked through a 1Gb/s link.
This highly collaborative research effort is funded by INRIA, CNRS, the Universities of all sites and some regional councils.
== ALADDIN-G5K : ensuring the development of '''Grid'5000''' ==
For the 2008-2012 period, Engineers ensuring the development and day to day support of the infrastructure are mostly provided by INRIA, under the ''ADT ALADDIN-G5K''  initiative.
==[[Hemera|HEMERA: Demonstrating ambitious up-scaling techniques on '''Grid'5000''']] ==
[[Hemera|Héméra]] is an INRIA Large Wingspan project, started in 2010, that aims at demonstrating ambitious up-scaling techniques for large scale distributed computing by carrying out several dimensioning experiments on the Grid’5000 infrastructure, at animating the scientific community around Grid’5000 and at enlarging the Grid’5000 community by helping newcomers to make use of Grid’5000.
== Initial Rationale==
'''The foundations of Grid'5000''' have emerged from a thorough analysis and numerous discussions about methodologies used for scientific research in the Grid domain. A report presents the [http://www-sop.inria.fr/aci/grid/public/Library/rapport-grid5000-V3.pdf rationale for Grid'5000].
In addition to theory, simulators and emulators, there is a strong need for '''large scale testbeds''' where real life experimental conditions hold. '''The size of Grid'5000''', in terms of number of sites and number of processors per site, was established according to the scale of the experiments and the number of researchers involved in the project.


== Current funding ==
== Current funding ==
As from June 2008, INRIA is the main contributor to [[Grid5000:Funding|Grid'5000 funding]].  
As from June 2008, Inria is the main contributor to [[Grid5000:Funding|Grid'5000 funding]].  
{|width="100%" cellspacing="3"
{|width="100%" cellspacing="3"
|-
|-
Line 103: Line 70:
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
===CNRS===
===CNRS===
[[Image:CNRS-filaire-MonoBleu.gif|100px]]
[[Image:CNRS-filaire-Quadri.png|125px]]
|-
|-
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
===Universities===
===Universities===
University Joseph Fourier, Grenoble<br/>
IMT Atlantique<br/>
University of Rennes 1, Rennes<br/>
Université Grenoble Alpes, Grenoble INP<br/>
Université Rennes 1, Rennes<br/>
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse<br/>
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse<br/>
University Bordeaux 1, Bordeaux<br/>
Université Bordeaux 1, Bordeaux<br/>
University Lille 1, Lille<br/>
Université Lille 1, Lille<br/>
Ecole Normale Supérieure, Lyon<br/>
École Normale Supérieure, Lyon<br/>
Université de Reims Champagne-Ardenne, Reims<br/>
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
===Regional councils===
===Regional councils===
Aquitaine<br/>
Aquitaine<br/>
Auvergne-Rhône-Alpes<br/>
Bretagne<br/>
Bretagne<br/>
Champagne-Ardenne<br/>
Champagne-Ardenne<br/>
Provence Alpes Côte d'Azur<br/>
Provence Alpes Côte d'Azur<br/>
Nord Pas de Calais<br/>
Hauts de France<br/>
Lorraine<br/>
Lorraine<br/>
|}
|}

Latest revision as of 10:29, 26 October 2023

Grid'5000

Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing including Cloud, HPC, Big Data and AI.

Key features:

  • provides access to a large amount of resources: 15000 cores, 800 compute-nodes grouped in homogeneous clusters, and featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path
  • highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
  • advanced monitoring and measurement features for collecting networking and power consumption traces, providing a deep understanding of experiments
  • designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
  • a vibrant community of 500+ users supported by a solid technical team


Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.

Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)


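Recently published documents and presentations:

  • Presentation of Grid'5000 (April 2019)
  • Report from the Grid'5000 Science Advisory Board (2018)

Older documents:

  • Slides from Frederic Desprez's keynote at IEEE CLUSTER 2014
  • Report from the Grid'5000 Science Advisory Board (2014)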


Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).


Current status (at 2025-05-04 15:09): 1 current events, 7 planned (details)


Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2758 overall):

  • Tom Hubrecht, Claude-Pierre Jeannerod, Paul Zimmermann. Towards a correctly-rounded and fast power function in binary64 arithmetic. 2023 IEEE 30th Symposium on Computer Arithmetic (ARITH 2023), Sep 2023, Portland, Oregon (USA), United States. hal-04326201 view on HAL pdf
  • Gautier Evennou, Vivien Chappelier, Ewa Kijak, Teddy Furon. SWIFT: Semantic Watermarking for Image Forgery Thwarting. WIFS 2024 - 16th IEEE International Workshop on Information Forensics and Security, IEEE, Dec 2024, Roma, Italy. pp.1-6. hal-04728070 view on HAL pdf
  • Barbara Gendron, Gaël Guibon. Context-Aware Siamese Networks for Efficient Emotion Recognition in Conversation. 2024. hal-04532408 view on HAL pdf
  • Sandipana Dowerah. Deep Learning-based Speaker Verification In Real Conditions. Computer Science cs. Université de Lorraine, 2023. English. NNT : 2023LORR0046. tel-04257423 view on HAL pdf
  • Imran Ahamad Sheikh, Emmanuel Vincent, Irina Illina. Training RNN Language Models on Uncertain ASR Hypotheses in Limited Data Scenarios. Computer Speech and Language, 2024, 83, pp.101555. 10.1016/j.csl.2023.101555. hal-03327306v2 view on HAL pdf


Latest news

Change of default queue based on platform

Until now, Abaca (production) users had to specify `-q production` when reserving Abaca resources with OAR.

This is no longer necessary: your default queue is now selected automatically based on the platform your default group is associated with, as shown at https://api.grid5000.fr/explorer/selector/ and in the message displayed when connecting to a frontend.

For SLICES-FR users, there is no change since the correct queue was already selected by default.

Additionally, the "production" queue has been renamed to "abaca", although "production" will continue to work for the foreseeable future.

Please note one case where this change may affect your workflow:

When an Abaca user reserves a resource from SLICES-FR (a non-production resource), they must explicitly specify that they want to use the SLICES-FR queue, which is called "default", by adding `-q default` to the OAR command.

-- Abaca Grid'5000 Team 10:10, 31 March 2025 (CEST)
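
The queue rules from the "Change of default queue based on platform" announcement above can be sketched as a small helper. This is an illustrative sketch, not an official Grid'5000 tool: the function name `queue_args` and the platform labels `"abaca"` and `"slices-fr"` are hypothetical, and the case of a SLICES-FR user reserving Abaca resources is an assumption that the announcement does not cover. Only the queue names "abaca" and "default" and the `-q` option come from the announcement.

```python
# Hypothetical platform labels mapped to the queue names given in the
# announcement ("abaca" for Abaca, "default" for SLICES-FR).
QUEUE_FOR_PLATFORM = {"abaca": "abaca", "slices-fr": "default"}


def queue_args(user_platform, resource_platform):
    """Return the extra oarsub arguments needed to select the queue."""
    for platform in (user_platform, resource_platform):
        if platform not in QUEUE_FOR_PLATFORM:
            raise ValueError(f"unknown platform: {platform!r}")
    if user_platform == resource_platform:
        # The automatically selected default queue already matches.
        return []
    # Assumption: crossing platforms always needs an explicit -q option.
    # The announcement only states this for Abaca users on SLICES-FR.
    return ["-q", QUEUE_FOR_PLATFORM[resource_platform]]


# An Abaca user reserving a SLICES-FR resource must add "-q default".
print(queue_args("abaca", "slices-fr"))
```

Returning a list of arguments rather than a full command line keeps the helper independent of the other options passed to the OAR command.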

Cluster "musa" with Nvidia H100 GPUs is available in production queue

We are pleased to announce that a new cluster named "musa" is available in the production queue¹ of Abaca.

This cluster has been funded by Inria DSI as a shared computing resource.

It is accessible to all Abaca users. Users affiliated with Inria have access with the same level of priority, regardless of the research center to which they are attached.

This cluster is composed of six HPE Proliant DL385 Gen11 nodes² with 2 AMD EPYC 9254 24-core processors, 512 GiB of RAM, 2 x Nvidia H100 NVL (94 GiB) with NVLink, one 6 TB NVMe SSD and a 25 Gbps Ethernet connection.

Please note that in order to share it efficiently, walltime is limited:

  • 6 hours for the first two nodes
  • 24 hours for the next two
  • 48 hours for the last two

The cluster "musa" is located at Sophia, hosted in the datacenter of Inria Centre at Université Côte d’Azur.

    ¹: https://api.grid5000.fr/explorer/hardware/sophia/#musa

    ²: the nodes are named musa-1, musa-2,.., musa-6

    -- Grid'5000 Team 13:30, 19 March 2025 (CEST)
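
The walltime limits in the "musa" announcement above can be expressed as a simple lookup. This is a minimal sketch under one stated assumption: that "the first two", "the next two" and "the last two" nodes follow the numeric order of the node names musa-1 to musa-6. The announcement does not say so explicitly, so check the hardware page before relying on it. The function name `max_walltime_hours` is hypothetical.

```python
import re

# Assumption: walltime tiers follow the numeric order of the node names.
MUSA_WALLTIME_HOURS = {1: 6, 2: 6, 3: 24, 4: 24, 5: 48, 6: 48}


def max_walltime_hours(node):
    """Return the assumed walltime limit, in hours, for a musa node name."""
    match = re.fullmatch(r"musa-([1-6])", node)
    if match is None:
        raise ValueError(f"not a musa node name: {node!r}")
    return MUSA_WALLTIME_HOURS[int(match.group(1))]


for name in ("musa-1", "musa-4", "musa-6"):
    print(name, max_walltime_hours(name))
```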

    Cluster "Hydra" is now in the testing queue in Lyon

    We are pleased to announce that the hydra[1] cluster of Lyon is now available in the testing queue.

    Hydra is a cluster composed of 4 NVIDIA Grace-Hopper servers[2].

    Each node features:

  • 1 Nvidia Grace ARM64 CPU with 72 cores (Neoverse-V2)
  • 1 Nvidia Hopper GPU
  • 512GB LPDDR5 memory
  • 96GB HBM memory
  • 1 x 1 TB NVMe SSD + 1 x 1.92 TB SCSI disk

    Due to its bleeding-edge hardware, the usual Grid'5000 environments are not supported by default on this cluster (Hydra requires system environments featuring a Linux kernel >= 6.6). The default system on the hydra nodes is based on Debian 11, but **does not provide functional GPU support**. However, users may deploy the ubuntugh2404-arm64-big environment, which is similar to the official Nvidia image provided for this machine and provides GPU support.

    To submit a job on this cluster, the following command may be used:

    oarsub -q testing -t exotic -p hydra

    This cluster is funded by INRIA and by Laboratoire de l'Informatique du Parallélisme with ENS Lyon support.

    [1] Hydra is the largest of the modern constellations according to Wikipedia: https://en.wikipedia.org/wiki/Hydra_(constellation)

    [2] https://developer.nvidia.com/blog/nvidia-grace-hopper-superchip-architecture-in-depth/

    -- Grid'5000 Team 16:10, 11 March 2025 (CEST)

    New Access Rules for Production Queue

    We are introducing new access rules for clusters in the production queue. Clusters in this queue are now accessed based on priority levels that reflect their funding sources. Jobs submitted at higher priority levels are scheduled before those at lower levels and may also have longer maximum durations.

    For more detailed information, please visit https://www.grid5000.fr/w/Production#Using_production_resources . Note that job submission commands that worked previously still work after this change, although you may get a higher or lower priority depending on your resource selection.

    You can check your priority level for each cluster using https://api.grid5000.fr/explorer . Currently, this tool only displays basic information; however, we plan to add more features soon.

    Please be aware that these changes apply only to the production clusters, which are currently available only in Nancy and Rennes. There are no changes to the "default" queue.

    If you encounter any issues or have feedback regarding this new feature, or if you believe your priority level on specific resources is not adequate, please contact us at <support-staff@lists.grid5000.fr>.

    -- Grid'5000 Team 9:25, June 4th 2024 (CEST)


    Read more news

    Grid'5000 sites

    Current funding

    As from June 2008, Inria is the main contributor to Grid'5000 funding.

    INRIA


    CNRS


    Universities

    IMT Atlantique
    Université Grenoble Alpes, Grenoble INP
    Université Rennes 1, Rennes
    Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
    Université Bordeaux 1, Bordeaux
    Université Lille 1, Lille
    École Normale Supérieure, Lyon

    Regional councils

    Aquitaine
    Auvergne-Rhône-Alpes
    Bretagne
    Champagne-Ardenne
    Provence Alpes Côte d'Azur
    Hauts de France
    Lorraine