Difference between revisions of "Grid5000:Home"

(Add news about CLCAR)
(Fix news link)
 
(136 intermediate revisions by 11 users not shown)
Line 1: Line 1:
 
__NOTOC__ __NOEDITSECTION__
 
__NOTOC__ __NOEDITSECTION__
 
{|width="95%"
 
{|width="95%"
|-
+
|- valign="top"
| width="20%" |
+
|bgcolor="#f5fff5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[[Image:Logo_Aladdin.png|250px]]
+
[[Image:renater5-g5k.jpg|thumbnail|250px|right|Grid'5000]]
|
+
'''Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.'''
= ALADDIN-G5K : ensuring the development of '''Grid'5000''' =
+
 
= for the 2008-2012 period =
+
Key features:
 +
* provides '''access to a large amount of resources''': 15000 cores, 800 compute-nodes grouped in homogeneous clusters, and featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path
 +
* '''highly reconfigurable and controllable''': researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
 +
* '''advanced monitoring and measurement features for traces collection of networking and power consumption''', providing a deep understanding of experiments
 +
* '''designed to support Open Science and reproducible research''', with full traceability of infrastructure and software changes on the testbed
 +
* '''a vibrant community''' of 500+ users supported by a solid technical team
 +
 
 +
<br>
 +
Read more about our [[Team|teams]], our [[Publications|publications]], and the [[Grid5000:UsagePolicy|usage policy]] of the testbed. Then [[Grid5000:Get_an_account|get an account]], and learn how to use the testbed with our [[Getting_Started|Getting Started tutorial]] and the rest of our [[:Category:Portal:User|Users portal]].
  
''An infrastructure distributed in 9 sites around France, for research in large-scale parallel and distributed systems''
+
<b>Grid'5000 is merging with [https://fit-equipex.fr FIT] to build the [http://www.silecs.net/ SILECS Infrastructure for Large-scale Experimental Computer Science]. Read [http://www.silecs.net/wp-content/uploads/2018/04/Desprez-SILECS.pdf an Introduction to SILECS] (April 2018)</b>
  
Engineers ensuring the development and day to day support of the infrastructure are mostly provided by INRIA, under the ''ADT ALADDIN-G5K''  initiative.
+
<br>
 +
Recently published documents and presentations:
 +
* [[Media:Grid5000.pdf|Presentation of Grid'5000]] (April 2019)
 +
* [https://www.grid5000.fr/mediawiki/images/Grid5000_science-advisory-board_report_2018.pdf Report from the Grid'5000 Science Advisory Board (2018)]
  
 +
Older documents:
 +
* [https://www.grid5000.fr/slides/2014-09-24-Cluster2014-KeynoteFD-v2.pdf Slides from Frederic Desprez's keynote at IEEE CLUSTER 2014]
 +
* [https://www.grid5000.fr/ScientificCommittee/SAB%20report%20final%20short.pdf Report from the Grid'5000 Science Advisory Board (2014)]
 +
 +
<br>
 +
Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL [[Hemera|HEMERA]] (2010-2014).
 
|}
 
|}
  
{|width="95%"
+
<br>
|- valign="top"
+
{{#status:0|0|0|http://bugzilla.grid5000.fr/status/upcoming.json}}
|bgcolor="#f5fff5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
+
<br>
==Latest news==
 
[[Image:CLCAR2009.jpg|left|120px]]
 
=== Grid'5000 Tutorial at [http://eventos.saber.ula.ve/eventos/conferenceDisplay.py?confId=56 CLCAR 2009] in Venezuela  ===
 
[https://www.grid5000.fr/mediawiki/index.php/Special:G5KReports?menu=renderreport&username=Ygeorgiou Yiannis Georgiou] will present a Grid5000 tutorial at the 2nd edition of Latin American Conference on High Performance Computing (CLCAR) held in Merida, Venezuela (21-25 September 2009).
 
----
 
=== Latest updated experiment descriptions ===
 
{{#experiments:3}}
 
----
 
=== Latest updated publications ===
 
{{#publications:3}}
 
----
 
=== Best Student Demo Award given to Grid'5000 users at [http://www.sigmetrics.org/sigmetrics/ Sigmetrics/Performance 2009] at Seattle ===
 
This award was given to [https://www.grid5000.fr/mediawiki/index.php/Special:G5KReports?menu=renderreport&username=Ploiseau Patrick Loiseau], [https://www.grid5000.fr/mediawiki/index.php/Special:G5KReports?menu=renderreport&username=rguillie Romaric Guillier], [https://www.grid5000.fr/mediawiki/index.php/Special:G5KReports?menu=renderreport&username=Ogoga Oana Goga], [https://www.grid5000.fr/mediawiki/index.php/Special:G5KReports?menu=renderreport&username=Mimbert Matthieu Imbert], [https://www.grid5000.fr/mediawiki/index.php/Special:G5KReports?menu=renderreport&username=Pgoncalves Paulo Goncalves], [https://www.grid5000.fr/mediawiki/index.php/Special:G5KReports?menu=renderreport&username=Pprimet Pascale Vicat-Blanc Primet] (INRIA, ENS-Lyon, University of Lyon, France) for their demo [http://www.sigmetrics.org/sigmetrics/demo/Demo_Loiseau_et_al.pdf Automated Traffic Measurements and Analysis in Grid'5000] demo at the [http://www.sigmetrics.org/sigmetrics/program_sigmetrics-demo.shtml demo session]
 
----
 
=== Grid'5000 under extension to Porto Alegre, Brazil ===
 
July 21st, 2009, INRIA and UFRGS signed a Memorandum of Understanding marking
 
the cooperation of the two institutes towards the extension of the
 
Grid'5000 platform to Brazil.
 
  
According to the terms of this memo, UFRGS will contribute to Grid'5000
+
== Random pick of publications ==
by operating a local site in Porto Alegre and INRIA will fully
+
{{#publications:}}
integrate this local site into Grid'5000 in order to gain an
 
international scale. Researchers from both partners will therefore gain
 
access to one of the major scientific instruments for the study of
 
large-scale parallel and distributed problems in computer science. With
 
this cooperation, Grid'5000 gives its users the possibility to study the
 
effects of inter-continental networks links on these problems.
 
  
For Grid'5000 users, access to the new site during the integration phase
+
==Latest news==
is possible using portoalegre.grenoble as name of the new site
+
<rss max=4 item-max-length="2000">https://www.grid5000.fr/w?title=News&action=feed&feed=atom</rss>
 
----
 
----
=== Grid'5000 showcased during the Open Days of Bell Labs France ===
+
[[News|Read more news]]
On June 2nd, 2009, the MEtroflux/NXE tools were shown running on Grid'5000 during the Open Days of Bell Labs France, demonstrating the potential of Grid'5000 as a research instrument.
 
----
 
[[Grid5000:News|read more news]]
 
|}
 
  
<br>
+
=== Grid'5000 sites===
==Grid'5000 at a glance==
+
{|width="100%" cellspacing="3"  
[[Image:site_map.png|thumbnail|128px|right|Grid'5000 sites]]
 
* '''Grid'5000''' is a scientific instrument for the study of large-scale parallel and distributed systems. It aims at providing a '''highly reconfigurable, controllable and monitorable experimental platform''' to its users. The initial aim (circa 2003) was to reach 5000 processors in the platform. The target was later reframed as 5000 cores, which was reached during winter 2008-2009.

* The infrastructure of Grid'5000 is geographically distributed across the sites hosting the instrument, initially 9 in France. Porto Alegre, Brazil is now officially becoming the 10th site.
 
===Sites:===
 
{|width="75%" cellspacing="3"  
 
 
|- valign="top"
 
|- valign="top"
 
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
* [[Bordeaux:Home|Bordeaux]]
 
 
* [[Grenoble:Home|Grenoble]]
 
* [[Grenoble:Home|Grenoble]]
 
* [[Lille:Home|Lille]]
 
* [[Lille:Home|Lille]]
 +
* [[Luxembourg:Home|Luxembourg]]
 
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
* [[Lyon:Home|Lyon]]
 
* [[Lyon:Home|Lyon]]
 
* [[Nancy:Home|Nancy]]
 
* [[Nancy:Home|Nancy]]
* [[Orsay:Home|Orsay]]
+
* [[Nantes:Home|Nantes]]
* [[PortoAlegre:Home|Porto Alegre]]
 
 
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
|width="33%" bgcolor="#f5f5f5" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
* [[Rennes:Home|Rennes]]
 
* [[Rennes:Home|Rennes]]
Line 77: Line 60:
 
|-
 
|-
 
|}
 
|}
 
 
 
[[Image:Software layers.png|thumbnail|271px|left|Grid'5000 allows experiments in all these software layers]]
 
* '''Grid'5000''' is a research effort developing a '''large-scale nationwide infrastructure for large-scale parallel and distributed computing research'''.
 
 
* '''17 [[Grid5000:Laboratories|laboratories]]''' are involved in France with the objective of providing the community a testbed allowing experiments in all the software layers between the network protocols up to the applications.
 
 
 
 
The current plans are to extend from the 9 initial sites each with 100 to a thousand PCs, connected by the [http://www.renater.fr RENATER] Education and Research Network to a bigger platform including a few sites outside France not necessarily connected through a dedicated network connection. Sites in Brazil and Luxembourg should join shortly.
 
 
All sites in France are connected to [http://www.renater.fr RENATER] with a 10Gb/s link.
 
 
 
 
This highly collaborative research effort is funded by INRIA, CNRS, the Universities of all sites and some regional councils.
 
 
 
 
== Initial Rationale==
 
'''The foundations of Grid'5000''' have emerged from a thorough analysis and numerous discussions about methodologies used for scientific research in the Grid domain. A report presents the [http://www-sop.inria.fr/aci/grid/public/Library/rapport-grid5000-V3.pdf rationale for Grid'5000].
 
 
In addition to theory, simulators and emulators, there is a strong need for '''large scale testbeds''' where real life experimental conditions hold. '''The size of Grid'5000''', in terms of number of sites and number of processors per site, was established according to the scale of the experiments and the number of researchers involved in the project.
 
  
 
== Current funding ==
 
== Current funding ==
As from June 2008, INRIA is the main contributor to [[Grid5000:Funding|Grid'5000 funding]].  
+
As from June 2008, Inria is the main contributor to [[Grid5000:Funding|Grid'5000 funding]].  
 
{|width="100%" cellspacing="3"
 
{|width="100%" cellspacing="3"
 
|-
 
|-
 
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
===INRIA===
 
===INRIA===
[[Image:Logo-inria.png]]
+
[[Image:Logo_INRIA.gif|300px]]
 
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
===CNRS===
 
===CNRS===
[[Image:Logo-cnrs.png]]
+
[[Image:CNRS-filaire-Quadri.png|125px]]
 
|-
 
|-
 
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
===Universities===
 
===Universities===
University Joseph Fourier, Grenoble<br/>
+
Université Grenoble Alpes, Grenoble INP<br/>
University of Rennes 1, Rennes<br/>
+
Université Rennes 1, Rennes<br/>
 
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse<br/>
 
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse<br/>
University Bordeaux 1, Bordeaux<br/>
+
Université Bordeaux 1, Bordeaux<br/>
University Lille 1, Lille<br/>
+
Université Lille 1, Lille<br/>
Ecole Normale Supérieure, Lyon<br/>
+
École Normale Supérieure, Lyon<br/>
 
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
| width="50%" bgcolor="#f5f5f5" valign="top" align="center" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
 
===Regional councils===
 
===Regional councils===
 
Aquitaine<br/>
 
Aquitaine<br/>
 +
Auvergne-Rhône-Alpes<br/>
 
Bretagne<br/>
 
Bretagne<br/>
 +
Champagne-Ardenne<br/>
 
Provence Alpes Côte d'Azur<br/>
 
Provence Alpes Côte d'Azur<br/>
Nord Pas de Calais<br/>
+
Hauts de France<br/>
 
Lorraine<br/>
 
Lorraine<br/>
 
|}
 
|}

Latest revision as of 14:30, 12 February 2021

Grid'5000

Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.

Key features:

  • provides access to a large amount of resources: 15000 cores, 800 compute-nodes grouped in homogeneous clusters, and featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path
  • highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
  • advanced monitoring and measurement features for collecting traces of network activity and power consumption, providing a deep understanding of experiments
  • designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
  • a vibrant community of 500+ users supported by a solid technical team


Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.

Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)


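Recently published documents and presentations:

  • Presentation of Grid'5000 (April 2019)
  • Report from the Grid'5000 Science Advisory Board (2018)

Older documents:

  • Slides from Frederic Desprez's keynote at IEEE CLUSTER 2014
  • Report from the Grid'5000 Science Advisory Board (2014)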


Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).


Current status (at 2021-06-17 19:38): 1 current event, none planned


Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2181 overall):

  • Giuseppe Di Lena, Andrea Tomassilli, Damien Saucez, Frédéric Giroire, Thierry Turletti, et al. Mininet on steroids: exploiting the cloud for Mininet performance. CloudNet 2019 - IEEE International Conference on Cloud Networking, Nov 2019, Coimbra, Portugal. hal-02362997
  • Raphaël Duroselle, Denis Jouvet, Irina Illina. Unsupervised regularization of the embedding extractor for robust language identification. Odyssey 2020 - The Speaker and Language Recognition Workshop, Nov 2020, Tokyo, Japan. hal-02544156
  • Francis Laniel, Damien Carver, Julien Sopena, Franck Wajsburt, Jonathan Lejeune, et al. MemOpLight: Leveraging application feedback to improve container memory consolidation. NCA 2020 - 19th IEEE International Symposium on Network Computing and Applications, Nov 2020, Cambridge / Virtual, United States. pp.1-10, 10.1109/NCA51143.2020.9306717. hal-03065629
  • Tom Cornebize, Arnaud Legrand. DGEMM performance is data-dependent. Research Report RR-9310, Université Grenoble Alpes; Inria; CNRS. 2019. hal-02401760
  • Oleksii Avilov, Sébastien Rimbert, Anton Popov, Laurent Bougrain. Deep Learning Techniques to Improve Intraoperative Awareness Detection from Electroencephalographic Signals. IEEE Engineering in Medicine and Biology Society 2020, Jul 2020, Montreal, Canada. hal-02920320


Latest news

Debian 11 "Bullseye" preview environments are now available for deployments

Debian 11 stable (Bullseye) will be released in a few weeks. We are pleased to offer a "preview" of kadeploy environments for Debian 11 (currently still Debian testing), that you can deploy already. See the debian11 environments in kaenv3.

New features and changes in Debian 11 are described in: https://www.debian.org/releases/bullseye/amd64/release-notes/ch-whats-new.en.html

In particular, it includes many software updates:

  • Cuda 11.2 / Nvidia drivers 460.73.01
  • OpenJDK 17
  • Python 3.9.2
    • python3-numpy 1.19.5
    • python3-scipy 1.6.0
    • python3-pandas 1.1.5

(Note that Python 2 is not included, as it has been deprecated since January 2020, and /usr/bin/python is symlinked to /usr/bin/python3.)

  • Perl 5.32.1
  • GCC 9.3 and 10.2
  • G++ 10.2
  • Libboost 1.74.0
  • Ruby 2.7.3
  • CMake 3.18.4
  • GFortran 10.2.1
  • Liblapack 3.9.0
  • libatlas 3.10.3
  • RDMA 33.1
  • OpenMPI 4.1.0

Known regressions and problems:

  • The std environment is not ready yet, and will not be the default Grid'5000 environment until the official Debian 11 Bullseye release.
  • The Cuda/Nvidia drivers do not support some older GPUs, either out of the box or at all.
  • BeeGFS is not operational at the moment.

Let us know if you would like us to support tools or software that are not available in the big images.

As a reminder, you can use the following commands to deploy an environment on nodes (https://www.grid5000.fr/w/Getting_Started#Deploying_nodes_with_Kadeploy):

 $ oarsub -t deploy -I
 $ kadeploy3 -e debian11-x64-big...

Kadeploy: use of UUID partition identifiers and faster deployments

Up to now, kadeploy was identifying disk partitions with their block device names (e.g. /dev/sda3) when deploying a system. This no longer works reliably because of disk inversion issues with recent kernels. As a result, we have changed kadeploy to use filesystem UUIDs instead.

This change affects the root partition passed to the kernel command line as well as the generated /etc/fstab file on the system.

If you want to keep identifying the partitions using block device names, you can use the "--bootloader no-uuid" and the "--fstab no-uuid" options of g5k-postinstall, in the postinstalls/script of the description of your environment. Please refer to the "Customizing the postinstalls" section of the "Advanced Kadeploy" page: Advanced_Kadeploy#Using_g5k-postinstall
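
To make the difference concrete, here is a minimal Python sketch that formats a root-partition line for /etc/fstab in both styles, together with the matching kernel argument. It is an illustration only, not the code used by kadeploy or g5k-postinstall: the helper names, the ext4 filesystem type, the mount options and the UUID value are all assumptions chosen for the example.

```python
# Illustrative sketch only: helper names, filesystem type, options and the
# UUID below are assumptions, not values taken from kadeploy or g5k-postinstall.

def fstab_entry(spec, mountpoint, fstype, options="defaults", dump=0, passno=1):
    """Format one /etc/fstab line from its six standard fields."""
    return f"{spec}\t{mountpoint}\t{fstype}\t{options}\t{dump}\t{passno}"


def by_uuid(uuid):
    """Build the UUID=... specifier used in fstab and in root= arguments."""
    return f"UUID={uuid}"


# Hypothetical filesystem UUID, in the format a tool such as blkid reports.
ROOT_UUID = "0a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d"

# Block device name: breaks if the kernel enumerates the disks in another order.
old_style = fstab_entry("/dev/sda3", "/", "ext4")

# Filesystem UUID: stays valid whatever name the disk receives at boot.
new_style = fstab_entry(by_uuid(ROOT_UUID), "/", "ext4")

# Matching kernel command-line argument (typically resolved by the initramfs).
root_arg = f"root={by_uuid(ROOT_UUID)}"

print(old_style)
print(new_style)
print(root_arg)
```

Only the first field of the entry changes between the two styles, which is why an environment that depends on device names needs the "no-uuid" options mentioned above.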

As an additional change, Kadeploy now tries to use kexec more often, which should make the first deployment of a job noticeably faster.

-- Grid'5000 Team 17:30, June 9th 2021 (CEST)

New monitoring service with Kwollect is now stable

The new Grid'5000 monitoring service based on Kwollect is now stable. Kwollect will now serve requests addressed to the Grid'5000 "Metrology API", i.e., from this URL:

https://api.grid5000.fr/stable/sites/SITE/metrics

The former API based on Ganglia is no longer available, and Ganglia will be removed from Grid'5000 environments starting from the next Debian version (debian11).

Main features of Kwollect are:

  • Focus on "environmental" monitoring, i.e. metrics not available from inside the nodes: electrical consumption, temperature, metrics from network equipment or nodes' BMC… but Kwollect also monitors node metrics from Prometheus exporters
  • Support for Grid'5000 wattmeters at high frequency
  • On-demand activation of optional metrics
  • Custom metrics can be pushed by users
  • Grafana-based visualization

Usage of this new service is described at: Monitoring_Using_Kwollect

Here are the other main changes since the last announcement in December:

  • Monitoring-related entries in the Reference API have been cleaned up (nodes' "sensors" and "wattmetre" keys removed, "pdu" moved to the top level), and the OAR properties wattmeter=SHARED and wattmeter=MULTIPLE have been removed
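
As an illustration of how a request to the metrics URL above might be assembled, the following Python sketch builds the URL for one site and appends query parameters. It performs no network access and does not cover authentication. The parameter names (nodes, metrics, start_time), the site name, the node name and the metric name are assumptions made for this example; the Monitoring_Using_Kwollect page is the authoritative reference for what the API actually accepts.

```python
from urllib.parse import urlencode

# Root of the stable API, taken from the URL given in the announcement.
API_ROOT = "https://api.grid5000.fr/stable"


def metrics_url(site, **params):
    """Return the metrics URL for one site, with optional query parameters.

    The parameter names are not checked here: they are passed through as-is,
    so they must match what the API documentation specifies.
    """
    base = f"{API_ROOT}/sites/{site}/metrics"
    # Drop unset parameters and sort the rest so the URL is deterministic.
    query = {key: value for key, value in sorted(params.items()) if value is not None}
    return f"{base}?{urlencode(query)}" if query else base


# Hypothetical request: site, node, metric and parameter names are assumptions.
url = metrics_url(
    "lyon",
    nodes="taurus-1",
    metrics="wattmetre_power_watt",
    start_time="2021-06-08T15:00",
)
print(url)
```

Building the query with urlencode rather than by string concatenation takes care of escaping characters such as the colon in timestamps.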
  • Visualization now uses Grafana
  • Metrics naming and polling period have been updated
  • Grid'5000 documentation has been updated
  • Bug and performance fixes

-- Grid'5000 Team 12:00, May 21st 2021 (CEST)

Upgrade of the graffiti-13 node in Nancy

We are happy to announce that the graffiti-13 node, in the production queue at the Nancy site, has been upgraded: its 4 GeForce RTX 2080 Ti 11GB GPUs have been replaced by 4 RTX 6000 24GB GDDR6 GPUs [1].

Reservation example: oarsub -q production -p "host='graffiti-13.nancy.grid5000.fr'" -I

[1] https://www.nvidia.com/fr-fr/design-visualization/quadro/rtx-6000/
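
For users who submit reservations from scripts, the nested quoting in the example above is easy to get wrong. The sketch below builds the same oarsub invocation as an argument list in Python, using only the options shown in the reservation example (-q, -p, -I). The function name is hypothetical, and the command is only constructed here, not executed.

```python
import shlex


def oarsub_interactive(host, queue="production"):
    """Return the argument list for an interactive reservation of one host.

    Only the options that appear in the announcement are used.
    """
    # The property expression keeps its inner single quotes around the host name.
    return ["oarsub", "-q", queue, "-p", f"host='{host}'", "-I"]


argv = oarsub_interactive("graffiti-13.nancy.grid5000.fr")

# Shell-quoted form, suitable for logging or pasting into a terminal.
print(shlex.join(argv))
```

Passing such a list directly to subprocess.run, without a shell, avoids a second layer of quoting around the property expression.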

-- Grid'5000 Team 17:00, May 6th 2021 (CEST)


Read more news

Grid'5000 sites

Current funding

As from June 2008, Inria is the main contributor to Grid'5000 funding.

INRIA


CNRS


Universities

Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon

Regional councils

Aquitaine
Auvergne-Rhône-Alpes
Bretagne
Champagne-Ardenne
Provence Alpes Côte d'Azur
Hauts de France
Lorraine