News

This page provides some news about the Grid'5000 testbed. You can also subscribe to the RSS feed and follow Grid'5000 on Twitter.

New production cluster in Nancy: grvingt

We are happy to announce that a new cluster with 64 compute nodes and 2048 cores is ready in the *production queue* in Nancy!

This is the first Grid'5000 cluster based on the latest generation of Intel CPUs (Skylake).

Each node has:

  • two Intel Xeon Gold 6130 (Skylake, 2.10GHz, 2 CPUs/node, 16 cores/CPU)
  • 192GB of RAM
  • 1TB HDD
  • one 10Gbps Ethernet interface
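
As a quick sanity check, the announced total of 2048 cores follows from the per-node figures in the list above. The variable names are only illustrative:

```shell
# Core count derived from the announcement: nodes x CPUs per node x cores per CPU.
nodes=64
cpus_per_node=2
cores_per_cpu=16
total_cores=$((nodes * cpus_per_node * cores_per_cpu))
echo "grvingt total cores: ${total_cores}"
```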

All nodes are also connected with an Intel Omni-Path 100Gbps network (non-blocking).

This new cluster, named "grvingt"[0], is funded by CPER CyberEntreprises (FEDER, Région Grand Est, DRRT, INRIA, CNRS).

As a reminder, the specific rules for the "production" queue are listed at https://www.grid5000.fr/w/Grid5000:UsagePolicy#Rules_for_the_production_queue

[0] Important note regarding pronunciation: despite the Corsican origin, you should pronounce the trailing T as this is how "vingt" is pronounced in Lorraine.

-- Grid'5000 Team 17:30, 25 July 2018 (CET)

Second interface on Lille's clusters

A second 10Gbps interface is now connected on the chetemi and chifflet clusters. Those interfaces are connected to the same switch as the first. KaVLAN is also available.

-- Grid'5000 Team 16:00, 23 July 2018 (CET)

Updated Hardware and Network description

A detailed description of all Grid'5000 resources is now available on the Grid'5000 wiki. It is generated on a regular basis from the Reference API, so that it stays up to date. (As you probably know, the Grid'5000 Reference API provides a full description of Grid'5000 as JSON documents.) Check out those pages:

-- Grid'5000 Team 09:00, 2 July 2018 (CET)

Debian 9 environments now use predictable network interface names

All our Debian 9 environments have now been modified to use predictable network interface names (eno1, ens1, enp2s0, etc. instead of the traditional eth0, eth1, etc.), which is now the default on Debian and most other distributions. You can read more about this standard on https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames .

Because Grid'5000 nodes use different hardware in each cluster, the default network interface name varies, depending on whether the interface is integrated on the motherboard or is an external NIC plugged into a PCI Express slot.

The mapping between old and new names is available on each site's Hardware page, such as https://www.grid5000.fr/mediawiki/index.php/Nancy:Hardware .

Some useful commands:

  • ip link show up : show network interfaces that are up
  • ip -o link show up : show network interfaces that are up (one-line format)
  • ip addr show up : show network addresses for interfaces that are up
  • ip -4 addr show up : show network addresses for interfaces that are up (IPv4 only)
  • ip -o -4 addr show up : show network addresses for interfaces that are up (IPv4 only, one-line format)
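
Since the interface name now differs between clusters, a script may need to discover it rather than hard-code eth0. The sketch below reduces the one-line output of the last command above to "interface address" pairs. The helper name list_ipv4 and the sample line (interface name and address) are made up for illustration:

```shell
# Print "<interface> <address/prefix>" for each line of `ip -o -4 addr show up` output.
# In the one-line format, field 2 is the interface name and field 4 the address.
list_ipv4() {
    awk '{ print $2, $4 }'
}

# On a node you would run: ip -o -4 addr show up | list_ipv4
# Illustration with a hypothetical sample line:
sample='2: eno1    inet 172.16.64.1/20 brd 172.16.79.255 scope global eno1\       valid_lft forever preferred_lft forever'
printf '%s\n' "$sample" | list_ipv4
```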

If you want to stick with the old behaviour, you can either:

  • use older versions of the environments, available in /grid5000
  • change the environment description (kadeploy .dsc file) and add "--net traditional-names" to the call to g5k-postinstall.

Typically, that would mean:

  1. Get the current environment description:
kaenv3 -p debian9-x64-min > myenv.dsc
  2. Edit myenv.dsc, find the "script:" line, and add "--net traditional-names" to make it, for example:
script: g5k-postinstall --net debian --net traditional-names
  3. Deploy using: kadeploy3 -a myenv.dsc -m mynode
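
The edit to the "script:" line can also be done non-interactively. This is a minimal sketch, not an official tool. It assumes GNU sed (for -i), and the helper name add_traditional_names and the two-line stand-in file are hypothetical; a real .dsc file contains more fields.

```shell
# Append "--net traditional-names" to the g5k-postinstall call in a .dsc file,
# unless the option is already present (so running it twice is harmless).
add_traditional_names() {
    dsc="$1"
    if ! grep -q -- '--net traditional-names' "$dsc"; then
        sed -i '/script:.*g5k-postinstall/ s/$/ --net traditional-names/' "$dsc"
    fi
}

# Demo on a stand-in file; on a frontend you would start from:
#   kaenv3 -p debian9-x64-min > myenv.dsc
demo=$(mktemp)
printf '%s\n' 'name: debian9-x64-min' '  script: g5k-postinstall --net debian' > "$demo"
add_traditional_names "$demo"
grep 'script:' "$demo"
```

After running it on your own myenv.dsc, deploy as usual with kadeploy3 -a myenv.dsc -m mynode.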

-- Grid'5000 Team 10:00, 17 May 2018 (CET)

Spring cleanup: removal of wheezy environments, and of HTTP proxies

We recently removed some old cruft.

First, the Debian 7 "wheezy" environments were removed from the Kadeploy database. They remain available on each frontend under /grid5000 for the time being. As a reminder, older environments are archived and still available: see /grid5000/README.unmaintained-envs for details.

Second, we finally removed the HTTP proxies that were used in the past to access the outside world from Grid'5000. This removal might cause problems if you use old environments that still rely on HTTP proxies. However, you can simply drop their use.

An upcoming change is the switch to predictable names for network interfaces in our Debian 9 environments. More information on this will follow when it is deployed.

-- Grid'5000 Team 11:00, 26 April 2018 (CET)

Looking back at the first Grid'5000-FIT school

The first Grid’5000-FIT school was held in Sophia Antipolis, France from April 3rd to April 6th, 2018.

Over three full days, more than 90 researchers (among them 30 PhD students and 8 master's students) jointly studied the two experimental research infrastructures, Grid’5000 and FIT.

In addition to high-profile invited speakers and more focused presentations for testbed users, the program included 30 hours of parallel hands-on sessions. Introductory and advanced lab sessions on Grid’5000, FIT IoT-Lab, FIT CorteXlab and FIT R2lab were organized by the testbed developers, each attracting dozens of attendees. See http://www.silecs.net/1st-grid5000-fit-school/program/ for more details on the presentations and on the hands-on sessions.

On the last day, a hackathon was organized for several ad-hoc teams of participants. The hackathon aimed to use heterogeneous testbed resources as a substrate for building an IoT application, from the collection of IoT-originating data to a cloud-based dashboard.

This event is the first public milestone towards the SILECS project that aims at bringing together both infrastructures. See http://www.silecs.net/ for more details on SILECS.

-- Grid'5000 Team 11:00, 12 April 2018 (CET)

1st Grid'5000-FIT school: call for participation

As previously announced, the first Grid'5000-FIT school will be held in Sophia Antipolis, France from April 3rd to April 6th, 2018. This event is a first public milestone in the SILECS project, which aims at bringing together both infrastructures.

The program is now available, and includes a great set of invited talks, of talks from Grid'5000 and FIT users, and tutorials on experimenting using those infrastructures.

Registration is free but mandatory. Please register!

-- Grid'5000 Team 14:00, 9 March 2018 (CET)

Debian 9 environments updated, with Meltdown patches

As you probably know, a series of vulnerabilities has recently been discovered in x86 processors[1,2]. This update of our environments is the first to include a mitigation patch for them. However, this patch causes a significant performance penalty for syscall-intensive workloads (e.g. workloads performing a lot of I/O). We decided to follow the Linux kernel defaults and activate this patch by default, as it seems that we will all have to live with it for the foreseeable future.

[1] https://en.wikipedia.org/wiki/Meltdown_(security_vulnerability)
[2] https://meltdownattack.com/

However, if you want to avoid this performance penalty for your experiments, you have two options:

  1. Use version 2017122808 of our environments (which does not include the updated kernel).
  2. Disable Kernel Page Table Isolation by adding pti=off to the kernel command line. To do that, after deployment, edit /etc/default/grub, add pti=off to GRUB_CMDLINE_LINUX_DEFAULT, run update-grub, and reboot.
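
The second option (disabling PTI) could be scripted roughly as follows. This is a sketch under a few assumptions: GNU sed is available, and GRUB_CMDLINE_LINUX_DEFAULT is set on a single double-quoted line, as in a stock Debian /etc/default/grub. The helper name add_pti_off is hypothetical, and the demo works on a temporary file rather than the real configuration.

```shell
# Add pti=off inside the quoted GRUB_CMDLINE_LINUX_DEFAULT value, if not already there.
add_pti_off() {
    grub_file="$1"
    if ! grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*pti=off' "$grub_file"; then
        sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 pti=off"/' "$grub_file"
    fi
}

# Demo on a stand-in file with a typical default value.
demo=$(mktemp)
echo 'GRUB_CMDLINE_LINUX_DEFAULT="quiet"' > "$demo"
add_pti_off "$demo"
cat "$demo"

# On a deployed node (as root), the equivalent would be:
#   add_pti_off /etc/default/grub && update-grub && reboot
```

After the reboot, cat /proc/cmdline should show pti=off.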

Many evaluations of the performance penalty are available online[3,4].

[3] https://www.phoronix.com/scan.php?page=article&item=5distros-post-spectre
[4] https://arxiv.org/abs/1801.04329

Here is an example, measured on the grele cluster, using 'time dd if=/dev/zero of=/dev/zero bs=1 count=20000000', which is very close to a worst-case workload. The results are for a single run, but they are fairly stable over time.

With environment version 2017122808:

# uname -srv
Linux 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3+deb9u1 (2017-12-23)
# dpkg -l |grep linux-image
ii  linux-image-4.9.0-4-amd64     4.9.65-3+deb9u1              amd64        Linux 4.9 for 64-bit PCs
ii  linux-image-amd64             4.9+80+deb9u2                amd64        Linux for 64-bit PCs (meta-package)
# time dd if=/dev/zero of=/dev/zero bs=1 count=20000000
20000000 bytes (20 MB, 19 MiB) copied, 4.20149 s, 4.8 MB/s
real        0m4.203s
user        0m1.168s
sys         0m3.032s

With environment 2018020812, PTI enabled (default):

# uname -srv
Linux 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04)
# dpkg -l |grep linux-image
ii  linux-image-4.9.0-5-amd64     4.9.65-3+deb9u2              amd64        Linux 4.9 for 64-bit PCs
ii  linux-image-amd64             4.9+80+deb9u3                amd64        Linux for 64-bit PCs (meta-package)
# time dd if=/dev/zero of=/dev/zero bs=1 count=20000000
20000000 bytes (20 MB, 19 MiB) copied, 10.8027 s, 1.9 MB/s
real        0m10.804s
user        0m3.948s
sys         0m6.852s

With environment 2018020812, PTI disabled in grub configuration:

# uname -srv
Linux 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04)
root@grele-1:~# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-4.9.0-5-amd64 root=UUID=e803852f-b7a8-454b-a7e8-2b9d9a4da06c ro debian-installer=en_US quiet pti=off
# dpkg -l |grep linux-image
ii  linux-image-4.9.0-5-amd64     4.9.65-3+deb9u2              amd64        Linux 4.9 for 64-bit PCs
ii  linux-image-amd64             4.9+80+deb9u3                amd64        Linux for 64-bit PCs (meta-package)
# time dd if=/dev/zero of=/dev/zero bs=1 count=20000000
20000000 bytes (20 MB, 19 MiB) copied, 4.10788 s, 4.9 MB/s
real        0m4.109s
user        0m1.160s
sys         0m2.948s
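
To put the three runs above in perspective, the elapsed ("real") times can be compared against the unpatched baseline. The helper name slowdown is illustrative; the numbers are taken from the outputs above.

```shell
# Ratio of an elapsed time to the baseline elapsed time, rounded to two decimals.
# Arguments: baseline seconds, measured seconds.
slowdown() {
    awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2f\n", t / base }'
}

slowdown 4.203 10.804   # 2018020812 with PTI enabled, relative to 2017122808
slowdown 4.203 4.109    # 2018020812 with pti=off, relative to 2017122808
```

On this single run, the PTI-enabled kernel is roughly 2.6 times slower than the baseline for this workload, while the run with pti=off is on par with it.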

1st Grid'5000-FIT school: save the date, call for presentations and practical sessions

It is our pleasure to announce the first Grid’5000-FIT school, which will be held in Sophia Antipolis, France from April 3rd to April 6th, 2018. This event is a first public milestone in the SILECS project, which aims at bringing together the Grid'5000 and FIT infrastructures.

The program is being polished and will be available soon on http://www.silecs.net/1st-grid5000-fit-school/. The week will give priority to practical sessions where you will use the FIT and Grid’5000 platforms and learn how to conduct rigorous scientific experiments. Moreover, to foster collaboration, we will arrange a special session where you can present your own work (see “call for presentations” below) or propose your own practical session (see “call for practical sessions” below).

We would like to stress in this announcement that we hope users will consider presenting their great work during the school, including already published results. The focus of user presentations during the school is not so much on new results as on showcasing great uses of the Grid’5000 and/or FIT platforms, singly or together, with a focus on how the results were derived from using the platforms.

Registration will be free but mandatory due to the limited number of seats.

Stay tuned for extra details on http://www.silecs.net/1st-grid5000-fit-school/.

-- Grid'5000 Team 10:00, 1 February 2018 (CET)

Disk reservation feature available on chifflet and parasilo

The disk reservation feature is now available on chifflet in Lille and parasilo in Rennes, in addition to grimoire in Nancy, where it was already in beta testing. Disk reservation enables users to leave large datasets on nodes between experiments. See the Disk reservation wiki page.

-- Grid'5000 Team 11:30, 8 January 2018 (CET)

Standard environments upgraded to Debian 9 "stretch"

All Grid'5000 nodes now use Debian 9 "stretch" as standard environment (this is the environment available by default when not deploying).

Known remaining problems at this time are:

  • Lyon, orion cluster: GPUs are no longer supported by CUDA 9, which is the version now included in Debian 9 environments.
  • Infiniband support: while we think we have resolved everything, we are not 100% sure at this point (it needs further testing).
  • Virtual machines on the standard environment don't work anymore (but work if you deploy, of course).

Your own scripts or applications might require changes or recompilation to work on Debian 9. If you want to stick with Debian 8 for the time being, you can still manually deploy the jessie-x64-std environment.

-- Grid'5000 Team 14:30, 15 December 2017 (CET)

Debian 9 "stretch" environments now available for deployments

Kadeploy images are now available for Debian 9.

We used this opportunity to change the naming scheme, so they are named debian9-x64-min, debian9-x64-base, debian9-x64-nfs, debian9-x64-big.

Known regressions and problems are:

  • The Xen environment is not available yet
  • Network interfaces still use the traditional naming scheme (eth0, eth1, ...), not the new predictable scheme
  • Intel Xeon Phi (for the graphite cluster in Nancy) is not supported yet
  • Infiniband network support is currently not tested

We are planning to switch the "std" environment (the one available when not deploying) in the coming weeks. You can beta-test the environment that we plan to switch to (debian9-x64-std).

Please report problems you encounter with the above environments to support-staff@lists.grid5000.fr.

-- Grid'5000 Team 16:30, 29 November 2017 (CET)

Better Docker support on Grid'5000

Docker support on Grid'5000 has been enhanced.

We now document a set of tools and recommendations to ease Docker usage on Grid'5000.

More details are available on the Docker page.

-- Grid'5000 Team 17:30, 28 November 2017 (CET)

5th network interface on grisou cluster, using a white box switch

A 5th network interface is now connected on grisou¹, providing a 1 Gbps link, as opposed to the first four interfaces that provide 10 Gbps links.

Those interfaces are connected to a Dell S3048 Open Networking switch. Alternative operating systems are available for this switch. While we did not make specific developments to facilitate their use, please talk with us if you are interested in getting control over the switch in order to experiment on alternative software stacks.

¹ https://www.grid5000.fr/mediawiki/index.php/Nancy:Network#Ethernet_1G_on_grisou_nodes

-- Grid'5000 Team 17:00, 8 November 2017 (CET)

Persistent Virtual Machine for Grid'5000 users

We are happy to announce the availability of persistent virtual machines in Grid'5000: each user can now get one virtual machine that will run permanently (outside of OAR reservations). We envision that this will be helpful to host the stable/persistent part of long-running experiments.

More details are available here: https://www.grid5000.fr/mediawiki/index.php/Persistent_Virtual_Machine

-- Grid'5000 Team 10:00, 7 November 2017 (CET)

Grid'5000 Usage Policy updated

The Grid'5000 Usage Policy has been updated.

In a nutshell, all usage that was previously allowed is still allowed after this update.

However, there are several changes to help increase usage of resources that are immediately available.

  • Jobs lasting one hour or less, and submitted less than 10 minutes before they start, are excluded from daily quotas.
    • This means that one can always reserve resources for up to one hour when they are immediately available.
  • Similarly, job extensions requested less than 10 minutes before the end of the job, and for a duration of one hour or less, are also excluded from daily quotas. Those extensions can be renewed several times (always during the last 10 minutes of the job).
    • This means that, when resources are still available, one can always extend jobs for up to one hour.
  • Crossing the 19:00 boundary is allowed for jobs submitted at or after 17:00 the same day. The portion of those jobs from 17:00 to 19:00 is excluded from daily quotas.
    • This means that if at 17:00 or later on a given day, resources are not reserved for the following night, then it is possible to reserve them and start the night job earlier.
  • Crossing the 9:00 boundary is allowed for jobs submitted on the same day. But the portion of those jobs after 9:00 is still included in the daily quota.
    • This means that when resources are free in the morning, people are free to start working earlier.
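
As an illustration of the first rule only (short jobs submitted just before they start), the exemption can be expressed as a simple predicate. This is a reading of the text above, not the actual policy enforcement code; the function name and its arguments are hypothetical.

```shell
# Succeeds (exit status 0) when a job is excluded from daily quotas under the short-job rule:
# duration of at most one hour, submitted less than 10 minutes before it starts.
# Arguments: job duration in seconds, delay between submission and start in seconds.
is_quota_exempt() {
    duration="$1"
    lead_time="$2"
    [ "$duration" -le 3600 ] && [ "$lead_time" -lt 600 ]
}

is_quota_exempt 3600 300 && echo "1h job submitted 5 min ahead: exempt"
is_quota_exempt 7200 300 || echo "2h job submitted 5 min ahead: counted"
is_quota_exempt 3600 900 || echo "1h job submitted 15 min ahead: counted"
```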

Note that with those changes, it might become harder to get resources if you do not plan at least one hour ahead.

Another change is that the usage policy now includes rules for disk reservations.

Comments about those new rules are welcome on the users mailing list.

-- Grid'5000 Team 14:00, 21 September 2017 (CET)

New production cluster in Nancy (grele)

We are glad to announce that a new cluster with 14 compute nodes is ready in the production queue in Nancy! There are 2 Nvidia GTX 1080Ti per node, providing a total of 100352 GPU cores. The nodes are also equipped with:

  • 2x Intel Xeon E5-2650 v4 (2.2GHz, 12C/24T)
  • 128GB of RAM
  • 2x 300GB SAS HDD 15K rpm
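
The GPU core total quoted above can be reproduced from the node count. The figure of 3584 CUDA cores per GTX 1080 Ti comes from the card's published specification, not from this announcement:

```shell
# Total GPU cores: nodes x GPUs per node x CUDA cores per GPU.
nodes=14
gpus_per_node=2
cuda_cores_per_gpu=3584   # GTX 1080 Ti specification (assumption external to this page)
total_gpu_cores=$((nodes * gpus_per_node * cuda_cores_per_gpu))
echo "grele GPU cores: ${total_gpu_cores}"
```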

The Intel Omni-Path 100Gbps network is not available yet, but should be soon. Note that the grimani cluster will also take advantage of this high-speed, low-latency network.

This new High-End cluster, named "Grele", is jointly supported by CPER IT2MP (Innovations Technologiques, Modélisation et Médecine Personnalisée) and CPER LCHN (Langues, Connaissances & Humanités Numériques).

PS: At the moment grele-13 and grele-14 don't have GPUs, but they should arrive really soon!

PS2: As a reminder, the rules for the production queue are at https://www.grid5000.fr/mediawiki/index.php/Grid5000:UsagePolicy#Rules_for_the_production_queue

-- Grid'5000 Team 14:00, 10 July 2017 (CET)

Engineer positions available on Grid'5000!

See the page of open positions.

-- Grid'5000 Team 14:00, 5 July 2017 (CET)

Grid'5000 software list updated

The Software page of the Grid'5000 wiki received a major overhaul. It lists many pieces of software of interest to Grid'5000 users, developed by the Grid'5000 community or the Grid'5000 team.

Please don't hesitate to suggest additions to that list!

-- Grid'5000 Team 13:00, 27 June 2017 (CET)

Grid'5000 services portfolio now available

The Grid'5000 services portfolio (in French: offre de service) is now available. This document (in French only, sorry), prepared by the architects committee together with the technical team, lists all services provided by Grid'5000 to its users. Those services are ranked according to their importance, measured as the resolution delay the team is aiming for when something breaks.

This first version is a good opportunity to provide feedback about missing services, over-inflated priority, under-evaluated priority, etc. Please tell us what is important for your experiments!

-- Grid'5000 Team 14:32, 22 June 2017 (CET)

Nvidia GTX 1080Ti GPUs now available in Lille

We are glad to announce that 16 GPUs have been added to the chifflet cluster. There are 2 Nvidia GTX 1080Ti per node, providing a total of 57344 GPU cores.

This takes place in the context of the "Advanced data science and technologies" CPER project in which Inria invests, with the support of the regional council of the Hauts de France, the FEDER and the State.

Enjoy,

-- Grid'5000 Team 16:00, 14 June 2017 (CET)

Connect to Grid'5000 with your browser

We are pleased to announce a new feature which we hope will ease the first steps on Grid'5000. You can now get a shell through a web interface at this address: https://intranet.grid5000.fr/shell/SITE/ . For example, for the Lille site, use: https://intranet.grid5000.fr/shell/lille/ . To connect, you will have to type in your credentials twice (first for the HTTP proxy, then for the SSH connection).
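
The SITE placeholder is simply replaced by the lowercase site name. A small sketch that builds the URL for a few sites mentioned on this page follows; the helper name shell_url is hypothetical, and whether the web shell is enabled on every site listed is an assumption based on the generic pattern.

```shell
# Build the web shell URL for a site, following the SITE pattern given above.
shell_url() {
    printf 'https://intranet.grid5000.fr/shell/%s/\n' "$1"
}

for site in lille nancy rennes lyon; do
    shell_url "$site"
done
```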

-- Grid'5000 Team 11:00, 9 June 2017 (CET)

Debian Stretch minimal image available in beta

Grid'5000 now provides a minimal image of Debian Stretch, featuring Linux 4.9, in beta testing. If you use it and encounter problems, please send a report to support-staff@lists.grid5000.fr

We are also currently working on the other variants of Stretch images (-base, -std, -xen, -big), as well as improvements such as support for predictable network interface names. Stay tuned for more information in a few weeks.

-- Grid'5000 Team 17:00, 23 May 2017 (CET)

New "OSIRIM" storage space available to Grid'5000 users

A new remote storage space is available from frontends and nodes (in the default environment or in deployed -nfs and -big environments) under the "/srv/osirim/<your_username>" directory.

It provides 200GB of persistent storage space to every user (using NFS and automount). If needed, more space can be requested by sending an e-mail to support-staff@lists.grid5000.fr.

This storage is kindly provided by the OSIRIM project, from Toulouse IRIT lab.

-- Grid'5000 Team 10:00, 15 May 2017 (CET)

The new disk reservation feature is available for beta testing

In order to leave large datasets on nodes between experiments, a disk reservation service has been developed, and is now available for beta testing on the grimoire cluster in Nancy.

See this page for more details.

The service will be extended to other clusters after the beta testing phase.

-- Grid'5000 Team 15:00, 10 May 2017 (CET)

3rd Network Interface on paranoia cluster

Users can now use the 3rd network interface on the paranoia cluster in Rennes. eth0 and eth1 provide 10G links, and eth2 provides a Gigabit Ethernet connection¹.

As a reminder, the grisou and grimoire clusters in Nancy have four 10G interfaces.

¹ For more details, see Rennes:Network

-- Grid'5000 Team 15:00, 09 May 2017 (CET)

Changing job walltime

The latest OAR version installed in Grid'5000 now provides the functionality to request a change to the walltime of a running job (the duration of the resource reservation). Such a change is granted depending on the resources occupied by other jobs, and the resulting job characteristics still have to comply with the Grid'5000 Usage Policy¹. You may use the oarwalltime command or the Grid'5000 API to request such a change for a job or to query its status. See the Grid'5000 tutorials and API documentation for more information.

¹ Rules from the Usage Policy still apply to the modified job. We are however considering changing the Usage Policy to take the new feature into account, and would welcome ideas toward doing that. Please feel free to make suggestions on devel@lists.grid5000.fr.

-- Grid'5000 Team 10:00, 20 March 2017 (CET)

New cluster "nova" available at Lyon

We are pleased to announce that a new cluster called "nova" is available in Lyon. It features 23 Dell R430 nodes with 2 Intel Xeon E5-2620 v4, 8C/16T, 32GB DDR4, 2x300GB SAS and 10Gbps Ethernet. Energy monitoring is available for this cluster, provided by the same devices used for the other clusters in Lyon.

This cluster has been funded by Inria Rhône-Alpes.

-- Grid'5000 Team 10:00, 20 March 2017 (CET)

Grid'5000 events in your calendaring application

Grid'5000 events, such as maintenance operations or service disruptions, are published on the events page (https://www.grid5000.fr/status/). You can see a summary of this page on Grid'5000's homepage, and you can also get this information through an RSS feed. Since last week, events are also published as an ical feed, to which you can subscribe in most calendaring applications, such as Zimbra or Thunderbird's Lightning extension.

-- Grid'5000 Team 13:00, 10 March 2017 (CET)

New clusters available in Lille

We are pleased to announce that new clusters are available at the Lille site:

  • chetemi (in reference to "Bienvenue chez les ch'tis"): 15 R630 nodes with 2 Xeon E5-2630 v4, 20C/40T, 256GB DDR4, 2x300GB SAS
  • chifflet: 8 R730 nodes with 2 Xeon E5-2680 v4, 28C/56T, 768GB DDR4, 2x400GB SAS SSD, 2x4TB SATA

These nodes are connected with 10Gb Ethernet to a Cisco Nexus9000 switch (reference: 93180YC-EX).

The renewal of the Lille site takes place in the context of the CPER "Advanced data science and technologies" project in which Inria invests, with the support of the regional council of the Hauts de France, the FEDER and the State.

-- Grid'5000 Team 15:00, 02 March 2017 (CET)

Standard environment updated (v2017011315)

This week, we updated the standard environment to a new version (2017011315). It includes a ganglia module to monitor NVIDIA GPUs, so you can now watch GPU usage on the ganglia webpage (for example: Ganglia graphique-1).

We also updated other variants:

  • Ceph installation has been moved from [base] to [big]
  • libguestfs-tools has been removed from [big]
  • CUDA version has been updated to 8.0.44 [big]
  • Nvidia Driver has been updated to 375.26 [big]
  • linux-tools package has been installed from [big]

-- Grid'5000 Team 15:00, 16 January 2017 (CET)


Grid'5000 News now available as RSS

Because mails sent to the users mailing list are lost to users who join Grid'5000 later, the technical team will now share Grid'5000 news using a blog-like format. An RSS link is available from the News page, at this URL.

-- Grid'5000 Team 15:00, 20 December 2016 (CET)


Deployments API changes

The Deployments API available in "sid" or "4.0" has changed slightly: it now uses the native Kadeploy API. See the Deployments API page for more information.

-- Grid'5000 Team 15:11, 25 October 2016 (CET)


Grid'5000 winter school now finished

The Grid'5000 winter school took place from February 2nd to February 5th, 2016 in Grenoble. Two awards were given for presentation entries:

  • Best presentation award to Houssem-Eddine Chihoub and Christine Collet for their work on A scalability comparaison study of smart meter data management approaches (slides)
  • Most promising experiment to Anna Giannakou for her work Towards Self Adaptable Security Monitoring in IaaS Clouds (slides)

-- Grid'5000 Team 15:11, 25 February 2016 (CET)


10 years later, the Grid'5000 school returns to Grenoble

10 years after the first edition of the Grid'5000 school, we invite the Grid'5000 community to take part in the Grid'5000 Winter School 2016, February 2-5, 2016.

-- Grid'5000 Team 15:11, 25 December 2015 (CET)


Grid'5000 co-organizing the SUCCES 2015 event

The SUCCES 2015 days will take place on November 5-6, 2015, in Paris. Dedicated to users of grid, cloud or regional HPC centers, they are organised by France Grilles, Grid'5000, Groupe Calcul and GDR ASR.

-- Grid'5000 Team 15:11, 25 October 2015 (UTC)



See also older news.