- 1 A new version of tgz-g5k has been released
- 2 Enabling GPU level resource reservation in OAR
- 3 Network-level federation available between Grid'5000 and Fed4FIRE (and beyond)
- 4 Grid'5000 links to the Internet upgraded to 10Gbps
- 5 TILECS Workshop -- all presentations available
- 6 Several engineer and post-doc positions to work on Grid'5000 and Fed4FIRE
- 7 OAR properties upgrade for jobs using GPUs
- 8 TILECS workshop -- Towards an Infrastructure for Large-Scale Experimental Computer Science
- 9 More than 2000 documents in Grid'5000's HAL collection!
- 10 Availability of scientific and HPC software using "module" command
- 11 Pierre Neyron awarded Médaille de Cristal du CNRS
- 12 Disk reservation feature update: more clusters and usable from the standard environment (sudo-g5k)
- 13 New group storage service in beta testing, to replace storage5k
- 14 Support of Jumbo frames now available
- 15 Update about Kwapi status
- 16 Grid'5000 is now part of the Fed4FIRE testbeds federation
- 17 New clusters available in Grenoble
- 18 New environments available for testing: CentOS 7, Ubuntu 18.04, Debian testing
- 19 New clusters available in Lille
- 20 100 Gbps Omni-Path network now available
- 21 Change in user's home access
- 22 New cluster in Nantes: ecotype
- 23 New production cluster in Nancy: grvingt
- 24 Second interface on Lille's clusters
- 25 Updated Hardware and Network description
- 26 Debian 9 environments now use predictable network interface names
- 27 Spring cleanup: removal of wheezy environments, and of HTTP proxies
- 28 Looking back at the first Grid'5000-FIT school
- 29 1st Grid'5000-FIT school: call for participation
- 30 Debian 9 environments updated, with Meltdown patches
- 31 1st Grid'5000-FIT school: save the date, call for presentations and practical sessions
- 32 Disk reservation feature available on chifflet and parasilo
- 33 Standard environments upgraded to Debian 9 "stretch"
- 34 Debian 9 "stretch" environments now available for deployments
- 35 Better Docker support on Grid'5000
- 36 5th network interface on grisou cluster, using a white box switch
- 37 Persistent Virtual Machine for Grid'5000 users
- 38 Grid'5000 Usage Policy updated
- 39 New production cluster in Nancy (grele)
- 40 Engineer positions available on Grid'5000 !
- 41 Grid'5000 software list updated
- 42 Grid'5000 services portfolio now available
- 43 Nvidia GTX 1080Ti GPUs now available in Lille
- 44 Connect to Grid'5000 with your browser
- 45 Debian Stretch minimal image available in beta
- 46 New "OSIRIM" storage space available to Grid'5000 users
- 47 The new disk reservation feature is available for beta testing
- 48 3rd Network Interface on paranoia cluster
- 49 Changing job walltime
- 50 New cluster "nova" available at Lyon
- 51 Grid'5000 events in your calendaring application
- 52 New clusters available in Lille
- 53 Standard environment updated (v2017011315)
- 54 Grid'5000 News now available as RSS
- 55 Deployments API changes
- 56 Grid'5000 winter school now finished
- 57 10 years later, the Grid'5000 school returns to Grenoble
- 58 Grid'5000 co-organizing the SUCCES 2015 event
A new version of tgz-g5k has been released
We have released a new version of tgz-g5k. tgz-g5k is a tool that allows you to extract a Grid'5000 environment tarball from a running node. The tarball can then be used by Kadeploy to redeploy the image on different nodes or reservations (see Advanced Kadeploy for more details).
The new version has two major improvements:
- tgz-g5k is now compatible with Ubuntu and CentOS
- tgz-g5k is directly usable on frontends (you do not need to use it through ssh anymore).
To use tgz-g5k from a frontend, you can execute the following command:
frontend$ tgz-g5k -m MY_NODE -f ~/MY_TARBALL.tgz
In case of specific or non-deployed environments:
- tgz-g5k can use a specific user id to access nodes, by using the parameter -u (by default tgz-g5k accesses nodes as root)
- tgz-g5k can access nodes with oarsh/oarcp instead of ssh/scp, by using the parameter -o (by default tgz-g5k uses ssh/scp)
Note that tgz-g5k is still compatible with the previous command line. For the record, you previously had to use the following command:
frontend$ ssh root@MY_NODE tgz-g5k > ~/MY_TARBALL.tgz
-- Grid'5000 Team 15:00, 07 August 2019 (CET)
Enabling GPU level resource reservation in OAR
GPU-level resource reservation in OAR is now in service on Grid'5000. OAR now allows one to reserve one (or several) of the GPUs of a server hosting many GPUs, leaving the other GPUs of the server available for other jobs.
Only the reserved GPUs will be available for computing in the job, as reported by the nvidia-smi command for instance.
- To reserve one GPU in a site, one can now run: $ oarsub -l gpu=1 ...
- To reserve 2 GPUs on a host which possibly has more than 2 GPUs, one can run: $ oarsub -l host=1/gpu=2 ...
- To reserve whole nodes (servers) with all GPUs, one can still run: $ oarsub -l host=3 -p "gpu_count > 0"
- To reserve specific GPU models, one can use the "gpu_model" property in a filter: $ oarsub -l gpu=1 -p "gpu_model = 'Tesla V100'"
- One can also filter on the cluster name after looking at the hardware pages for the description of the clusters: $ oarsub -l gpu=1 -p "cluster = 'chifflet'"
Finally, please note that the drawgantts offer options to display GPUs.
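As a quick check from inside a job, one can count the GPUs that nvidia-smi reports and compare the result with what was reserved. The sketch below is only an illustration: the helper name count_visible_gpus is hypothetical, and it assumes the usual one-line-per-GPU output of "nvidia-smi -L" (lines starting with "GPU 0:", "GPU 1:", and so on).

```shell
# Hypothetical helper: count the GPUs listed by `nvidia-smi -L`.
# It reads the listing on stdin, so it can be tried without a GPU.
# Assumption: each visible GPU is printed on its own line as "GPU <index>: ...".
count_visible_gpus() {
    grep -c '^GPU [0-9]'
}

# Inside a job reserved with, for example, `oarsub -l host=1/gpu=2 ...`:
#   nvidia-smi -L | count_visible_gpus
```

If the count differs from the number of GPUs requested with -l, the reservation is probably not the one you intended.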
-- Grid'5000 Team 17:00, 10 July 2019 (CET)
Network-level federation available between Grid'5000 and Fed4FIRE (and beyond)
It is now possible to connect Grid'5000 resources to other testbeds from Fed4FIRE. This is implemented by on-demand "stitching" between KaVLAN VLANs and VLANs provided by RENATER and GEANT that connect us to a Software Defined Exchange (SDX) hosted by IMEC in Belgium. This SDX is also connected to other SDXes in the US, which should make it possible to combine resources from Grid'5000 and US testbeds such as GENI, Chameleon or CloudLab.
For more information, see https://www.grid5000.fr/w/Fed4FIRE_VLAN_Stitching
-- Grid'5000 Team 15:00, 9 July 2019 (CET)
Grid'5000 links to the Internet upgraded to 10Gbps
Thanks to RENATER, which provides Grid'5000 network connectivity, we have just upgraded our connection to the Internet to 10Gbps.
You should experience faster downloads and an increased speed when loading your data on the platform!
-- Grid'5000 Team 08:00, 9 July 2019 (CET)
TILECS Workshop -- all presentations available
The TILECS workshop (Towards an Infrastructure for Large-Scale Experimental Computer Science) was held last week in Grenoble. About 90 participants gathered to discuss the future of experimental infrastructures for Computer Science.
The slides for all presentations are now available on the workshop website: https://www.silecs.net/tilecs-2019/programme/
-- Grid'5000 Team 10:00, 8 July 2019 (CET)
Several engineer and post-doc positions to work on Grid'5000 and Fed4FIRE
We are hiring! Several positions are open.
Two engineer positions:
Those two positions can be located in Grenoble, Lille, Lyon, Nancy, Nantes, Rennes or Sophia Antipolis. Both freshly graduated engineers and experienced engineers are welcome (salary will depend on profile).
We are also hiring a post-doc in Nancy, in the context of Grid'5000 and the Fed4FIRE testbeds federation, on experimentation and reproducible research (see job offer).
To apply for any of those positions, use the Inria website or contact email@example.com directly with a cover letter and curriculum vitae.
-- Grid'5000 Team 10:00, 28 June 2019 (CET)
OAR properties upgrade for jobs using GPUs
In order to allow GPU level resource reservations in OAR, we have to adapt some OAR properties.
Currently, there are two GPU-related properties that one can use in OAR: gpu and gpu_count (see: https://www.grid5000.fr/w/OAR_Properties#gpu ).
For instance, they allow one to reserve 3 hosts, each equipped with 2 GTX 1080 Ti GPUs: $ oarsub -p "gpu='GTX 1080 Ti' and gpu_count='2'" -l host=3 ...
The first required change is to rename the 'gpu' property to 'gpu_model'. The new oarsub command will hence become: $ oarsub -p "gpu_model='GTX 1080 Ti' and gpu_count='2'" -l host=3 ...
(the gpu property will later be repurposed to give the identifier of the GPU resource instead of the model)
Please make sure to update your scripts according to this change.
A maintenance operation is scheduled on 2019-06-13 between 10:00 and 12:00 to make this change; during that window, all of Grid'5000 will be unavailable for reservation.
In a second stage, during a future maintenance, we will add new OAR properties allowing GPU-level resource reservations.
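To help update existing scripts, the rename can be applied mechanically. The following is a minimal sketch, not an official migration tool: the function name migrate_gpu_property is hypothetical, and it assumes the old property only appears as "gpu" followed by "=" inside -p filters. It leaves gpu_count and gpu_model untouched; review the result before using it.

```shell
# Hypothetical filter: rename the OAR property "gpu" to "gpu_model".
# "gpu" is only rewritten when it is a whole word directly followed by "=",
# so "gpu_count=..." and an already migrated "gpu_model=..." are left alone.
# Caveat: it would also rewrite a resource request such as "-l gpu=1",
# so apply it only to scripts written before GPU-level reservation existed.
migrate_gpu_property() {
    sed -E 's/(^|[^A-Za-z0-9_])gpu([[:space:]]*=)/\1gpu_model\2/g'
}

# Usage (file names are placeholders):
#   migrate_gpu_property < submit.sh > submit.new.sh && diff submit.sh submit.new.sh
```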
-- Grid'5000 Team 10:00, 12 June 2019 (CET)
TILECS workshop -- Towards an Infrastructure for Large-Scale Experimental Computer Science
Grid'5000 has joined forces with the FIT community to build SILECS, an unprecedented tool to address challenges from today's digital sciences.
The TILECS workshop is an event aiming at getting a better understanding of (1) experimental needs in the targeted fields; (2) existing testbeds. The end goal is to define a roadmap and priorities for the SILECS project.
-- Grid'5000 Team 10:00, 27 May 2019 (CET)
More than 2000 documents in Grid'5000's HAL collection!
Grid'5000 uses a collection in the HAL Open Archive to track publications that benefited from the testbed. This collection now includes more than 2000 documents: 1396 publications (journal and conference articles, book chapters, etc.), 268 PhD theses, 44 HDRs, and 302 unpublished documents such as research reports!
-- Grid'5000 Team 10:00, 13 May 2019 (CET)
Availability of scientific and HPC software using "module" command
Additional scientific and HPC-related software is now available on Grid'5000 nodes and frontends, using the "module" command. For instance, the newest versions of GCC, CUDA and OpenMPI are available this way.
To get the list of available software, use:
module avail
To load a specific software into your environment, use:
module load <software_name>
Please note that the module command modifies some of your environment variables, such as $PATH (internally, we use Spack to build those modules).
Please also note that some software requires access to a licence server.
More information at: https://www.grid5000.fr/w/Software_using_modules
-- Grid'5000 Team 17:00, 6 May 2019 (CET)
Pierre Neyron awarded Médaille de Cristal du CNRS
Pierre Neyron, who has been a member of the Grid'5000 technical staff since the early years of the project, has been awarded the "Médaille de Cristal du CNRS".
The Médaille de Cristal recognizes engineers, technicians and administrative staff who, through their creativity, technical mastery and sense of innovation, contribute alongside researchers to the advancement of knowledge and to the excellence of French research.
-- Grid'5000 Team 13:00, 16 April 2019 (CET)
Disk reservation feature update: more clusters and usable from the standard environment (sudo-g5k)
The disk reservation feature, which allows one to reserve node-local disks, is now available on more clusters.
See the Disk reservation tutorial to learn how to benefit from this feature.
Also, please note that reserved disks can now be used with sudo-g5k in the standard environment (this does not require a deploy job).
-- Grid'5000 Team 13:00, 20 March 2019 (CET)
New group storage service in beta testing, to replace storage5k
In order to provide large, persistent and shareable storage among a group of users, we are introducing a new storage service. This service is now available for beta testing at Lille: if you need to store large amounts of data in Grid'5000, we recommend that you try it!
See https://www.grid5000.fr/w/Group_Storage for details.
The service will be extended to some other sites after the beta testing phase.
This service aims to replace storage5k, and existing storage5k services will be shut down. In Nancy and Luxembourg, the storage5k service will be retired on 2019-03-19, to free space in the server room for new machines. Other storage5k servers (on other sites) will be shut down after the end of the beta phase of the new service.
If you are currently using Storage5k, your best options are:
- move data to your home directory (after requesting a disk quota extension if needed)
- move data to OSIRIM
- move data to the group storage service in Lille
See https://www.grid5000.fr/w/Storage for an overview of our storage options. (Note that we currently have two open issues with quota extensions and OSIRIM. If you submit a quota extension request but do not receive a reply, or if you cannot access OSIRIM, please contact firstname.lastname@example.org.)
-- Grid'5000 Team 15:00, 14 March 2019 (CET)
Support of Jumbo frames now available
We are pleased to announce that all network equipment of Grid'5000 is now configured to support Jumbo frames (that is, large Ethernet frames). We support an MTU (Maximum Transmission Unit) of 9000 bytes everywhere (including between Grid'5000 sites).
The reference and standard environments are still configured with a default MTU of 1500, but you can change the configuration (ip link set dev <device> mtu 9000) if needed. The same MTU value works from inside KaVLAN networks.
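One common way to check that a 9000-byte MTU really works end to end is to send pings that are not allowed to be fragmented. The sketch below only prints the commands rather than running them; the function names, the interface name and the peer host name are placeholders, and it assumes IPv4 and the Linux iputils ping (-M do forbids fragmentation, -s sets the ICMP payload size).

```shell
# Largest ICMP echo payload that fits in a single frame for a given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Print the commands to run as root on a node (nothing is executed here).
# $1: network device, $2: peer to ping, $3: MTU (defaults to 9000).
jumbo_check_commands() {
    dev=$1
    peer=$2
    mtu=${3:-9000}
    echo "ip link set dev $dev mtu $mtu"
    echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$mtu") $peer"
}

# Example with placeholder names:
#   jumbo_check_commands eno1 some-other-node
```

Both ends need the larger MTU: if the peer is still at 1500, the large pings will fail even though the network supports Jumbo frames.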
-- Grid'5000 Team 16:10, 06 March 2019 (CET)
Update about Kwapi status
For some time now, there have been several issues with Kwapi monitoring of energy consumption and network traffic on Grid'5000.
After investigating, we took the following actions to fix the problems:
- Network monitoring has been disabled in Kwapi (servers were overloaded)
- Code has been optimized and many bugs have been fixed.
- Reliability problems with measurements made on some Power Distribution Units (PDUs) have been identified, and Kwapi has been disabled on clusters where these problems are too severe.
- The time resolution of some PDUs has been updated in the reference API
Some details about PDU measurements issues can be found here: https://www.grid5000.fr/mediawiki/index.php/Power_Monitoring_Devices#measurement_artifacts_and_pitfalls
The current status of Kwapi energy monitoring can be checked here: https://intranet.grid5000.fr/jenkins-status/?job=test_kwapi For every cluster marked "green", Kwapi can be considered to be functional.
Clusters where dedicated monitoring devices are available (Lyon and Grenoble) are still fully functional.
-- Grid'5000 Team 10:30, 29 January 2019 (CET)
Grid'5000 is now part of the Fed4FIRE testbeds federation
In the context of the Fed4FIRE+ H2020 project, Grid'5000 joined the Fed4FIRE federation and is now listed as one of its testbeds.
There is still ongoing work in order to allow the use of Grid'5000 resources using Fed4FIRE API and tools.
For more information, see:
-- Grid'5000 Team 10:30, 10 January 2019 (CET)
New clusters available in Grenoble
We are pleased to announce that the 2 new clusters in Grenoble are now fully operational:
- dahu: 32x Dell PE C6420, 2 x Intel Xeon Gold 6130 (Skylake, 2.10GHz, 16 cores), 192 GiB RAM, 240+480 GB SSD + 4.0 TB HDD, 10 Gbps Ethernet + 100 Gbps Omni-Path
- yeti: 4x Dell PE R940, 4 x Intel Xeon Gold 6130 (Skylake, 2.10GHz, 16 cores), 768 GiB RAM, 480 GB SSD + 2x 1.6 TB NVME SSD + 3x 2.0 TB HDD, 10 Gbps Ethernet + 100 Gbps Omni-Path
These nodes share an Omni-Path network with 40 additional "dahu" nodes operated by GRICAD (Mésocentre HPC of Univ. Grenoble-Alpes), and are equipped with high-frequency power monitoring devices (same wattmeters as in Lyon). This equipment was mainly funded by the LECO CPER (FEDER, Région Auvergne-Rhône-Alpes, DRRT, Inria), and the COMUE Univ. Grenoble-Alpes.
-- Grid'5000 Team 15:00, 20 December 2018 (CET)
New environments available for testing: CentOS 7, Ubuntu 18.04, Debian testing
Three new environments are now available, and registered on all sites with Kadeploy: CentOS 7 (centos7-x64-min), Ubuntu 18.04 (ubuntu1804-x64-min), Debian testing (debiantesting-x64-min).
They are in a beta state: at this point, we welcome feedback from users about those environments (issues, missing features, etc.). We also welcome feedback on other environments that would be useful for your experiments, and that are currently missing from the set of environments provided on Grid'5000.
Those environments are built with Kameleon <http://kameleon.imag.fr/>. Recipes are available in the 'newrecipes' branch of the environments-recipes git repository: https://github.com/grid5000/environments-recipes/tree/newrecipes
Our testing indicates that those environments work fine on all clusters, except in the following cases:
- Debian testing on granduc (Luxembourg) and sagittaire (Lyon): deployment fails due to lack of entropy after boot (related to the fix for CVE-2018-1108)
- Debian testing on chifflot (Lille): deployment fails because the predictable naming of network interfaces changed
-- Grid'5000 Team 15:28, 19 December 2018 (CET)
New clusters available in Lille
We have the pleasure to announce that 2 new clusters are available in Lille:
- chifflot: 8 Dell PE R740 nodes with 2 x Intel Xeon Gold 6126 12C/24T, 192GB DDR4, 2 x 447 GB SSD + 4 x 3.639 TB SAS, including:
  - chifflot-[1-6] nodes with 2 Nvidia P100
  - chifflot-[7-8] nodes with 2 Nvidia V100
- chiclet: 8 Dell PE R7425 nodes with 2 x AMD EPYC 7301 16C/32T, 128GB DDR4
These nodes are connected with 25Gb Ethernet to a Cisco Nexus9000 switch (reference: 93180YC-EX).
The extension of the hardware equipment at the Lille site of Grid'5000 is part of the Data (Advanced data science and technologies) CPER project carried by Inria, with the support of the regional council of Hauts-de-France, FEDER and the State.
-- Grid'5000 Team 16:40, 12 November 2018 (CET)
100 Gbps Omni-Path network now available
Omni-Path networking is now available on Nancy's grele and grimani clusters.
On the software side, support is provided in Grid'5000 environments using packages from the Scibian distribution, a Debian-based distribution for high-performance computing started by EDF.
OpenMPI automatically detects and uses Omni-Path when available. To learn more about how to use it, refer to the Run MPI on Grid'5000 tutorial.
More new clusters with Omni-Path networks are in the final stages of installation. Stay tuned for updates!
-- Grid'5000 Team 15:10, 25 September 2018 (CET)
Change in user's home access
During this week, we will change the access policy of home directories in order to improve the security and the privacy of data stored in your home directory.
This change should be transparent to most users, as:
- You will still have access to your local user home from reserved nodes when using the standard environment or when deploying custom environments such as 'nfs' or 'big'.
- You can still access every user's home from frontends (provided the access permissions allow it).
However, in other situations (giving other users access to your home, mounting a home directory from another site, from inside a virtual machine or from a VLAN, ...), you will need to explicitly allow access using the new Storage Manager API. See https://www.grid5000.fr/w/Storage_Manager
Note that this change requires a switch to autofs to mount /home in user environments. This should be transparent in most cases because g5k-postinstall (used by all reference environments) has been modified. However, if you use an old environment, it might require a change either to switch to g5k-postinstall, or to switch to autofs. Contact us if needed.
-- Grid'5000 Team 14:40, 4 September 2018 (CET)
New cluster in Nantes: ecotype
We are pleased to announce a new cluster called "ecotype", hosted at the IMT Atlantique campus located in Nantes. It features 48 Dell PowerEdge R630 nodes with 2 Intel Xeon E5-2630L v4, 10C/20T, 128GB DDR4, 372GB SSD and 10Gbps Ethernet.
This cluster has been funded by the CPER SeDuCe (Regional council of the Pays de la Loire, Nantes Metropole, Inria Rennes - Bretagne Atlantique, IMT Atlantique and the French government).
-- Grid'5000 Team 16:30, 29 August 2018 (CET)
New production cluster in Nancy: grvingt
We are happy to announce that a new cluster with 64 compute nodes and 2048 cores is ready in the "production" queue in Nancy!
This is the first Grid'5000 cluster based on the latest generation of Intel CPUs (Skylake).
Each node has:
- two Intel Xeon Gold 6130 (Skylake, 2.10GHz, 2 CPUs/node, 16 cores/CPU)
- 192GB of RAM
- 1TB HDD
- one 10Gbps Ethernet interface
All nodes are also connected with an Intel Omni-Path 100Gbps network (non-blocking).
This new cluster, named "grvingt", is funded by CPER CyberEntreprises (FEDER, Région Grand Est, DRRT, INRIA, CNRS).
As a reminder, the specific rules for the "production" queue are listed on https://www.grid5000.fr/w/Grid5000:UsagePolicy#Rules_for_the_production_queue
 Important note regarding pronunciation: despite the Corsican origin, you should pronounce the trailing T as this is how "vingt" is pronounced in Lorraine.
-- Grid'5000 Team 17:30, 25 July 2018 (CET)
Second interface on Lille's clusters
A second 10Gbps interface is now connected on the chetemi and chifflet clusters. Those interfaces are connected to the same switch as the first. KaVLAN is also available.
-- Grid'5000 Team 16:00, 23 July 2018 (CET)
Updated Hardware and Network description
A detailed description of all Grid'5000 resources is now available on the Grid'5000 wiki, and generated on a regular basis from the Reference API, so that it stays up-to-date. (As you probably know, the Grid'5000 reference API provides a full description of Grid'5000 as JSON documents). Check out those pages:
- Global hardware overview : https://www.grid5000.fr/w/Hardware
- Sites hardware pages :
- Sites network pages :
-- Grid'5000 Team 09:00, 2 July 2018 (CET)
Debian 9 environments now use predictable network interface names
All our Debian 9 environments have now been modified to use predictable network interface names (eno1, ens1, enp2s0, etc. instead of the traditional eth0, eth1, etc.), which is now the default on Debian and most other distributions. You can read more about this standard on https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames .
Because Grid'5000 nodes use different hardware in each cluster, the default network interface name varies, depending on whether it is included on the motherboard or an external NIC plugged in a PCI Express slot.
The mapping between old and new names is available on each site's Hardware page, such as https://www.grid5000.fr/mediawiki/index.php/Nancy:Hardware .
Some useful commands:
- ip link show up : show network interfaces that are up
- ip -o link show up : show network interfaces that are up (one-line format)
- ip addr show up : show network addresses for interfaces that are up
- ip -4 addr show up : show network addresses for interfaces that are up (IPv4 only)
- ip -o -4 addr show up : show network addresses for interfaces that are up (IPv4 only, one-line format)
If you want to stick with the old behaviour, you can either:
- use older versions of the environments, available in /grid5000
- change the environment description (kadeploy .dsc file) and add "--net traditional-names" to the call to g5k-postinstall.
Typically, that would mean:
- Get the current environment description:
kaenv3 -p debian9-x64-min > myenv.dsc
- edit myenv.dsc, find the "script:" line, and add "--net traditional-names" to make it, for example:
script: g5k-postinstall --net debian --net traditional-names
- deploy using: kadeploy3 -a myenv.dsc -m mynode
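The manual edit in the steps above can also be scripted. This is a minimal sketch under stated assumptions: the function name add_traditional_names and the output file name are hypothetical, and it assumes the description contains a single "script:" line that calls g5k-postinstall, as in the example above.

```shell
# Hypothetical helper: print an environment description with
# "--net traditional-names" appended to the g5k-postinstall "script:" line.
# Lines that already carry the option are left unchanged.
# $1: path to the description obtained with kaenv3 -p.
add_traditional_names() {
    sed '/script:.*g5k-postinstall/{/--net traditional-names/!s/$/ --net traditional-names/;}' "$1"
}

# Usage on a frontend (commands taken from the steps above):
#   kaenv3 -p debian9-x64-min > myenv.dsc
#   add_traditional_names myenv.dsc > myenv-traditional.dsc
#   kadeploy3 -a myenv-traditional.dsc -m mynode
```

Writing to a second file keeps the original description available for comparison.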
-- Grid'5000 Team 10:00, 17 May 2018 (CET)
Spring cleanup: removal of wheezy environments, and of HTTP proxies
We recently removed some old cruft.
First, the Debian 7 "wheezy" environments were removed from the Kadeploy database. They remain available on each frontend under /grid5000 for the time being. As a reminder, older environments are archived and still available: see /grid5000/README.unmaintained-envs for details.
Second, we finally removed the HTTP proxies that were used in the past to access the outside world from Grid'5000. This removal might cause problems if you use old environments that still rely on HTTP proxies. However, you can simply drop their use.
An upcoming change is the switch to predictable names for network interfaces in our Debian 9 environments. More information on this will follow when it is deployed.
-- Grid'5000 Team 11:00, 26 April 2018 (CET)
Looking back at the first Grid'5000-FIT school
The first Grid’5000-FIT school was held in Sophia Antipolis, France from April 3rd to April 6th, 2018.
During three full days, more than 90 researchers (among them 30 PhD and 8 master students) studied jointly the two experimental research infrastructures Grid’5000 and FIT.
In addition to high-profile invited speakers and more focused presentations for testbed users, the program included 30 hours of parallel hands-on sessions. As such, introductory and advanced lab sessions on Grid’5000, FIT IoT-Lab, FIT CorteXlab and FIT R2lab were organized by the testbed developers, each attracting dozens of attendees. See http://www.silecs.net/1st-grid5000-fit-school/program/ for more details on the presentations and on the hands-on sessions.
On the last day, a hackathon involving several ad-hoc teams of participants was organized. The hackathon aimed to use heterogeneous testbed resources as a substrate for building an IoT application, from the collection of IoT-originating data to a cloud-based dashboard.
This event is the first public milestone towards the SILECS project that aims at bringing together both infrastructures. See http://www.silecs.net/ for more details on SILECS.
-- Grid'5000 Team 11:00, 12 April 2018 (CET)
1st Grid'5000-FIT school: call for participation
As previously announced, the first Grid'5000-FIT school will be held in Sophia Antipolis, France from April 3rd to April 6th, 2018. This event is a first public milestone in the SILECS project, that aims at bringing together both infrastructures.
The program is now available, and includes a great set of invited talks, of talks from Grid'5000 and FIT users, and tutorials on experimenting using those infrastructures.
Registration is free but mandatory. Please register!
-- Grid'5000 Team 14:00, 9 March 2018 (CET)
Debian 9 environments updated, with Meltdown patches
As you probably know, a series of vulnerabilities has recently been discovered in x86 processors[1,2]. This update of our environments is the first to include a mitigation patch for Meltdown. However, this patch causes a significant performance penalty for syscall-intensive workloads (e.g. performing a lot of I/O). We decided to follow Linux kernel defaults, and activate this patch by default, as it seems that we will all have to live with it for the foreseeable future.
However, if you want to avoid this performance penalty for your experiments, you have two options:
- Use version 2017122808 of our environments (which does not include the updated kernel)
- Disable Kernel Page Table Isolation, by adding pti=off to the kernel command line. To do that, after deployment, edit /etc/default/grub, add pti=off to GRUB_CMDLINE_LINUX_DEFAULT, run update-grub, and reboot.
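The grub edit described above can be scripted as follows. This is only a sketch: the function name add_pti_off is hypothetical, it relies on GNU sed (-i), and it assumes the GRUB_CMDLINE_LINUX_DEFAULT value is enclosed in double quotes, as in Debian's default /etc/default/grub.

```shell
# Hypothetical helper: append pti=off to GRUB_CMDLINE_LINUX_DEFAULT.
# The line is left alone if it already contains pti=off, so running the
# function twice is harmless.
# $1: path to the grub defaults file (normally /etc/default/grub).
add_pti_off() {
    sed -i '/^GRUB_CMDLINE_LINUX_DEFAULT=/{/pti=off/!s/"$/ pti=off"/;}' "$1"
}

# On a deployed node, as root:
#   add_pti_off /etc/default/grub && update-grub && reboot
# After the reboot, pti=off should appear in /proc/cmdline.
```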
Many evaluations of the performance penalty are available online[3,4].
Here is an example, measured on the grele cluster, using 'time dd if=/dev/zero of=/dev/zero bs=1 count=20000000', which is very close to a worst-case workload. The results are for one run, but they are fairly stable over time.
With environment version 2017122808:
# uname -srv
Linux 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3+deb9u1 (2017-12-23)
# dpkg -l |grep linux-image
ii linux-image-4.9.0-4-amd64 4.9.65-3+deb9u1 amd64 Linux 4.9 for 64-bit PCs
ii linux-image-amd64 4.9+80+deb9u2 amd64 Linux for 64-bit PCs (meta-package)
# time dd if=/dev/zero of=/dev/zero bs=1 count=20000000
20000000 bytes (20 MB, 19 MiB) copied, 4.20149 s, 4.8 MB/s
real 0m4.203s
user 0m1.168s
sys 0m3.032s
With environment 2018020812, PTI enabled (default):
# uname -srv
Linux 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04)
# dpkg -l |grep linux-image
ii linux-image-4.9.0-5-amd64 4.9.65-3+deb9u2 amd64 Linux 4.9 for 64-bit PCs
ii linux-image-amd64 4.9+80+deb9u3 amd64 Linux for 64-bit PCs (meta-package)
# time dd if=/dev/zero of=/dev/zero bs=1 count=20000000
20000000 bytes (20 MB, 19 MiB) copied, 10.8027 s, 1.9 MB/s
real 0m10.804s
user 0m3.948s
sys 0m6.852s
With environment 2018020812, PTI disabled in grub configuration:
# uname -srv
Linux 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04)
root@grele-1:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.9.0-5-amd64 root=UUID=e803852f-b7a8-454b-a7e8-2b9d9a4da06c ro debian-installer=en_US quiet pti=off
# dpkg -l |grep linux-image
ii linux-image-4.9.0-5-amd64 4.9.65-3+deb9u2 amd64 Linux 4.9 for 64-bit PCs
ii linux-image-amd64 4.9+80+deb9u3 amd64 Linux for 64-bit PCs (meta-package)
# time dd if=/dev/zero of=/dev/zero bs=1 count=20000000
20000000 bytes (20 MB, 19 MiB) copied, 4.10788 s, 4.9 MB/s
real 0m4.109s
user 0m1.160s
sys 0m2.948s
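To put the three runs side by side, the ratios can be computed from the dd timings reported above. The helper name slowdown is hypothetical; the numbers are the ones from the measurements in this announcement.

```shell
# Hypothetical helper: ratio of a measured time to a baseline time,
# printed with two decimals. $1: baseline in seconds, $2: measurement.
slowdown() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", b / a }'
}

# dd wall-clock times, relative to environment 2017122808 (4.20149 s):
slowdown 4.20149 10.8027   # 2018020812, PTI enabled  -> prints 2.57
slowdown 4.20149 4.10788   # 2018020812, PTI disabled -> prints 0.98
```

In this near worst-case workload, PTI makes the run about 2.6 times slower, and disabling it brings the time back to the level of the older kernel.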
1st Grid'5000-FIT school: save the date, call for presentations and practical sessions
It is our pleasure to announce the first Grid’5000-FIT school, that will be held in Sophia Antipolis, France from April 3rd to April 6th, 2018. This event is a first public milestone in the SILECS project, that aims at bringing together Grid'5000 and FIT infrastructures.
The program is being polished and will be available soon on http://www.silecs.net/1st-grid5000-fit-school/. The week of work will give priority to practical sessions where you will use the FIT and Grid’5000 platforms and learn how to conduct scientific and rigorous experiments. Moreover, to foster collaboration, we will arrange a special session where you can present your own work (see “call for presentations” below) or propose your own practical session (see “Call for practical session” below).
We would like to stress in this announcement that we hope users will consider presenting their great work during the school, including already published results. The focus of user presentations during the school is not so much on new results but on showcasing great uses of Grid’5000 and/or FIT platforms, singly or together, with a focus on how the results have been derived from the platforms usage.
Registration will be free but mandatory due to the limited number of seats.
Stay tuned for extra details on http://www.silecs.net/1st-grid5000-fit-school/.
-- Grid'5000 Team 10:00, 1 February 2018 (CET)
Disk reservation feature available on chifflet and parasilo
The disk reservation feature is now available on chifflet in Lille and parasilo in Rennes, in addition to grimoire in Nancy, which was already in beta testing phase. Disk reservation enables users to leave large datasets on nodes between experiments. See Disk reservation wiki page.
-- Grid'5000 Team 11:30, 8 January 2018 (CET)
Standard environments upgraded to Debian 9 "stretch"
All Grid'5000 nodes now use Debian 9 "stretch" as standard environment (this is the environment available by default when not deploying).
Known remaining problems at this time are:
- Lyon, orion cluster: GPUs are no longer supported by CUDA 9, which is the version now included in Debian 9 environments.
- Infiniband support: while we think we have resolved everything, we are not 100% sure at this point (it needs further testing).
- Virtual machines on the standard environment don't work anymore (but work if you deploy, of course).
Your own scripts or applications might require changes or recompilation to work on Debian 9. If you want to stick with Debian 8 for the time being, you can still manually deploy the jessie-x64-std environment.
-- Grid'5000 Team 14:30, 15 December 2017 (CET)
Debian 9 "stretch" environments now available for deployments
Kadeploy images are now available for Debian 9.
We used this opportunity to change the naming scheme, so they are named debian9-x64-min, debian9-x64-base, debian9-x64-nfs, debian9-x64-big.
Known regressions and problems are:
- The Xen environment is not available yet
- Network interfaces use the traditional naming scheme (eth0, eth1, ...) not yet the new predictable scheme
- Intel Xeon Phi (for the graphite cluster in Nancy) is not supported yet
- Infiniband network support is currently not tested
We are planning to switch the "std" environment (the one available when not deploying) in the coming weeks. You can beta-test the environment that we plan to switch to (debian9-x64-std).
Please report problems you encounter with the above environments to email@example.com.
-- Grid'5000 Team 16:30, 29 November 2017 (CET)
Better Docker support on Grid'5000
Docker support on Grid'5000 has been enhanced.
We now document a set of tools and recommendations to ease Docker usage on Grid'5000.
More details are available on the Docker page.
-- Grid'5000 Team 17:30, 28 November 2017 (CET)
5th network interface on grisou cluster, using a white box switch
A 5th network interface is now connected on grisou¹, providing a 1 Gbps link, as opposed to the first four interfaces that provide 10 Gbps links.
Those interfaces are connected to a Dell S3048 Open Networking switch. Alternative operating systems are available for this switch. While we did not make specific developments to facilitate their use, please talk with us if you are interested in getting control over the switch in order to experiment on alternative software stacks.
-- Grid'5000 Team 17:00, 8 November 2017 (CET)
Persistent Virtual Machine for Grid'5000 users
We are happy to announce the availability of persistent virtual machines in Grid'5000: each user can now get one virtual machine that will run permanently (outside of OAR reservations). We envision that this will be helpful to host the stable/persistent part of long-running experiments.
More details are available here : https://www.grid5000.fr/mediawiki/index.php/Persistent_Virtual_Machine
-- Grid'5000 Team 10:00, 7 November 2017 (CET)
Grid'5000 Usage Policy updated
The Grid'5000 Usage Policy has been updated.
In a nutshell, all usage that was previously allowed is still allowed after this update.
However, there are several changes to help increase usage of resources that are immediately available.
- Jobs lasting one hour or less, and submitted less than 10 minutes before they start, are excluded from daily quotas.
- This means that one can always reserve resources for up to one hour when they are immediately available.
- Similarly, job extensions requested less than 10 minutes before the end of the job, and for a duration of one hour or less, are also excluded from daily quotas. Those extensions can be renewed several times (always during the last 10 minutes of the job).
- This means that, when resources are still available, one can always extend jobs for up to one hour.
- Crossing the 19:00 boundary is allowed for jobs submitted at or after 17:00 the same day. The portion of those jobs from 17:00 to 19:00 is excluded from daily quotas.
- This means that if at 17:00 or later on a given day, resources are not reserved for the following night, then it is possible to reserve them and start the night job earlier.
- Crossing the 9:00 boundary is allowed for jobs submitted on the same day. But the portion of those jobs after 9:00 is still included in the daily quota.
- This means that when resources are free in the morning, people are free to start working earlier.
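As an illustration only, the first two rules above (short jobs and short extensions) can be written as a small Python check. The function names and the way times are represented are ours and do not correspond to any Grid'5000 tool; the 17:00/19:00 and 9:00 rules are not covered, and the Usage Policy itself remains the authoritative wording.

```python
from datetime import datetime, timedelta

ONE_HOUR = timedelta(hours=1)
TEN_MINUTES = timedelta(minutes=10)


def job_excluded_from_quota(submitted: datetime, start: datetime,
                            duration: timedelta) -> bool:
    """Hypothetical helper: job of one hour or less, submitted less than
    10 minutes before it starts."""
    lead_time = start - submitted
    return duration <= ONE_HOUR and timedelta(0) <= lead_time < TEN_MINUTES


def extension_excluded_from_quota(requested: datetime, job_end: datetime,
                                  extra: timedelta) -> bool:
    """Hypothetical helper: extension of one hour or less, requested during
    the last 10 minutes of the job."""
    time_left = job_end - requested
    return extra <= ONE_HOUR and timedelta(0) < time_left < TEN_MINUTES
```

Under this reading, a 45-minute job submitted 5 minutes before its start does not count against the daily quota, whereas the same job submitted 30 minutes ahead does.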
Note that with those changes, it might become harder to get resources if you do not plan at least one hour ahead.
Another change is that the usage policy now includes rules for disk reservations.
Comments about those new rules are welcome on the users mailing list.
-- Grid'5000 Team 14:00, 21 September 2017 (CET)
New production cluster in Nancy (grele)
We are glad to announce that a new cluster with 14 compute nodes is ready in the production queue in Nancy! There are 2 Nvidia GTX 1080Ti per node, providing a total of 100352 GPU cores. Each node is also equipped with:
- 2x Intel Xeon E5-2650 v4 (2.2GHz, 12C/24T)
- 128GB of RAM
- 2x 300GB SAS HDD 15K rpm
The Intel Omni-Path 100Gbps network is not available yet, but should be soon. Note that the grimani cluster will also take advantage of this high-speed, low-latency network.
This new High-End cluster, named "Grele", is jointly supported by CPER IT2MP (Innovations Technologiques, Modélisation et Médecine Personnalisée) and CPER LCHN (Langues, Connaissances & Humanités Numériques).
PS: At the moment grele-13 and grele-14 don't have GPUs, but they should arrive really soon!
PS2: As a reminder, the rules for the production queue are at https://www.grid5000.fr/mediawiki/index.php/Grid5000:UsagePolicy#Rules_for_the_production_queue
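For readers wondering where the GPU core total comes from, here is a small sketch of the arithmetic. The figure of 3584 CUDA cores per GTX 1080Ti is Nvidia's published specification for the card, not something stated in this announcement, and total_gpu_cores is just a name chosen for the example.

```python
# Nvidia's published CUDA core count for one GTX 1080Ti (assumption: not
# taken from the announcement itself).
CUDA_CORES_PER_GTX_1080TI = 3584


def total_gpu_cores(nodes: int, gpus_per_node: int,
                    cores_per_gpu: int = CUDA_CORES_PER_GTX_1080TI) -> int:
    """Total GPU cores for a cluster of identical nodes."""
    return nodes * gpus_per_node * cores_per_gpu


# All 14 grele nodes with 2 GPUs each
print(total_gpu_cores(nodes=14, gpus_per_node=2))  # 100352

# While grele-13 and grele-14 are still waiting for their GPUs
print(total_gpu_cores(nodes=12, gpus_per_node=2))  # 86016
```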
-- Grid'5000 Team 14:00, 10 July 2017 (CET)
Engineer positions available on Grid'5000 !
-- Grid'5000 Team 14:00, 5 July 2017 (CET)
Grid'5000 software list updated
The Software page of the Grid'5000 wiki received a major overhaul. It lists many pieces of software of interest to Grid'5000 users, developed by the Grid'5000 community or the Grid'5000 team.
Please don't hesitate to suggest additions to that list!
-- Grid'5000 Team 13:00, 27 June 2017 (CET)
Grid'5000 services portfolio now available
The Grid'5000 services portfolio (in French: offre de service) is now available. This document (in French only, sorry), prepared by the architects committee together with the technical team, lists all services provided by Grid'5000 to its users. Those services are ranked according to their importance, measured as the resolution delay the team is aiming for when something breaks.
This first version is a good opportunity to provide feedback about missing services, over-inflated priority, under-evaluated priority, etc. Please tell us what is important for your experiments!
-- Grid'5000 Team 14:32, 22 June 2017 (CET)
Nvidia GTX 1080Ti GPUs now available in Lille
We are glad to announce that 16 GPUs have been added to the chifflet cluster. There are 2 Nvidia GTX 1080Ti per node, providing a total of 57344 GPU cores.
This takes place in the context of the "Advanced data science and technologies" CPER project in which Inria invests, with the support of the regional council of the Hauts de France, the FEDER and the State.
-- Grid'5000 Team 16:00, 14 June 2017 (CET)
Connect to Grid'5000 with your browser
We are pleased to announce a new feature which we hope will ease the first steps on Grid'5000. You can now get a shell through a web interface at https://intranet.grid5000.fr/shell/SITE/ (for example, https://intranet.grid5000.fr/shell/lille/ for the Lille site). To connect, you will have to type in your credentials twice (first for the HTTP proxy, then for the SSH connection).
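If you want to generate the link for a given site from a script, the URL pattern above can be filled in as in the following sketch. The helper name web_shell_url is ours, and lowercasing the site name is an assumption based on the Lille example.

```python
from urllib.parse import quote

# Base address given in the announcement; the site name is appended to it.
SHELL_BASE_URL = "https://intranet.grid5000.fr/shell/"


def web_shell_url(site: str) -> str:
    """Hypothetical helper: build the web shell URL for a site."""
    return f"{SHELL_BASE_URL}{quote(site.strip().lower())}/"


print(web_shell_url("Lille"))  # https://intranet.grid5000.fr/shell/lille/
```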
-- Grid'5000 Team 11:00, 9 June 2017 (CET)
Debian Stretch minimal image available in beta
Grid'5000 now provides a minimal image of Debian Stretch, featuring Linux 4.9, in beta testing. If you use it and encounter problems, please send a report to firstname.lastname@example.org.
We are also currently working on the other variants of Stretch images (-base, -std, -xen, -big), as well as on improvements such as support for predictable network interface names. Stay tuned for more information in a few weeks.
-- Grid'5000 Team 17:00, 23 May 2017 (CET)
New "OSIRIM" storage space available to Grid'5000 users
A new remote storage space is available from frontends and nodes (in the default environment or in deployed -nfs and -big environments) under the "/srv/osirim/<your_username>" directory.
It provides 200GB of persistent storage to every user (using NFS and automount). If needed, more space can be requested by sending an e-mail to email@example.com.
This storage is kindly provided by the OSIRIM project, from the IRIT lab in Toulouse.
-- Grid'5000 Team 10:00, 15 May 2017 (CET)
The new disk reservation feature is available for beta testing
In order to leave large datasets on nodes between experiments, a disk reservation service has been developed, and is now available for beta testing on the grimoire cluster in Nancy.
See this page for more details.
The service will be extended to other clusters after the beta testing phase.
-- Grid'5000 Team 15:00, 10 May 2017 (CET)
3rd Network Interface on paranoia cluster
Users can now use the 3rd network interface on the paranoia cluster in Rennes. eth0 and eth1 provide 10G connections and eth2 a gigabit Ethernet connection¹.
As a reminder, the grisou and grimoire clusters in Nancy have four 10G interfaces.
¹ For more details, see Rennes:Network
-- Grid'5000 Team 15:00, 09 May 2017 (CET)
Changing job walltime
The latest OAR version installed in Grid'5000 now lets you request a change to the walltime of a running job (the duration of the resource reservation). Such a change is granted depending on the resources occupied by other jobs, and the resulting job characteristics still have to comply with the Grid'5000 Usage Policy¹. You may use the oarwalltime command or the Grid'5000 API to request such a change for a job or to query its status. See the Grid'5000 tutorials and API documentation for more information.
¹ Rules from the Usage Policy still apply to the modified job. We are however considering changing the Usage Policy to take the new feature into account, and would welcome ideas toward doing that. Please feel free to make suggestions on firstname.lastname@example.org.
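As a rough sketch of how the command could be driven from a script, the snippet below builds an oarwalltime command line. It assumes the invocation form "oarwalltime <job id> [<change>]", where the change is a signed duration such as +1:00 and omitting it only queries the status; please check this against the tutorials or the command's own documentation before relying on it. The job id 12345 and the helper name are made up for the example.

```python
import shlex
from typing import List, Optional


def oarwalltime_command(job_id: int, change: Optional[str] = None) -> List[str]:
    """Hypothetical helper: build the argument list for oarwalltime.

    Assumed form: oarwalltime <job id> [<change>]; without a change the
    command is expected to only report the status of the job's walltime.
    """
    command = ["oarwalltime", str(job_id)]
    if change is not None:
        command.append(change)
    return command


# Ask for one more hour on (made-up) job 12345, then query the status
print(shlex.join(oarwalltime_command(12345, "+1:00")))
print(shlex.join(oarwalltime_command(12345)))
```

On a frontend, the resulting list could be passed to subprocess.run to actually submit the request.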
-- Grid'5000 Team 10:00, 20 March 2017 (CET)
New cluster "nova" available at Lyon
We have the pleasure to announce that a new cluster called "nova" is available at Lyon. It features 23 Dell R430 nodes with 2 Intel Xeon E5-2620 v4, 8C/16T, 32GB DDR4, 2x300GB SAS and 10Gbps Ethernet. Energy monitoring is available for this cluster, provided by the same devices used for the other clusters in Lyon.
This cluster has been funded by Inria Rhône-Alpes.
-- Grid'5000 Team 10:00, 20 March 2017 (CET)
Grid'5000 events in your calendaring application
Grid'5000 events, such as maintenance operations or service disruptions, are published on the events page (https://www.grid5000.fr/status/). You can see a summary of this page on Grid'5000's homepage, and can also get the information through an RSS feed. Since last week, events are also published as an iCal feed, to which you can subscribe from most calendaring applications such as Zimbra or Thunderbird's Lightning extension.
-- Grid'5000 Team 13:00, 10 March 2017 (CET)
New clusters available in Lille
We have the pleasure to announce that new clusters are available on Lille's site :
- chetemi in reference to "Bienvenue chez les ch'tis" : 15 R630 nodes with 2 Xeon E5-2630 v4, 20C/40T, 256GB DDR4, 2x300GB SAS
- chifflet : 8 R730 nodes with 2 Xeon E5-2680 v4, 28C/56T, 768GB DDR4, 2x400GB SAS SSD, 2x4TB SATA
These nodes are connected with 10Gb Ethernet to a Cisco Nexus9000 switch (reference: 93180YC-EX)
The renewal of the Lille site takes place in the context of the CPER "Advanced data science and technologies" project in which Inria invests, with the support of the regional council of the Hauts de France, the FEDER and the State.
-- Grid'5000 Team 15:00, 02 March 2017 (CET)
Standard environment updated (v2017011315)
This week, we updated the standard environment to a new version (2017011315). It includes a ganglia module to monitor NVIDIA GPUs, so you can now watch GPU usage on the ganglia web page (for example: Ganglia graphique-1).
We also updated other variants :
- Ceph installation has been moved from [base] to [big]
- libguestfs-tools has been removed from [big]
- CUDA version has been updated to 8.0.44 [big]
- Nvidia Driver has been updated to 375.26 [big]
- linux-tools package has been installed from [big]
-- Grid'5000 Team 15:00, 16 January 2017 (CET)
Grid'5000 News now available as RSS
Because mails sent to the users mailing list cannot be read by users who join Grid'5000 afterwards, the technical team will now share Grid'5000 news using a blog-like format. An RSS link is available from the News page, at this URL.
-- Grid'5000 Team 15:00, 20 December 2016 (CET)
Deployments API changes
-- Grid'5000 Team 15:11, 25 October 2016 (CET)
Grid'5000 winter school now finished
The Grid'5000 winter school took place between February 2nd, 2016 and February 5th, 2016 in Grenoble. Two awards were given for presentation entries:
- Best presentation award to Houssem-Eddine Chihoub and Christine Collet for their work on A scalability comparison study of smart meter data management approaches (slides)
- Most promising experiment to Anna Giannakou for her work Towards Self Adaptable Security Monitoring in IaaS Clouds (slides)
-- Grid'5000 Team 15:11, 25 February 2016 (CET)
10 years later, the Grid'5000 school returns to Grenoble
10 years after the first edition of the Grid'5000 school, we invite the Grid'5000 community to take part in the Grid'5000 Winter School 2016, February 2-5, 2016.
-- Grid'5000 Team 15:11, 25 December 2015 (CET)
Grid'5000 co-organizing the SUCCES 2015 event
The SUCCES 2015 days will take place November 5-6, 2015, in Paris. Dedicated to users of grid, cloud or regional HPC centers, they are organised by France Grilles, Grid'5000, Groupe Calcul and GDR ASR.
-- Grid'5000 Team 15:11, 25 October 2015 (UTC)
See also older news.