Grid'5000: Adding support for experimentation from core-network to edge devices

The following describes part of the SLICES-FR/Grid'5000 technical team's projects at the Strasbourg and Toulouse sites of Grid'5000.

It summarizes the work in progress to enable new network experiments spanning from core networks to edge devices.

Core network topology emulation

The goal is to let users design interconnection topologies between core-network experimentation switches and server nodes.

Thanks to an infrastructure ("mesh") switch, users will be able to create on-demand L2 links between experimentation switches (named "engelbourg") and server nodes (named "fleckenstein").

The figure below shows, on the left, an example of an emulated network topology and, on the right, the associated real network topology.

G5k-strasbourg-network-ex3-2.png real network topology

Users will be able to fully reinstall/reconfigure all experimental pieces of equipment in the 2 green boxes (fleckenstein server nodes and engelbourg network switch nodes) thanks to Kadeploy.

Users will be able to configure VLANs on the "mesh" network switch (pink box) thanks to KaVLAN, thus creating the L2 links between the pieces of experimental equipment.
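To make the idea of on-demand L2 links more concrete, here is a minimal sketch in Python of how an emulated topology could be turned into a VLAN plan for the "mesh" switch, assuming one dedicated VLAN per point-to-point link. It is not the KaVLAN API: the function name `assign_vlans`, the node and interface names, and the starting VLAN ID are all hypothetical and serve only as an illustration.

```python
from itertools import count


def assign_vlans(links, first_vlan_id=100):
    """Give each emulated point-to-point link its own VLAN ID.

    links: iterable of (endpoint_a, endpoint_b) pairs, where each endpoint
    is a (node, interface) tuple cabled to the mesh switch.
    Returns a dict {vlan_id: (endpoint_a, endpoint_b)}.
    """
    vlan_ids = count(first_vlan_id)
    used = set()
    plan = {}
    for a, b in links:
        for endpoint in (a, b):
            # An untagged interface can only belong to one link (one VLAN).
            if endpoint in used:
                raise ValueError(f"interface used by two links: {endpoint}")
            used.add(endpoint)
        plan[next(vlan_ids)] = (a, b)
    return plan


# Hypothetical emulated topology: server - switch - switch - server.
links = [
    (("fleckenstein-1", "eth1"), ("engelbourg-1", "port1")),
    (("engelbourg-1", "port2"), ("engelbourg-2", "port1")),
    (("engelbourg-2", "port2"), ("fleckenstein-2", "eth1")),
]

plan = assign_vlans(links)
for vlan_id, (a, b) in plan.items():
    print(f"VLAN {vlan_id}: {a[0]}/{a[1]} <-> {b[0]}/{b[1]}")
```

Placing the two endpoints of a link alone in the same VLAN makes the mesh switch behave like a direct cable between them, which is why the sketch rejects any interface that appears in more than one link.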

A tool such as Distem may be installed on the servers to emulate latency and failures.

Edge to cloud network emulation

Experiments may also involve edge devices such as the Jetson nodes of the estats cluster of Grid'5000, for instance to study the edge-to-cloud continuum (e.g., federated learning).

The interconnection between nodes may be emulated by adding network latency and failures between the network and edge equipment.

As above, this will be achievable by using KaVLAN to insert into the network topology nodes in charge of adding latency and failures between the edge devices (which in reality are all in the same datacenter, as pictured below).
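As an illustration of what such an intermediate node could do, the sketch below builds a Linux `tc` command line using the `netem` queueing discipline to add delay, jitter and packet loss on one interface. This is an assumption for the sake of example: the text above does not say which mechanism will be used, and this is not Distem's interface. The helper name `netem_command`, the interface name `eth1` and the numeric values are hypothetical.

```python
def netem_command(interface, delay_ms=0, jitter_ms=0, loss_percent=0):
    """Build a `tc qdisc` argument list that applies netem to an interface.

    delay_ms:     fixed one-way delay added to outgoing packets
    jitter_ms:    random variation around the delay (ignored if delay is 0)
    loss_percent: share of outgoing packets dropped; 100 emulates a dead link
    """
    if delay_ms < 0 or jitter_ms < 0:
        raise ValueError("delay and jitter must be non-negative")
    if not 0 <= loss_percent <= 100:
        raise ValueError("loss_percent must be between 0 and 100")

    # `replace` installs the qdisc, or updates it if one is already set.
    args = ["tc", "qdisc", "replace", "dev", interface, "root", "netem"]
    if delay_ms:
        args += ["delay", f"{delay_ms}ms"]
        if jitter_ms:
            args.append(f"{jitter_ms}ms")
    if loss_percent:
        args += ["loss", f"{loss_percent}%"]
    return args


# Hypothetical edge link: 40 ms delay, 5 ms jitter, 1% packet loss.
command = netem_command("eth1", delay_ms=40, jitter_ms=5, loss_percent=1)
print(" ".join(command))
```

The function only builds the command. Running it (for example with `subprocess.run(command, check=True)`) requires root privileges on the node that sits between the edge devices.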

G5k-toulouse-estats.jpg

Instrumentation of the Network Operating Systems (NOS)

Giving users full control over the operating system of network equipment is a must-have feature for network experiments.

Therefore, the project's goal is to let users change the Network Operating System of switches, just as is already possible on experimentation servers.

Thanks to the arrival of "white box switches" on the market, the following assumptions can be made:

The conclusion is that a cluster of white-box switches can be managed like a cluster of servers from the operating system and hardware perspective (apart from the network cabling of course, discussed above).

In particular, this means that while ONIE is nowadays the standard mechanism for installing a NOS, it can be replaced by other server installation mechanisms that allow more flexibility for experiments (frequent reinstallation, customization, configuration traceability).

Grid'5000 uses the powerful Kadeploy software to fulfil that role on any hardware of the platform. Work is in progress to allow users to "kadeploy" off-the-shelf NOS images in the ONIE format, thanks to one of the two options:

In conclusion, the engelbourg cluster of experimental switches will be managed just like the other Grid'5000 clusters of servers, benefiting from all the powerful features provided by the Grid'5000 stack (resource reservation, bare-metal parallel deployment tool, recipe-based OS image builder, hardware reference repository, hardware sanity certification, system regression tests, and so on).

Current status