end-to-end (node-to-node, over the backbone) 10 Gbps data transfer (Networking)
Conducted by: Lucas Nussbaum
Description: The goal of this experiment was to check whether it would be possible to make use of the 10 Gbps backbone directly from nodes.
I used two sites:
- Lyon, where some machines from the RESO team are connected to the Grid'5000 network using 10 GbE (Myrinet 10G cards connected to a Fujitsu 10G switch).
- Rennes, where the Myrinet 10G network (running in "Myrinet", not "Ethernet" mode) is connected to the site's main router via an Ethernet/Myrinet bridge (ref: 10G-SW16LC-6C2ER). So I used IP over Ethernet, emulated over MX from the node to the Myrinet switch, then 10 GbE from the switch to Rennes' router.
I used IP addresses from the 10.0.0.0/8 range (routable within Grid'5000) to configure the NICs at both ends. In Lyon, the myri10ge driver was used, while in Rennes, the proprietary Myrinet driver was used.
- the proprietary Myrinet driver has a bug that causes frames coming from the Ethernet bridge to be broadcast to all nodes. Fixed in driver version 1.2.8.
- TCP buffers need to be tuned.
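The buffer tuning mentioned above can be sketched as a sysctl fragment. This is a minimal sketch: the 64 MB maxima match what the report used, but the autotuning minimum and default values are common Linux defaults, not taken from the source.

```shell
# Generate a sysctl fragment for large TCP socket buffers.
# 64 MB maxima as in the experiment; min/default values are assumptions.
cat > /tmp/10g-tcp-tuning.conf <<'EOF'
# Maximum socket buffer sizes: 64 MB
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
# TCP autotuning limits: min, default, max (bytes)
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
EOF
# Apply with (requires root): sysctl -p /tmp/10g-tcp-tuning.conf
cat /tmp/10g-tcp-tuning.conf
```

Without such tuning, the default buffer sizes cap the TCP window well below the bandwidth-delay product of a 10 Gbps wide-area path.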
The achieved bandwidth was a bit disappointing.
Rennes->Lyon: 3.2 Gbps max
Lyon->Rennes: 4.7 Gbps max
Possible bottlenecks (not investigated yet):
- Ethernet emulation over MX -- unlikely, because node-to-node bandwidth is much higher (8.9 Gbps measured)
- Ethernet/Myrinet bridge -- we would need to stress it using local nodes
- backbone network
- TCP tuning. I configured the buffer sizes to 64 MB to be on the safe side, but didn't experiment with the various TCP congestion control algorithms. The nodes in Lyon were not deployable, so kernel version 2.6.18 was used; after installing a 2.6.26 kernel, the maximum bandwidth was about the same, but the results were more reproducible.
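Trying other congestion control algorithms, as suggested above, could look like this on Linux. This is a sketch: which algorithms are available depends on the kernel build, and the /proc paths may not be exposed on every system.

```shell
# Inspect the kernel's TCP congestion control settings (standard Linux
# /proc paths; guarded in case they are not exposed on this system).
ccfile=/proc/sys/net/ipv4/tcp_congestion_control
if [ -r "$ccfile" ]; then
  echo "available: $(cat /proc/sys/net/ipv4/tcp_available_congestion_control)"
  echo "current:   $(cat "$ccfile")"
else
  echo "tcp_congestion_control not exposed on this system"
fi
# Switching (requires root), e.g.:
#   sysctl -w net.ipv4.tcp_congestion_control=cubic
```

On long fat pipes, the choice of algorithm can noticeably change how quickly the window recovers after a loss, which is one plausible cause of the unstable rates observed.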
The results were not very reproducible (only the maximum bandwidth is given above): the bandwidth sometimes stayed stable at a much lower rate, so the most likely suspect is a problem with TCP tuning.
I did the following experiment:
A single Myrinet node sends data through the Myrinet/Ethernet bridge to an increasing number of Ethernet nodes in Rennes. The expected result was to reach a 10 Gbps outgoing bandwidth. The bandwidth stopped increasing after three target nodes were added, at around 3 Gbps. This clearly points to the bridge as the bottleneck.
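The fan-out setup above can be sketched as follows. The node names and durations are placeholders, and the script is written as a dry run that prints the iperf commands rather than executing them.

```shell
# One sender, N receivers: each receiver runs `iperf -s`; the sender
# opens one TCP stream per target node behind the bridge.
# Dry run: print the commands instead of running them.
targets="node-1 node-2 node-3 node-4"   # hypothetical node names
for t in $targets; do
  echo "iperf -c $t -t 30 &"   # 30 s per stream, all in parallel
done
echo "wait"
```

If the aggregate outgoing rate plateaus while targets are still being added, the shared element (here, the Ethernet/Myrinet bridge) is the natural suspect, which is exactly the reasoning used above.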
- Nodes involved: 1
- Sites involved: 2
- Minimum walltime: 1h
- Batch mode: yes
- Use kadeploy: yes
- CPU bound: yes
- Memory bound: yes
- Storage bound: no
- Network bound: yes
- Interlink bound: yes
Tools used: iperf, and a custom-made bandwidth measurement tool.
Shared by: Lucas Nussbaum
Last update: 2009-02-17 22:28:56