Grid'5000 experiment


Congestion Collapse in Grid'5000 (Networking)

Conducted by

Romaric Guillier, Pascale Primet


Call for Demos: Congestion control train-wreck workshop at Stanford

We would like to invite you to a two-day congestion control workshop, titled "Train-wrecks and congestion control: What happens if we do nothing?", which we are hosting at Stanford on 31 March - 1 April, 2008.

Spurred on by a widespread belief that TCP is showing its age and needs replacing, and by a deeper understanding of the dynamics of congestion control, the research community has brought forward many new congestion control algorithms. There has been much debate about the relative merits and demerits of the new schemes, and a standardization effort is under way in the IETF. But before the next congestion control mechanism can take over, it will need to be deployed widely in operating systems and, in some cases, in switches and routers too. This will be a long road, requiring the buy-in of many people: researchers, product developers and business leaders too.

When we have proposed new congestion control algorithms ourselves, we have been met with the challenges: "Show me the compelling need for a new congestion control mechanism" and "What will really happen to the Internet (and my business) if we keep TCP just the way it is?"

As a community, we need examples that are simple to understand and that demonstrate a compelling need for change. We call them the "Train wreck scenarios". Examples might show that distribution of video over wireless in the home will come to a halt without new algorithms. Or that P2P traffic will bring the whole network crashing down. Or that huge, high-performance data-centers need new algorithms. Whatever your favorite example, we believe that if we are collectively armed with a handful of mutually agreed examples, it will be much easier to make a business case for change. Or put another way, if we can't articulate compelling examples to industry leaders, then is the cost and risk of change worth it?
The goal of the workshop is to identify a handful of really compelling demonstrations of the impending train-wreck. The outcome will be a set of canonical examples that we will use to persuade industry of the need for change. We are deliberately inviting you many months ahead of time, to give you time to create your compelling train-wreck demonstration. You can choose the way you present your demonstration: you could bring equipment and show a live demo; you could show simulations or animations; or you could produce a video showing a real or synthetic demo. Whatever method you choose, the goal is to create a case that will persuade a mildly technical but influential business leader of the need for change.

We will invite a panel of judges to give prizes for the most compelling examples in two categories: (1) The Overall Most Compelling Example, which will be judged on a combination of the technical merits and the presentation of the scenario, and (2) The Most Technically Compelling Example, which will be judged on its technical merit alone, without consideration of the way it is presented.

The whole purpose of the workshop is to focus on the problem, not the solutions. We are most definitely not interested in your favorite scheme, or ours. So we need some ground rules.

No-one is allowed to mention a specific mechanism, algorithm or proposal at any time during the workshop: not in their talk, not in a panel, and not in questions to the speakers. The only mechanisms that may be mentioned are TCP (in its standard and deployed flavors) and idealized alternatives for purposes of demonstration, for example, comparing TCP with an oracle that provides instantaneous optimal rates to each flow.

Attendance at the workshop will be by invitation only, to keep the discussion focused and lively. We will video the entire workshop and all the demonstrations, and make the recordings publicly available on the Internet.
We will make any proceedings and talks available too. The goal is to open up the demonstrations for public scrutiny and feedback after the event.

The event is hosted by the Stanford Clean Slate Program, and local arrangements will be made by Nick McKeown and Nandita Dukkipati. The workshop has received offers of support and funding from Cisco Systems and Microsoft. We hope to make a limited number of travel grants available.

We have a small Program Committee (listed below) to select the final demonstrations. We are soliciting a 1-page description of your demo. Please note the following dates.

Important Dates:
===============
Workshop: 31 March - 1 April, 2008
1-page demo description submission deadline: 1 Dec, 2007
Notification of acceptance: 15 Dec, 2007
Final demo submissions: 17 March, 2008
Last date for demo withdrawals: 25 March, 2008

Please email your initial 1-page submissions to: Nandita Dukkipati and Nick McKeown

Best wishes, and we hope to see you in Stanford!

Nandita Dukkipati
Nick McKeown

Program Committee:
==================
1. Albert Greenberg, Microsoft Research
2. Peter Key, Microsoft Research
3. Flavio Bonomi, Cisco
4. Bruce Davie, Cisco
5. Steven Low, Caltech
6. Frank Kelly, Cambridge
7. Lars Eggert, Nokia Research/IETF
8. Nick McKeown, Stanford
9. Guru Parulkar, Stanford
10. Nandita Dukkipati, Stanford/Cisco/Princeton

---------------------------------------------------------------------------------------------

* Context and objectives

Today, the capacity and the usage of the Internet are fundamentally changing. In the forthcoming year, millions of homes will have access to the Internet over fiber lines. Network companies will offer speeds of up to 1 Gbps on fiber. Consequently, ultra-high-speed applications such as high-definition teleconferencing, telemedicine and advanced telecommuting for people working from home will become available at low cost.
We therefore propose a demonstration that aims to show that the traditional congestion control approach may present a strong barrier to the large-scale deployment of these exciting and very useful envisioned applications, which exploit the emerging huge access capacities.

* Scenario description

Within this scenario, we plan to show that, without any special mechanism to schedule and organise these transfers, the low aggregation level found in this kind of network quickly leads to a huge congestion level, disrupting the bulk data transfers and deeply impacting all interactive traffic.

We will use a hierarchical dumbbell with 10 Gbps bottlenecks, as seen in Figure 1. 100 independent sources in a real testbed will generate bulk data transfers. Other sources will create interactive sessions in both directions. All sources will use 1 Gbps Ethernet network cards, which means that the aggregation level will be 10 for the local level and 1 for the Internet level. In addition, we plan to use a hardware latency emulator to emulate long-distance links.

Two kinds of workloads will be injected:

  • Long-running heavy flows, simulating the input data needed for distributed computing, with sizes following a heavy-tailed distribution and a minimum size of 5 Gb
  • Interactive flows, simulating web traffic or remote commands through SSH

The reference heavy flow source will perform bulk data transfers on a regular basis to highlight the impact of the current congestion level on the transfer. The reference interactive flow source will also be used to monitor and show the impact of the congestion level on interactive work. The metric considered here is the transfer completion time, as it reflects what the user wants: receiving a file in a reasonable time.

Three measurements will be performed.
The first measurement corresponds to the case where our reference heavy flow is alone in the system. In the second, the heavy flows are assumed to use an oracle that enforces a fair sharing of the bandwidth (each heavy flow gets strictly 1/100th of the bandwidth). Finally, in the third experiment, standard TCP will be used to handle the congestion.
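The two idealized baselines (reference flow alone, and oracle fair share) can be estimated with simple arithmetic before any measurement is taken. The following is a minimal Python sketch under stated assumptions, not the experiment's actual tooling: "5 Gb" is read as gigabits, the oracle share is taken as 1/100th of a single 10 Gbps bottleneck, latency and protocol overhead are ignored, and the Pareto shape value is hypothetical since the description only says the sizes are heavy-tailed.

```python
import random

ACCESS_GBPS = 1.0        # per-source 1 Gbps Ethernet card (from the scenario)
BOTTLENECK_GBPS = 10.0   # dumbbell bottleneck capacity (from the scenario)
N_HEAVY_FLOWS = 100      # independent bulk-transfer sources (from the scenario)
MIN_FLOW_GBIT = 5.0      # minimum flow size, read here as gigabits (assumption)
PARETO_SHAPE = 1.5       # hypothetical tail index; the text gives no value


def sample_flow_sizes(n, rng):
    """Draw n heavy-tailed flow sizes in Gbit, all >= MIN_FLOW_GBIT."""
    # paretovariate returns values >= 1, so scaling sets the minimum size.
    return [MIN_FLOW_GBIT * rng.paretovariate(PARETO_SHAPE) for _ in range(n)]


def ideal_completion_time(size_gbit, rate_gbps):
    """Seconds to move size_gbit at a constant rate, ignoring latency and overhead."""
    return size_gbit / rate_gbps


# Case 1: the reference flow is alone, so only its access link limits it.
rate_alone = min(ACCESS_GBPS, BOTTLENECK_GBPS)
# Case 2: the oracle gives each heavy flow 1/100th of the bottleneck.
rate_fair = min(ACCESS_GBPS, BOTTLENECK_GBPS / N_HEAVY_FLOWS)

t_alone = ideal_completion_time(MIN_FLOW_GBIT, rate_alone)
t_fair = ideal_completion_time(MIN_FLOW_GBIT, rate_fair)
sizes = sample_flow_sizes(N_HEAVY_FLOWS, random.Random(0))

print(f"alone: {t_alone:.0f} s, fair share: {t_fair:.0f} s")
print(f"largest sampled flow: {max(sizes):.1f} Gbit")
```

Under these assumptions a minimum-size flow would take about 5 s alone and about 50 s with the oracle. The third case (standard TCP) has no such closed form here; its measured completion time is what would be compared against these two baselines.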




  • Nodes involved: 100
  • Sites involved: >6
  • Minimum walltime: >1d
  • Use kadeploy: yes
  • CPU bound: no
  • Memory bound: no
  • Storage bound: no
  • Network bound: yes
  • Interlink bound: yes

Tools used

NXE, iperf


Not yet

Shared by: Romaric Guillier, Pascale Primet
Last update: 2009-02-05 11:09:05
Experiment #452
