Grid'5000 user report for


User information

( user)
More user information in the user management interface.

Experiments

  • Stress of 10G interconnection link (Networking) [achieved]
    Description: The goal of this experiment is to stress the 10 GbE links of Grid'5000. The experiments involved two sites at a time. We created TCP flows between 9 to 13 nodes on each cluster, chosen among azur, parasol and grillon, and measured the average achieved bandwidth. The experiments ran from 5 minutes to 1 hour.
    Results: On August 22nd, from 11:47 to 12:47, we obtained an average bandwidth of 8.7 Gb/s over a 1-hour experiment. The maximum bandwidth was 9.399 Gb/s. We used the default configuration, i.e. kernel 2.6.12 with the BIC TCP stack.
    We could not obtain such results with azur because of the network topology of this site.

    We have shown that communication performance in grids (and in particular in Grid'5000) mainly depends on TCP performance, which is why TCP parameters need to be tuned for Grid'5000 as described in our research report. These parameters have now been deployed on each site, and Grid'5000 users can reach the maximum bandwidth between sites.
    More information here
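
    As a rough illustration of how such an inter-site stress test can be driven (a sketch only: the node names, the number of flows and the fixed one-hour duration are placeholders, with iperf used as one possible traffic generator):

    # On each receiving node of the remote cluster, start an iperf server:
    iperf -s &

    # From the local cluster, start one long-lived TCP flow per node pair
    # and let the flows run concurrently over the 10 GbE inter-site link:
    for i in $(seq 1 12); do
        ssh local-node-$i "iperf -c remote-node-$i -t 3600 -f m" > flow-$i.log &
    done
    wait

    # Sum the per-flow averages reported by iperf to obtain the aggregate bandwidth:
    grep "Mbits/sec" flow-*.log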
  • Checking the maximal achievable TCP throughput on a Grid5000 site (Networking) [achieved]
    Description: The goal of this experiment was to check that the various TCP stacks (Reno, BIC, CUBIC, HighSpeed-TCP, Hybla, Hamilton-TCP, Layered-TCP, Vegas, Westwood+) provided by the Web100 patches on a 2.6.16 Debian-patched GNU/Linux kernel are able to achieve the maximum throughput available in a local context with no cross traffic.
    It consisted in having two nodes of a site connected to the same switch and establishing a TCP connection between them using iperf processes. The buffer size (write and receive) used by iperf was 4 MB. Each experiment lasted 200 seconds. Both sides always used the same TCP stack.

    The TCP stack configuration used was the following:

    net.core.rmem_max=107374182
    net.core.wmem_max=107374182
    net.core.rmem_default=107374182
    net.core.wmem_default=107374182
    net.core.optmem_max=107374182
    net.ipv4.tcp_rmem=107374182 107374182 107374182
    net.ipv4.tcp_wmem=107374182 107374182 107374182
    net.ipv4.tcp_mem=107374182 107374182 107374182

    net.core.netdev_max_backlog=1000
    net.ipv4.route.flush=1
    net.ipv4.tcp_no_metrics_save=1

    txqueuelen 1000
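
    A minimal sketch of how one run of this comparison could be driven, assuming the patched 2.6.16 kernel exposes the pluggable congestion control interface through sysctl (the host name of the second node is a placeholder):

    # Select the congestion control stack under test on both nodes
    # (e.g. reno, bic, cubic, highspeed, htcp, hybla, vegas, westwood, ...):
    sysctl -w net.ipv4.tcp_congestion_control=cubic

    # Receiver side, 4 MB socket buffer:
    iperf -s -w 4M

    # Sender side: a single 200 s TCP flow with 4 MB buffers, both nodes
    # being attached to the same switch:
    iperf -c node-2 -w 4M -t 200 -i 10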

    Results: In the tests performed at Sophia, on the same switch, all the TCP stacks tested achieved an average throughput of 941 Mbps, except Vegas, which only achieved an average of 340 Mbps (modifying the txqueuelen parameter did not change anything).
  • Transfer time of large files over Grid5000 (Networking) [achieved]
    Description: In this experiment, we perform memory-to-memory transfers of large files (typically 30 GB) simultaneously over 1 Gbps and 10 Gbps links in Grid5000, in order to assess the impact of several parameters on the predictability and reliability of the maximal completion time of these transfers.
    We study the following parameters:
    - congestion level (0 to 100%, 1 to 20 flows)
    - starting time (0 to 5 s)
    - background traffic (CBR, VBR Poisson, Pareto, vbul...)
    - reverse traffic level (0% to 100%)
    - parallel flows (1, 5 or 10)
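
    As an illustration of the kind of memory-to-memory transfer measured here (a sketch only: the host names and the netcat variant are assumptions, and /dev/zero and /dev/null stand in for the file so that no disk I/O is involved):

    # Receiver: discard the incoming data so that only the network path is measured.
    nc -l -p 5001 > /dev/null

    # Sender: push about 30 GB from memory through a single TCP connection
    # and record the completion time.
    time dd if=/dev/zero bs=1M count=30720 | nc receiver-node 5001

    # Parallel-flow variants (1, 5 or 10 streams) can be generated with iperf:
    iperf -c receiver-node -w 4M -t 300 -P 5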

    Results:
  • Investigation of Ethernet switch behavior in the presence of contending flows at very high speed (Networking) [achieved]
    Description: This work examines the interactions between layer 2 (Ethernet) switches and TCP in high bandwidth-delay product networks. First, the behavior of a range of Ethernet switches when two long-lived connections compete for the same output port is investigated. Then, we explore the impact of these behaviors on the TCP protocol in long and fast networks (LFNs). We show several conditions in which scheduling mechanisms introduce heavily unfair bandwidth sharing and loss bursts that impact TCP performance. Packet scheduling algorithms in Ethernet equipment were designed for heterogeneous traffic and highly multiplexed environments, but Ethernet switches are nowadays also used in situations, such as grid environments, where these assumptions can be incorrect. We plan to pursue this investigation of the layer two / layer four interaction and to explore how to model it and how to better adapt control algorithms to the requirements of new applications. We also plan to repeat the same precise measurements with flow control (802.3x) and sender-based software pacing enabled, both of which tend to avoid queue overflows.
    Results: This work shows several conditions in which these scheduling mechanisms introduce heavy unfairness (or starvation) over large intervals (300 ms) and loss bursts that impact TCP performance. These conditions correspond to situations where huge data movements occur simultaneously. It also shows that behaviors differ from switch to switch and are not easily predictable. These observations offer some leads for a better understanding of layer interactions. They may explain some congestion collapse situations, and why and how parallel transfers mixing packets of different connections gain an advantage over single-stream transfers.
    More information here
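
    For reference, the basic contention setup described above can be reproduced with two long-lived flows aimed at the same receiver so that they compete for the same switch output port (a sketch only: host names are placeholders and iperf is one possible traffic source):

    # Receiver behind the switch output port under study:
    iperf -s

    # Two senders attached to the same switch, started together so that their
    # long-lived flows compete for the receiver's output port:
    ssh sender-1 "iperf -c receiver -t 600 -i 1" > flow1.log &
    ssh sender-2 "iperf -c receiver -t 600 -i 1" > flow2.log &
    wait

    # Comparing the per-second reports in flow1.log and flow2.log reveals
    # unfair sharing or starvation intervals between the two flows.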
  • Congestion Collapse in Grid'5000 (Networking) [achieved]
    Description: Call for Demos - Congestion control train-wreck workshop at Stanford.

    Call for Demos: We would like to invite you to a two-day congestion control workshop, titled "Train-wrecks and congestion control: What happens if we do nothing?", which we are hosting at Stanford on 31 March - 1 April, 2008. Spurred on by a widespread belief that TCP is showing its age and needs replacing - and a deeper understanding of the dynamics of congestion control - the research community has brought forward many new congestion control algorithms. There has been lots of debate about the relative merits and demerits of the new schemes, and a standardization effort is under way in the IETF. But before the next congestion control mechanism is deployed, it will need to be deployed widely in operating systems and - in some cases - in switches and routers too. This will be a long road, requiring the buy-in of many people: researchers, product developers and business leaders too.

    Our own experience of proposing new congestion control algorithms has been met with the challenges: "Show me the compelling need for a new congestion control mechanism" and "What will really happen to the Internet (and my business) if we keep TCP just the way it is?" As a community, we need examples that are simple to understand and demonstrate a compelling need for change. We call them the "train-wreck scenarios". Examples might show that distribution of video over wireless in the home will come to a halt without new algorithms. Or that P2P traffic will bring the whole network crashing down. Or that huge, high-performance data centers need new algorithms. Whatever your favorite example, we believe that if we are collectively armed with a handful of mutually agreed examples, it will be much easier to make a business case for change. Or, put another way, if we can't articulate compelling examples to industry leaders, then is the cost and risk of change worth it?

    The goal of the workshop is to identify a handful of really compelling demonstrations of the impending train wreck. The outcome will be a set of canonical examples that we will use to persuade industry of the need for change. We are deliberately inviting you many months ahead of time, to give you time to create your compelling train-wreck demonstration. You can choose the way you present your demonstration: you could bring equipment and show a live demo; you could show simulations or animations; or you could produce a video showing a real or synthetic demo. Whatever method you choose, the goal is to create a case that will persuade a mildly technical but influential business leader of the need for change. We will invite a panel of judges to give prizes for the most compelling examples in two categories: (1) the Overall Most Compelling Example, judged on a combination of the technical merits and the presentation of the scenario, and (2) the Most Technically Compelling Example, judged on its technical merit alone, without consideration of the way it is presented.

    The whole purpose of the workshop is to focus on the problem, not the solutions. We are most definitely not interested in your favorite scheme, or ours. So we need some ground rules: no one is allowed to mention a specific mechanism, algorithm or proposal at any time during the workshop - not in their talk, not in a panel, and not in questions to the speakers. The only mechanisms that may be mentioned are TCP (in its standard and deployed flavors) and idealized alternatives for purposes of demonstration, for example comparing TCP with an oracle that provides instantaneous optimal rates to each flow.

    Attendance at the workshop will be by invitation only, to keep the discussion focused and lively. We will video the entire workshop and all the demonstrations and make them publicly available on the Internet. We will make any proceedings and talks available too. The goal is to open up the demonstrations for public scrutiny and feedback after the event. The event is hosted by the Stanford Clean Slate Program - http://cleanslate.stanford.edu - and local arrangements will be made by Nick McKeown and Nandita Dukkipati. The workshop has received offers of support and funding from Cisco Systems and Microsoft. We hope to make a limited number of travel grants available. We have a small Program Committee (listed below) to select the final demonstrations. We are soliciting a 1-page description of your demo. Please note the following dates.

    Important Dates:
    Workshop: 31 March - 1 April, 2008
    1-page demo description submission deadline: 1 Dec, 2007
    Notification of acceptance: 15 Dec, 2007
    Final demo submissions: 17 March, 2008
    Last date for demo withdrawals: 25 March, 2008

    Please email your initial 1-page submissions to Nandita Dukkipati and Nick McKeown. Best wishes, and we hope to see you in Stanford! Nandita Dukkipati, Nick McKeown

    Program Committee:
    1. Albert Greenberg, Microsoft Research
    2. Peter Key, Microsoft Research
    3. Flavio Bonomi, Cisco
    4. Bruce Davie, Cisco
    5. Steven Low, Caltech
    6. Frank Kelly, Cambridge
    7. Lars Eggert, Nokia Research/IETF
    8. Nick McKeown, Stanford
    9. Guru Parulkar, Stanford
    10. Nandita Dukkipati, Stanford/Cisco/Princeton

    Context and objectives: Today, the capacity and the usage of the Internet are fundamentally changing. In the forthcoming year, millions of homes will have access to the Internet over fiber lines, and network companies will offer speeds on fiber of up to 1 Gbps. Consequently, ultra-high-speed applications such as high-definition teleconferencing, telemedicine and advanced telecommuting for people working from home will become available at low cost. We therefore propose a demonstration which aims at showing that the traditional congestion control approach may present a strong barrier to the large-scale deployment of these exciting and very useful envisioned applications exploiting the emerging huge access capacities.

    Scenario description: Within this scenario, we plan to show that, without any special mechanism to script the organisation of these transfers, the low aggregation level found in this kind of network quickly leads to a high congestion level, disrupting the bulk data transfers and deeply impacting all interactive traffic. We will use a hierarchical dumbbell with 10 Gbps bottlenecks, as seen in Figure 1. 100 independent sources in a real testbed will generate bulk data transfers, and other sources will create interactive sessions in both directions. All sources will use 1 Gbps Ethernet network cards, which means that the aggregation level will be 10 at the local level and 1 at the Internet level. In addition, we plan to use a hardware latency emulator to emulate long-distance links.

    Two kinds of workload will be injected:
    - Long-running heavy flows, simulating input data needed for distributed computing, whose sizes follow a distribution law with a heavy tail and a minimum size of 5 Gb
    - Interactive flows, simulating web traffic or remote commands through SSH
    The reference heavy-flow source will perform bulk data transfers on a regular basis to highlight the impact of the current congestion level on the transfer. The reference interactive-flow source will also be used to monitor and show the impact of the congestion level on interactive work. The metric considered here is the transfer completion time, as it is what the user cares about: receiving the file in a reasonable time. Three measurements will be performed: in the first one, our special heavy flow is alone in the system; in the second one, the heavy flows are assumed to use an oracle that allows a fair sharing of the bandwidth (each heavy flow strictly gets 1/100th of the bandwidth); finally, in the third one, standard TCP is used to handle the congestion.
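
    A minimal sketch of how the completion-time measurement for the reference heavy flow could be scripted (the host names, flow counts and durations are illustrative only, with iperf used as a stand-in traffic generator):

    # Background bulk sources: start many long-lived flows towards the sink
    # to set the desired congestion level.
    for i in $(seq 1 99); do
        ssh source-$i "iperf -c sink -t 600" &
    done

    # Reference heavy flow: send about 5 Gbit (640 MB) from memory and record
    # how long the transfer takes under the current congestion level.
    time ssh reference-source "iperf -c sink -n 640M"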
    Results:
  • Comparison of the performance of two transport protocols (Networking) [achieved]
    Description: Comparison of UDT and a TCP variant without congestion control for the transfer of large files with GridFTP, following a given bandwidth reservation profile.
    Results:

Publications

Collaborations

    Success stories and benefits from Grid'5000

    last update: 2009-02-05 11:09:38
