Put Some Green In Your Experiments
From Grid5000
This practical session requires five nodes on the Lyon site. Three of them will use a standard Debian image and the remaining two will use a Xen image.
The form for this practical session is here: Fiche TP
Contributors:
- Marcos Dias de Assunção, INRIA Post-Doc, RESO Team, ENS Lyon
- Mohammed El Mehdi Diouri, PhD student, ENS Lyon, RESO Team
- Laurent Lefèvre, INRIA Researcher, RESO Team, ENS Lyon
- Olivier Mornard, INRIA Engineer, RESO Team, ENS Lyon
- Anne-Cécile Orgerie, PhD student, ENS Lyon, RESO Team
- Ghislain Landry Tsafack Chetsa, PhD student, ENS Lyon, RESO Team
Introduction
This practical session focuses on energy aspects of Grid'5000. It relies on the infrastructure of 150 energy sensors (wattmeters) provided by Grid'5000/Aladdin and recently deployed on the Grid'5000 Lyon site, and on a software framework provided by the GREEN-NET ARC project.
The session will be divided into three main parts:
- Energy usage exposing: the goal is to learn how to observe and collect energy information for basic Grid'5000 node operations (i.e. deployment of a node, intensive CPU usage, disk access and high-performance communication).
- Energy Efficient HPC: the aim is to experiment with some MPI programming design choices and observe their energy impact.
- Virtualization and Energy: the purpose is to analyze the energy impact of virtualization and of virtual machine mapping choices, with a focus on live migration.
Energy Usage Exposing
Image Deployment
First, log in to the Lyon site:
ssh login@access.lyon.grid5000.fr
ssh frontend
Then, you should reserve five nodes:
oarsub -t deploy -p "host='sagittaire-80.lyon.grid5000.fr'||host='sagittaire-81.lyon.grid5000.fr'||host='sagittaire-82.lyon.grid5000.fr'||host='capricorne-57.lyon.grid5000.fr'||host='capricorne-58.lyon.grid5000.fr'" -r "2011-04-19 16:30:00" -l nodes=5,walltime=2
where you replace sagittaire-80 and the other host names with your own nodes.
As usual, you need to connect to the job:
oarsub -C your_jobid
where your_jobid is the job id given by OAR for the reservation.
You can then save the list of reserved nodes in a file named machines with the following command:
cat $OAR_NODEFILE | uniq > machines
| Note | |
|---|---|
Help for the reservation can be found here: OAR | |
To begin with, three nodes are required. You need to deploy the TPG5K-green-image1 environment on these three nodes. We assume that you have a file named machines1 containing the list of these nodes.
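The machines file lists all five reserved nodes, whereas each deployment needs its own list. Here is a minimal sketch of one way to produce both lists. It assumes that the three Debian nodes come first in machines and the two Xen nodes last. machines2 is the file name used in the Virtualization section; the function names are ours.

```python
def split_nodes(lines, first_count=3):
    """Split host names into (first group, remaining group).

    Duplicates are dropped while keeping the original order, so the
    function also works on the raw content of $OAR_NODEFILE.
    """
    hosts = []
    for line in lines:
        host = line.strip()
        if host and host not in hosts:
            hosts.append(host)
    return hosts[:first_count], hosts[first_count:]


def write_list(path, hosts):
    """Write one host name per line."""
    with open(path, "w") as handle:
        handle.write("".join(host + "\n" for host in hosts))


if __name__ == "__main__":
    import os

    # Assumption: the script is run in the directory that holds "machines".
    if os.path.exists("machines"):
        with open("machines") as handle:
            debian_nodes, xen_nodes = split_nodes(handle)
        write_list("machines1", debian_nodes)
        write_list("machines2", xen_nodes)
```

If your reservation orders the nodes differently, edit the two files by hand instead.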
Now check the machine time (with the date command) and write it down: you will need it as the start time when requesting the energy logs.
Then, you can launch the deployment by using the following command:
kadeploy3 -e TPG5K-green-image1 -u memdiouri -f machines1 -k
When the deployment is finished, you can log in to the nodes:
ssh root@node
| Note | |
|---|---|
Help for the deployment can be found here: Deploy_environment | |
Access to Energy Information
The webpage that gives all the energy information is:
This first page explains how this energy tool works. It can be useful to read it!
If you click on the 'Live Monitoring' tab, you will see one RRD graph per node. You can click on the graph concerning one of your nodes, and you will access the different provided views: hour, day, week, month and year view. Those graphs are updated every minute.
On the 'Logs Access' tab, a log-on-demand service is provided: fill in the desired start and end times, the nodes, and the increment, which is the time period between two consecutive measurements (in seconds). You will obtain a directory (with a permanent address) containing:
- a log file for each requested node (.dat extension),
- an svg graph for each requested node,
- a log file with the global consumption of each requested node during the requested time period (.log extension) and
- a tar.gz file containing all the previous files.
Now, it's time to test it (please request only your own nodes and short periods of time; this infrastructure is still under development)! Use the start time you wrote down at the beginning of the deployment, and check the boxes corresponding to your three nodes.
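Once you have power readings for a node, you may want to turn them into an energy figure. The layout of the .dat files is not described here, so the sketch below makes no assumption about it. It starts from samples that you have already parsed into (time in seconds, power in watts) pairs and integrates them with the trapezoidal rule. The function names and the demo values are ours and are not real measurements.

```python
def energy_joules(samples):
    """Integrate power over time with the trapezoidal rule.

    samples: list of (time_in_seconds, power_in_watts) pairs, sorted by time.
    Returns the energy in joules (watt-seconds).
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += (p0 + p1) / 2.0 * (t1 - t0)
    return total


def joules_to_wh(joules):
    """Convert joules to watt-hours (1 Wh = 3600 J)."""
    return joules / 3600.0


if __name__ == "__main__":
    # Made-up readings, one sample every 10 s (not real measurements).
    demo = [(0, 200.0), (10, 220.0), (20, 210.0)]
    joules = energy_joules(demo)
    print(joules, "J =", joules_to_wh(joules), "Wh")
```

A smaller increment gives a finer picture of short events such as the individual deployment steps, at the cost of larger log files.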
| Question | |
|---|---|
Does the boot make the energy consumption increase? If yes, can you see the different steps of the kadeploy deployment? If no, is this normal? | |
Basic Operations
The goal here is to observe the energy consumption of basic operations of Grid'5000 applications. In order to do this, you can use the live monitoring tools or the logs on demand tools tested just before.
Idle Nodes
First, observe the energy consumption of your nodes while they are running nothing. We will call this value the idle consumption (i.e. the consumption when the node is initialized but idle). You can launch a top command to verify that your nodes are idle.
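The idle consumption is a convenient baseline for the experiments that follow. As a rough sketch, assuming you have power readings (in watts) taken at a fixed increment, you could estimate the idle level as the mean of the idle readings and then compute the extra energy that an experiment adds on top of it. The function names are ours.

```python
def mean_power(watts):
    """Average of a list of power readings, in watts."""
    if not watts:
        raise ValueError("at least one reading is required")
    return sum(watts) / len(watts)


def energy_above_idle(watts, idle_watts, increment_s):
    """Energy in joules consumed above the idle baseline.

    watts: readings taken during the experiment, one every increment_s seconds.
    idle_watts: idle consumption of the same node.
    """
    return sum((w - idle_watts) * increment_s for w in watts)
```

For example, with an idle level of 180 W, two readings of 200 W and 210 W taken 10 s apart correspond to 500 J above idle.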
Disk Access Intensive Application
Go to the directory /home/user/basic/. For each experiment you launch in this section, a logs file will be created containing the start time of your experiment.
To generate a disk-intensive load, you can write your own application or launch the following command on the nodes:
./hdparm.sh 120 &
where '120' is the length of the experiment in seconds. This application uses a loop with the hdparm command.
CPU Intensive Applications
The CPU is considered the most energy-consuming component of a computing node.
You can launch a CPU-intensive application on the nodes with the following command:
./cpuburn.sh 120 &
where '120' is the length of the experiment in seconds.
| Warning | |
|---|---|
This application uses cpuburn (burnP6) which is not so good for the nodes, so please, do not over-use it! | |
Another cpu intensive application is provided. It can be launched on the nodes with the following command:
./stress.sh 120 &
where '120' is the length of the experiment in seconds.
You can test with your own cpu intensive application.
Network Intensive Application
To launch the network intensive application, you need two nodes: one client and one server. On the server, the command is:
./iperf-server-UDP.sh 120 &
where '120' is the length of the experiment in seconds. This experiment uses iperf with UDP traffic.
Then quickly launch the following command on the client:
./iperf-client-UDP.sh 120 serveraddress &
where '120' is the length of the experiment in seconds and serveraddress is either the IP address of the server node or its full name (sagittaire-80.lyon.grid5000.fr for example).
| Note | |
|---|---|
You can set the requested bandwidth in iperf-client-UDP.sh: it is the value after the -b option. By default, we have set it to 1 Gb/s. | |
| Question | |
|---|---|
Is there a difference in terms of energy consumption between the client and the server (in percentage)? | |
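To express such a difference as a percentage, one option is to take one node as the reference and compare the other to it. This is a small sketch with a function name of our own. The choice of reference (here the second argument) is a convention you should state with your result.

```python
def percent_difference(value, reference):
    """Relative difference of value with respect to reference, in percent."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (value - reference) / reference * 100.0
```

A positive result means that the first node consumed more than the reference node.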
You can test the consumption with TCP traffic with the following command on the server:
./iperf-server-TCP.sh 120 &
where '120' is the length of the experiment in seconds.
Then quickly launch the following command on the client:
./iperf-client-TCP.sh 120 serveraddress &
where '120' is the length of the experiment in seconds and serveraddress is either the IP address of the server node or its full name (sagittaire-80.lyon.grid5000.fr for example).
| Question | |
|---|---|
Is there a difference in terms of used bandwidth between these two experiments (with UDP and TCP traffic)? | |
| Note | |
|---|---|
You can find this information in the logs file (the output of the iperf commands is redirected to this file). | |
| Question | |
|---|---|
Is there a difference in terms of energy between these two experiments (with UDP and TCP traffic)? Where does it come from? | |
First Summary
Energy Efficient HPC
MPI Experiments
A programmer designs an MPI client-server application as presented in Figure 1. Three processes run concurrently:
- 1 server is giving tasks to 2 clients,
- client 1 is computing small tasks (S Tasks),
- client 2 is computing XL Tasks (1 XL task = 2 S tasks).
MPI Scenario n°1: dumb application
Go to /home/user/mpi/ directory.
The programmer is not very good and produces a first version of the MPI application: version_dumb.c.
Have a look at the program in order to understand what the programmer has done.
Compile your dumb application with mpicc on each node:
mpicc -o version_dumb version_dumb.c
Launch your MPI application locally (on the server for example):
mpirun -np 3 version_dumb
Copy the machines1 file to /home/user/mpi/ on the server and run your application on the 3 different machines:
mpirun -np 3 -machinefile machines1 version_dumb
MPI Scenario n°2: a greener MPI program
If you do not know how to improve your MPI application, have a look at the version_top.c MPI program (in /home/user/mpi/).
As previously, you need to compile the top version on each node:
mpicc -o version_top version_top.c
Then, you can launch it on the three nodes:
mpirun -np 3 -machinefile machines1 version_top
| Question | |
|---|---|
What is the impact of your MPI modifications on the energy consumption of the server? And on the clients' consumption? | |
Virtualization and Energy
For the experiments described in this section, you will need two nodes.
Image Deployment
The first step is to deploy the Grid'5000 environment that contains the Xen hypervisor (i.e. the disk image with Dom0). Assuming that you have a file named machines2 with the list of nodes, type the following command:
kadeploy3 -e TPG5K-green-image2 -f machines2 -u memdiouri -k
Second, you will have to customise the environment a little on both machines. Log into each machine:
ssh node
After that, as root, you have to give yourself the right to run the xm command through sudo. That is, you need to add "youruser ALL=NOPASSWD: /usr/sbin/xm" to the sudoers file (where youruser is your login). You also need to start the xend daemon:
su -
visudo
/etc/init.d/xend start
exit
At this stage, to run VMs you would need to have the disk images of the guest domains (i.e. domUs). To make your life easier, we have prepared a few guest images that you will deploy. You just need to copy them to a directory termed tp in your user area:
cd
mkdir tp
cp /home/memdiouri/images/tp/* tp/
chmod 666 tp/*
Before you are able to deploy the VMs, you need to update the disk section in the files containing the descriptions for the guest domains. For example, the disk section of debno01.cfg should look like:
disk = ['tap:aio:/home/youruser/tp/debno01.img,sda2,w', 'tap:aio:/home/youruser/tp/swap01.img,sda1,w']
root = "/dev/sda2 ro"
Once the .cfg files are updated, you are ready to run VMs on the nodes you reserved!
Virtual Machine Cost
We are going to start a VM and observe the energy consumed by the host where the VM executes. Before you run the VM, you might want to write down the time when you started.
Assuming you are logged in to one of the Xen nodes and are in the tp directory where you placed the disk images, start the VM with the following command:
sudo xm create debno01.cfg
You can check the Xen domains currently in execution by typing:
sudo xm list
There are a few things about which you should be aware when configuring the network interfaces of your VMs. As the purpose of this practical session is not to explain in detail how to use Xen on Grid'5000, we will use the console to connect to our VMs. To connect to our VM, we use:
sudo xm console vm1
You can log in using the demo user and the default Grid'5000 password. We are now going to run cpuburn, described earlier, for 20 seconds on this VM. Hence, once logged in as demo on the VM, execute the following commands:
cd /home/demo/cpuburn
./cpuburn.sh 20
After that, you can log out of the VM (Ctrl+AltGr+]) and shut it down:
sudo xm shutdown vm1
Capping and Pinning on a Virtual Machine
We are now going to start VMs that run a CPU intensive application (i.e. cpuburn) once they are initialised. We are going to observe the effect of changing a few parameters of Xen hypervisor's scheduler, such as throttling the CPU usage. But before you start, do not forget to write down the deployment start time.
To start the first VM, type the following command on one of the Xen nodes:
sudo xm create deblo01.cfg
Now let's change the CPU utilisation cap for this VM. Wait around 10 seconds between each of the commands below:
sudo xm sched-credit -d vm1 -c 100
sudo xm sched-credit -d vm1 -c 70
sudo xm sched-credit -d vm1 -c 50
sudo xm sched-credit -d vm1 -c 20
sudo xm sched-credit -d vm1 -c 100
| Note | |
|---|---|
The -c option sets the cap. For example, with -c 50, your VM will use 50% of the CPU allocated to it. For more information, see [1]. | |
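If you prefer not to type the capping commands by hand, the sequence can be scripted. The sketch below is one possible way to do it, not part of the session material. It builds the same xm sched-credit commands, waits between them, and records when each cap was applied so that you can match the timestamps with the energy logs. The function names and the run and sleep parameters are ours; they exist only to make the script easy to dry-run.

```python
import subprocess
import time


def cap_command(domain, cap):
    """Build the xm command (as an argument list) that sets the CPU cap."""
    return ["sudo", "xm", "sched-credit", "-d", domain, "-c", str(cap)]


def apply_caps(domain, caps, wait_s=10,
               run=subprocess.check_call, sleep=time.sleep):
    """Apply each cap in turn, waiting wait_s seconds between commands.

    Returns a list of (seconds_since_epoch, cap) pairs, one per command.
    """
    applied = []
    for index, cap in enumerate(caps):
        if index > 0:
            sleep(wait_s)
        applied.append((time.time(), cap))
        run(cap_command(domain, cap))
    return applied
```

On a Xen node, calling apply_caps("vm1", [100, 70, 50, 20, 100]) would issue the five cap values used in this section with a 10-second pause between them. Passing run=print instead shows the commands without executing them.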
| Question | |
|---|---|
Is it possible to observe the energy consumed by running the VM with a CPU intensive application? | |
Let's start another VM:
sudo xm create deblo02.cfg
Each VM is assigned one Virtual CPU (VCPU). What if we want to consolidate the workload to save energy and make the two VMs share the same physical CPU? Let's do that by pinning the VCPUs of both VMs to the same physical CPU:
sudo xm vcpu-pin vm1 0 0
sudo xm vcpu-pin vm2 0 0
| Question | |
|---|---|
Does pinning the VMs to the same CPU have any impact on the energy consumed by the physical host? | |
Live Migration
So far you have used only one of the nodes to run your VMs. Imagine that for some reason this host becomes overloaded and you need to migrate one of the VMs to another node. We are now going to do that and observe the energy consumed during the migration. Again, we do not mean to be annoying, but it would be nice to write down the time when you start the migration.
To migrate vm1 to second_host, use the following command:
sudo xm migrate -l vm1 second_host
You can now shut the VMs down if you wish.
Conclusion
Any comments and feedback are welcome!
Resources
- GREEN-NET ARC project
- RRDtool Round Robin Database
- xenwiki/CreditScheduler Xen Credit Scheduler (for capping)
