Deploying and Using IaaS Clouds on Grid'5000
Contact for Nimbus:
Contact for OpenNebula:
- Houssem Eddine Chihoub, PhD student, INRIA Rennes - Bretagne Atlantique, KerData team
Introduction
The goal of this practical session is to introduce users to the deployment and usage of popular IaaS toolkits on Grid'5000. In the first part, users will be presented with a general overview of two IaaS toolkits, Nimbus and OpenNebula. They will learn how to deploy these two systems on Grid'5000, using scripts developed by members of the Grid'5000 community. After interacting with each cloud separately using their native tools, they will provision resources programmatically using the EC2 API.
General overview of Nimbus and OpenNebula
Nimbus and OpenNebula are both Infrastructure-as-a-Service toolkits. They allow users to build computing infrastructures from virtualized resources. In practice, users contact the cloud service and request virtual machines on which they have full administrative access, allowing them to run any kind of software they want. Nimbus and OpenNebula have similar architectures. A frontend node runs the main cloud service. This service receives user requests, schedules virtual machines, and handles authentication and accounting. The other nodes in the cloud are dedicated to executing virtual machines: we refer to them as hypervisors or VMMs (Virtual Machine Managers).
Nimbus Architecture
As explained above, the main service of Nimbus runs on a dedicated node. Two components implement this service: the IaaS service, which answers user requests to create and terminate virtual machines, and the Cumulus storage service, which handles all requests related to virtual machine images (download & upload). Users interact with these two components using HTTP. The Nimbus IaaS service exposes its own web service API, or an Amazon EC2-compatible API. Cumulus implements the Amazon S3 API.
VMM nodes run a program called workspace-control, which interacts with the libvirt library to manage Xen and KVM virtual machines. They also run a DHCP server managed by Nimbus in order to serve IP addresses to virtual machines.
OpenNebula Architecture
OpenNebula provides a set of Tools. These are management tools developed using the interfaces provided by the OpenNebula Core, such as the CLI (command line interface), the scheduler, the libvirt API implementation, and the Cloud RESTful interfaces.
The OpenNebula Core, which includes the main virtual machine, storage, virtual network, and host management components, controls and monitors virtual machines, virtual networks, storage, and hosts. The core performs its actions (e.g. monitoring a host or cancelling a VM) by invoking a suitable driver.
Drivers are used to plug different virtualization, storage, and monitoring technologies and cloud services into the core.
Grid'5000 specificities
When we deploy Nimbus and OpenNebula on Grid'5000, we are in charge of configuring the IP network used by virtual machines. We are not using the network isolation capabilities of Grid'5000, so we have to make sure all simultaneous cloud deployments use different IP networks. To solve this problem, a subnet reservation system is available. We can reserve one or several IP networks, ranging from /22 to /16 subnets, allocated inside the Grid'5000 virtual network.
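As a quick check of the subnet sizes mentioned above, the following sketch computes how many host addresses fit in a subnet of a given prefix length. It assumes Python 3 (for the standard `ipaddress` module); the `10.0.0.0` base address is only a placeholder and the function name is illustrative.

```python
import ipaddress


def usable_addresses(prefix_length):
    """Return the number of host addresses in an IPv4 subnet.

    The network and broadcast addresses are excluded, which is why a /22
    yields 1022 addresses rather than 1024.
    """
    # 10.0.0.0 is a placeholder base address, valid for any prefix >= 8.
    network = ipaddress.ip_network("10.0.0.0/%d" % prefix_length)
    return network.num_addresses - 2


for prefix in (22, 16):
    print("/%d -> %d usable addresses" % (prefix, usable_addresses(prefix)))
```

A /22 therefore provides 1022 addresses for virtual machines, and a /16 provides 65534.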
Nimbus
Nimbus deployment
First, we connect to a frontend, for instance Nancy:
$ ssh nancy
We install the Ruby gems required by the Nimbus deployment script:
$ gem install gosen json net-ssh net-scp restfully
The script also assumes that you have an SSH key internal to Grid'5000, which can be used to connect to the frontend. If this is not the case, create a new SSH key by running the following command (do not enter a passphrase, just press Enter):
$ ssh-keygen -q
Enter file in which to save the key (/home/priteau/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Then, make it possible to connect to the frontend using this SSH key:
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Now that our environment is ready, we do a resource reservation. We ask OAR for 4 nodes with hardware virtualization capabilities, and one /22 subnet (1022 IP addresses).
$ oarsub -l {"virtual!='none'"}nodes=4+slash_22=1,walltime=2 -t deploy 'sleep 86400'
[ADMISSION RULE] Modify resource description with type constraints
[ADMISSION RULE] Modify resource description as subnet
Generate a job key...
OAR_JOB_ID=381465
Take a note of the OAR job ID that was given to you, in this case 381465.
The Nimbus deployment script supports multiple job reservations. To describe these reservations, it gets its input from a YAML file provided by the user.
We are going to use the script with only one reservation, but we still need to create the file.
Open a text editor and create a file with the following content, replacing the job ID (uid) and the site with the ones matching your job. For the rest of this tutorial, we assume the file is named jobs.yml and located in your home directory.
---
- hypervisor: kvm
  uid: 381465
  site: rennes
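If you prefer not to write the file by hand, the sketch below extracts the job ID from the oarsub output and produces the same YAML layout. It is only an illustration: the helper names are hypothetical, and the layout (a list of entries with hypervisor, uid, and site fields) is assumed from the example file above.

```python
import re


def parse_job_id(oarsub_output):
    """Extract the numeric job ID from the text printed by oarsub."""
    match = re.search(r"OAR_JOB_ID=(\d+)", oarsub_output)
    if match is None:
        raise ValueError("no OAR_JOB_ID found in oarsub output")
    return int(match.group(1))


def jobs_yaml(jobs, hypervisor="kvm"):
    """Build the content of jobs.yml from a list of (job_id, site) pairs."""
    lines = ["---"]
    for job_id, site in jobs:
        lines.append("- hypervisor: %s" % hypervisor)
        lines.append("  uid: %d" % job_id)
        lines.append("  site: %s" % site)
    return "\n".join(lines) + "\n"


# Example with the job used in this tutorial.
job_id = parse_job_id("Generate a job key...\nOAR_JOB_ID=381465\n")
print(jobs_yaml([(job_id, "rennes")]))
```

Passing several (job_id, site) pairs produces one entry per reservation, which matches the multi-reservation support mentioned earlier.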
Now, we are ready to run the deployment script:
$ /grid5000/code/staging/nimbus-deploy/deploy_nimbus_cloud.rb ~/jobs.yml
I, [2011-04-07T14:36:42.910772 #7094] INFO -- : Kadeploy run 1 with 4 nodes (0 already deployed, need 3 more)
I, [2011-04-07T14:42:14.814027 #7094] INFO -- : Nodes deployed: paradent-46.rennes.grid5000.fr paradent-48.rennes.grid5000.fr paradent-49.rennes.grid5000.fr paradent-5.rennes.grid5000.fr
I, [2011-04-07T14:42:14.814537 #7094] INFO -- : Had to run 1 kadeploy runs, deployed 4 nodes
I, [2011-04-07T14:42:28.132897 #7094] INFO -- : Broadcasting node configurations
I, [2011-04-07T14:42:30.073163 #7094] INFO -- : Configuring and installing Nimbus (please wait)
I, [2011-04-07T14:45:15.575224 #7094] INFO -- : Installation finished
Nimbus cloud in rennes:
- service node is paradent-46.rennes.grid5000.fr
- client node is paradent-48.rennes.grid5000.fr (connect with "ssh nimbus@paradent-48.rennes.grid5000.fr")
The script deploys and configures a Nimbus cloud. When it finishes, it prints the details of the deployment: which node is the service node, and which node is the client node, where the Nimbus cloud client is installed and ready to use.
Nimbus usage
There are several ways to interact with a Nimbus cloud: the cloud client (a simple and high level client), the reference client (a client offering low level access to the WSRF frontend), and the EC2 API. For this part of the tutorial, we are going to use the cloud client.
This client is installed and pre-configured on a dedicated node, listed at the end of the deployment. First, connect to it using the nimbus account (replace the hostname by the one used in your deployment):
$ ssh nimbus@paradent-48.rennes.grid5000.fr
The cloud client (version 019) is installed in /nimbus/nimbus-cloud-client-019. Change to this directory:
$ cd nimbus-cloud-client-019
Nimbus relies on the Globus Java WS Core for its service implementation. As such, we need to create a certificate proxy to be authenticated with the service stack. Run the following command:
$ ./bin/grid-proxy-init.sh
Your identity: C=US,ST=Several,L=Locality,O=Example,OU=Operations,CN=Pierre.Riteau@irisa.fr
Creating proxy, please wait...
Proxy verify OK
Your proxy is valid until Fri Apr 08 23:58:05 CEST 2011
We can now use the Nimbus cloud client. First, let's list the virtual machine images available in the Cumulus repository:
$ ./bin/cloud-client.sh --list
No files.
Of course, there are no images available: our cloud is brand new! We are going to upload a KVM image running the Debian squeeze operating system.
$ ./bin/cloud-client.sh --transfer --sourcefile /home/priteau/squeeze-vm.raw
Transferring
- Source: squeeze-vm.raw
- Destination: cumulus://Repo/VMS/620c4a30-90ff-11df-b85b-002332c9497e/squeeze-vm.raw
Preparing the file for transfer:
1255MB [XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX] 100%
Transferring the file:
1255MB [XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX] 100%
Done.
If we list available images again, we can see our image was uploaded successfully.
$ ./bin/cloud-client.sh --list
[Image] 'squeeze-vm.raw' Read/write
Modified: Apr 8 2011 @ 12:00 Size: 1316044800 bytes (~1255 MB)
We are now ready to start our first virtual machine. This is done using the cloud client too:
$ ./bin/cloud-client.sh --run --hours 2 --name squeeze-vm.raw
Launching workspace.
Workspace Factory Service: https://parapluie-18.rennes.grid5000.fr:8443/wsrf/services/WorkspaceFactoryService
Creating workspace "vm-001"... done.
IP address: 10.158.12.1
Hostname: rennes-vm1
Start time: Fri Apr 08 13:45:16 CEST 2011
Shutdown time: Fri Apr 08 15:45:16 CEST 2011
Termination time: Fri Apr 08 15:55:16 CEST 2011
Waiting for updates.
"vm-001" reached target state: Running
Running: 'vm-001'
Take note of the VM identifier that was assigned, in this case vm-001. It will be used later to interact with the service.
The cloud client also reports the IP address that was allocated to your VM, in this case 10.158.12.1. We use this IP address to connect to the VM, using SSH.
No password is required since Nimbus takes care of uploading an SSH key during the VM provisioning.
$ ssh root@10.158.12.1
Warning: Permanently added '10.158.12.1' (RSA) to the list of known hosts.
Linux nimbusvm-1 2.6.26-2-amd64 #1 SMP Wed Aug 19 22:33:18 UTC 2009 x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Tue Mar 30 15:24:21 2010 from 10.156.0.1
We are now connected to a VM with full administrative access: we can install packages, run any applications, etc.
To destroy the VM, we run the following command:
$ ./bin/cloud-client.sh --terminate --handle vm-001
Terminating workspace.
- Workspace handle (EPR): '/nimbus/nimbus-cloud-client-019/history/vm-001/vw-epr.xml'
Destroying vm-001... destroyed.
OpenNebula
OpenNebula deployment
The scripts to automatically deploy OpenNebula using the provided images can be found here:
/home/priteau/bsScripts/
The content of the openNebula directory is the following:
- config: directory that contains the configuration scripts for deploying OpenNebula.
- env.sh: script with all the global variables necessary for OpenNebula deployment.
In the config directory you will find all the necessary scripts for deploying both OpenNebula Frontend and nodes. The main script is auto-config.sh and it will be called by the launching script.
The ONConfigureFrontend.sh, ONConfigureCloud.sh, and ONConfigureNode.sh scripts copy internal scripts to the newly deployed ON frontend and ON nodes, then run them there to apply the necessary settings. Your SSH keys must be configured for this operation to work.
Do a reservation on G5K
Copy the scripts in your home directory:
$ cp -R /home/priteau/bsScripts .
Reserve nodes with hardware virtualization capabilities, and one /22 subnet (1022 IP addresses):
$ oarsub -l {"virtual!='none'"}nodes=3+slash_22=1,walltime=2 -t deploy 'sleep 86400'
[ADMISSION RULE] Modify resource description with type constraints
[ADMISSION RULE] Modify resource description as subnet
Generate a job key...
OAR_JOB_ID=1186545
Connect to your job:
$ oarsub -C OAR_JOB_ID
Specify the number of ON nodes (this does not include the ON frontend) in the global env.sh of OpenNebula scripts (in the directory ~/bsScripts/openNebula). If you reserved and got exactly three nodes, you don't need to change anything:
# The number of machines that need to be deployed
export NO_ON_NODES=2
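The value of NO_ON_NODES is simply the number of reserved nodes minus one, since one node becomes the frontend. As a hedged sketch, it can be derived from the node file that OAR exposes inside a job through the OAR_NODEFILE environment variable. The function name is hypothetical, and the sketch assumes the file lists one hostname per line, possibly repeated once per core.

```python
import os


def count_on_nodes(nodefile_path):
    """Return the number of OpenNebula nodes, excluding the frontend."""
    with open(nodefile_path) as nodefile:
        # Hostnames may be repeated (one line per core), so deduplicate them.
        hosts = set(line.strip() for line in nodefile if line.strip())
    if not hosts:
        raise ValueError("the node file is empty")
    # One of the reserved nodes is used as the OpenNebula frontend.
    return len(hosts) - 1


if __name__ == "__main__":
    path = os.environ.get("OAR_NODEFILE")
    if path:
        print("export NO_ON_NODES=%d" % count_on_nodes(path))
```

With a three-node reservation this prints the same line as the default setting shown above.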
Run the configuration and deploy scripts
Go to the directory ~/bsScripts/launchDepl and run the config_all.sh script:
$ sh config_all.sh
Your cloud should be deployed in a few minutes!
OpenNebula usage
Further configuration
Configuration scripts are copied to the newly deployed images and run there. You can find them in the following directories:
- /srv/cloud/on_utils on the OpenNebula Frontend image (there is a script for configuring the Frontend, doing the necessary settings and another one for configuring and setting the cloud).
- /srv/cloud/on_utils on the OpenNebula Node image.
You can now log in to your OpenNebula cloud frontend:
$ ssh oneadmin@frontend
Build the Cloud Virtual Network
On the oneadmin home on the frontend:
$ cd $HOME/one-templates
You can find OpenNebula templates (configuration files) for the virtual network, the virtual machine image, and the virtual machine.
There are many ways to build a virtual network in OpenNebula (for more information please see [1]). Our virtual network uses fixed leases. The template file (on-network.net) should contain something similar to the following content:
NAME = "on-network"
TYPE = FIXED
#Now we'll use the cluster private network (physical)
BRIDGE = br0
LEASES = [ IP="10.156.0.51"]
LEASES = [ IP="10.x.x.x"]
LEASES = [ IP="10.x.x.x"]
...
Note that the IP addresses are taken from the subnet reservation and written in the template file.
NAME is the name of the virtual network, and TYPE is FIXED in our case. The bridge br0 is the bridge configured on the OpenNebula nodes. The LEASES entries list the IP addresses to be assigned to the deployed virtual machines.
Create the virtual network:
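Writing one LEASES entry per address by hand quickly becomes tedious. The sketch below generates a template of the same shape from a subnet in CIDR notation. It assumes Python 3 (for the `ipaddress` module); the function name and the `10.156.0.0/22` subnet are placeholders, so substitute the subnet you actually reserved.

```python
import ipaddress
from itertools import islice


def network_template(name, bridge, cidr, lease_count):
    """Build a fixed-lease virtual network template for the given subnet."""
    network = ipaddress.ip_network(cidr)
    lines = ['NAME = "%s"' % name, "TYPE = FIXED", "BRIDGE = %s" % bridge]
    # hosts() skips the network and broadcast addresses of the subnet.
    for address in islice(network.hosts(), lease_count):
        lines.append('LEASES = [ IP="%s"]' % address)
    return "\n".join(lines) + "\n"


# Placeholder subnet: replace it with the one from your reservation.
print(network_template("on-network", "br0", "10.156.0.0/22", 3))
```

The output can be saved as on-network.net. Limiting the number of leases keeps the template short when only a few virtual machines are needed.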
$ onevnet create $HOME/one-templates/on-network.net
The output of onevnet list should look like this:
  ID USER     NAME       TYPE  BRIDGE P #LEASES
   0 oneadmin on-network Fixed br0    N       0
Deploy a virtual machine on our OpenNebula cloud
Now we want to deploy an instance of our virtual machine image "squeeze-vm". The image is already registered in the OpenNebula image repository. The template for squeeze-vm is squeeze-vm.one and looks something like this:
NAME = squeeze-vm
CPU = 2
MEMORY = 512
DISK = [ image = "squeeze-vm-image" ]
NIC = [ NETWORK = "OpenNebula network" ]
FEATURES=[ acpi="no" ]
CONTEXT = [
    hostname = "$NAME",
    ip_public = "$NIC[IP, NETWORK=\"Opennebula network\"]",
    files = "/srv/cloud/one/one-templates/init.sh /srv/cloud/one/one-templates/id_rsa.pub /srv/cloud/one/one-templates/test.txt",
    target = "hdc",
    root_pubkey = "id_rsa.pub",
    username = "opennebula",
    user_pubkey = "id_rsa.pub"
]
The variables provide information about the virtual machine in the OpenNebula cloud context (its name, the number of CPUs, the maximum memory, which image to use, and which virtual network to attach). For more information about the template file settings, please see [2].
Create and deploy a squeeze-vm virtual machine:
$ onevm create $HOME/one-templates/squeeze-vm.one
The output of onevm list should look like this:
  ID USER     NAME       STAT CPU MEM HOSTNAME   TIME
   9 oneadmin squeeze-vm runn   0  0K paradent-2 00 00:00:24
That means we have a running squeeze-vm virtual machine on host paradent-2.
To show more information about the virtual machine run:
$ onevm show ID
In this output, note the IP address. You can access your virtual machine by simply running the following command (the password should be grid5000):
$ ssh root@IP-address-of-the-VM
To delete the VM:
$ onevm delete ID
Connect to your virtual machine using a VNC client
Add the following line to the template (squeeze-vm.one):
GRAPHICS = [ type = "vnc", listen = "0.0.0.0", port = "5938"]
Create and deploy a new squeeze-vm virtual machine instance:
$ onevm create $HOME/one-templates/squeeze-vm.one
The output of onevm list should look like this:
  ID USER     NAME       STAT CPU  MEM HOSTNAME        TIME
   3 oneadmin squeeze-vm runn   0 512M griffon-38.nanc 00 00:11:04
It means that we have a running squeeze-vm virtual machine on host griffon-38.nancy.grid5000.fr.
To connect to your VM using your local VNC client, you have to establish an SSH tunnel. From your local machine, type:
$ ssh -L 15938:griffon-38.nancy.grid5000.fr:5938 access.lyon.grid5000.fr
You can now connect to your VM by pointing your VNC client at localhost:15938.
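The tunnel command follows a simple pattern: a local port is forwarded, through the access machine, to the VNC port on the host running the VM. The sketch below builds that command for any host and port. The function name and the convention of adding 10000 to obtain the local port are assumptions chosen to match the example above.

```python
def tunnel_command(vm_host, vnc_port, gateway, local_port=None):
    """Build the ssh command forwarding a local port to a remote VNC port."""
    if local_port is None:
        # Assumed convention, matching the 5938 -> 15938 example above.
        local_port = 10000 + vnc_port
    forward = "%d:%s:%d" % (local_port, vm_host, vnc_port)
    return ["ssh", "-L", forward, gateway]


command = tunnel_command("griffon-38.nancy.grid5000.fr", 5938,
                         "access.lyon.grid5000.fr")
print(" ".join(command))
```

The list form can be passed directly to `subprocess.call` if you want to open the tunnel from a script.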
Using clouds programmatically
To use clouds without manual interaction with the system, we can take advantage of APIs offered by clouds. In this section, as an example, we are going to script the deployment of a cluster of machines on Nimbus, using the EC2 API.
The boto library
boto is a Python library to easily access the AWS APIs, including EC2. First, we need to install it on a frontend:
Installing boto
$ mkdir -p $HOME/local/lib/python2.5/site-packages/
$ tar xzf /home/priteau/boto-2.0b4.tar.gz
$ cd boto-2.0b4
$ export PYTHONPATH=$HOME/local/lib/python2.5/site-packages/
$ python setup.py install --prefix=$HOME/local
Writing a boto script to access Nimbus
Connecting to the cloud
To connect to the cloud, we set up our credentials and initiate connection objects:
access_id = "vIKOIQa5tduXorYRxctyh"
access_secret = "TOZcDmPNFYPii8UTcGXrGiRaW4RNfzm2TcTvqZaNCz"
canonical_id = "620c4a30-90ff-11df-b85b-002332c9497e"
host = options.service
port = 8444
cumulusport = 8888
region = RegionInfo(name="nimbus", endpoint=host)
ec2conn = boto.connect_ec2(access_id, access_secret, region=region, port=port)
cf = OrdinaryCallingFormat()
s3conn = S3Connection(access_id, access_secret, host=host, port=cumulusport, is_secure=False, calling_format=cf)
Uploading a VM image to the image repository
This code shows how to upload a VM image to an S3-compatible image repository such as Cumulus:
# upload the file according to the Nimbus naming convention
bucket = s3conn.get_bucket("Repo")
print "Uploading the VM image to Cumulus"
k = boto.s3.key.Key(bucket)
k.key = "VMS/%s/%s" % (canonical_id, 'squeeze-vm.raw')
if not k.exists():
    k.set_contents_from_filename("/home/priteau/%s" % 'squeeze-vm.raw')
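The naming convention used in this snippet is the same one the cloud client printed as its transfer destination earlier: images are stored in the Repo bucket under VMS/, followed by the user's canonical ID and the image name. A minimal sketch of that convention, with hypothetical helper names, could look like this:

```python
def cumulus_key(canonical_id, image_name):
    """Return the object key under which a user's VM image is stored."""
    return "VMS/%s/%s" % (canonical_id, image_name)


def cumulus_url(canonical_id, image_name, bucket="Repo"):
    """Return the cumulus:// location in the form printed by the cloud client."""
    return "cumulus://%s/%s" % (bucket, cumulus_key(canonical_id, image_name))


print(cumulus_url("620c4a30-90ff-11df-b85b-002332c9497e", "squeeze-vm.raw"))
```

Because the key embeds the canonical ID, an image uploaded with boto under this key appears in the same user's listing in the cloud client.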
Uploading a public key to the cloud
We can also upload a public key to be able to connect to the VMs without a password:
# Upload our SSH public key to the cloud
f = open(os.path.expanduser('~/.ssh/id_rsa.pub'), 'r')
ssh_public_key = f.read()
f.close()
ec2conn.create_key_pair("gsg-keypair||%s" % ssh_public_key)
Starting a VM cluster
Finally, we can start a VM cluster (options.number is the number of VMs to run):
# now run the VMs
reservation_nodes = ec2conn.run_instances('squeeze-vm.raw', min_count=options.number, max_count=options.number, key_name="gsg-keypair")
instances_nodes = reservation_nodes.instances
The full script
Now that we've seen how to interact with the EC2 API, let's try a full example. Copy the following script to ~/ec2.py. Don't forget to run chmod +x ~/ec2.py.
Then you can run it using the following command (replace the node name by your own Nimbus service node):
$ ~/ec2.py -s SERVICE_NODE_NAME -n 2
The -n option specifies the number of VMs to start.
#!/usr/bin/env python
import boto
from boto.s3.key import Key
from boto.s3.connection import OrdinaryCallingFormat
from boto.s3.connection import S3Connection
from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo
import os, sys
from optparse import OptionParser
import shlex, subprocess
import tempfile
import time
parser = OptionParser()
parser.add_option("-s", "--service", dest="service",
                  help="hostname of the Nimbus service", metavar="HOST")
parser.add_option("-n", "--number", dest="number",
                  help="number of VMs to deploy", metavar="NUMBER")
(options, args) = parser.parse_args()
if not options.service or not options.number:
    print "usage: ec2.py --service HOST --number NUMBER"
    sys.exit(1)
access_id = "vIKOIQa5tduXorYRxctyh"
access_secret = "TOZcDmPNFYPii8UTcGXrGiRaW4RNfzm2TcTvqZaNCz"
canonical_id = "620c4a30-90ff-11df-b85b-002332c9497e"
host = options.service
port = 8444
cumulusport = 8888
region = RegionInfo(name="nimbus", endpoint=host)
ec2conn = boto.connect_ec2(access_id, access_secret, region=region, port=port)
cf = OrdinaryCallingFormat()
s3conn = S3Connection(access_id, access_secret, host=host, port=cumulusport, is_secure=False, calling_format=cf)
# upload the file according to the Nimbus naming convention
bucket = s3conn.get_bucket("Repo")
print "Uploading the VM image to Cumulus"
k = boto.s3.key.Key(bucket)
k.key = "VMS/%s/%s" % (canonical_id, 'squeeze-vm.raw')
if not k.exists():
    k.set_contents_from_filename("/home/priteau/%s" % 'squeeze-vm.raw')
# Upload our SSH public key to the cloud
f = open(os.path.expanduser('~/.ssh/id_rsa.pub'), 'r')
ssh_public_key = f.read()
f.close()
ec2conn.create_key_pair("gsg-keypair||%s" % ssh_public_key)
# now run the VMs
reservation_nodes = ec2conn.run_instances('squeeze-vm.raw', min_count=options.number, max_count=options.number, key_name="gsg-keypair")
instances_nodes = reservation_nodes.instances
# wait until every instance has left the 'pending' state
for instance in instances_nodes:
    instance.update()
    while instance.state == 'pending':
        print 'instance is %s' % instance.state
        time.sleep(10)
        instance.update()
    if instance.state != 'running':
        print 'error: instance is in state %s' % instance.state
        sys.exit(1)
for instance in instances_nodes:
    print instance.ip_address
# write the IP addresses of the VMs to a temporary file, one per line
nodes = tempfile.mkstemp()
f = os.fdopen(nodes[0], 'w')
for instance in instances_nodes:
    f.write("%s\n" % instance.ip_address.encode())
f.close()
print "Sleeping for 60 seconds for the VMs to boot"
time.sleep(60)
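The fixed 60-second pause at the end of the script is a simple way to wait for the VMs to boot, but it can be too short or needlessly long. One possible alternative, sketched below, polls the SSH port of each VM until it accepts connections. The function name and default values are illustrative, and the sketch assumes an SSH server listening on port 22 in the VM image.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=300, interval=5):
    """Poll a TCP port until it accepts a connection or the timeout expires.

    Returns True once a connection succeeds and False if the timeout
    (in seconds) is reached first.
    """
    deadline = time.time() + timeout
    while True:
        try:
            connection = socket.create_connection((host, port), timeout=interval)
        except socket.error:
            # Connection refused, unreachable, or timed out: retry until deadline.
            if time.time() >= deadline:
                return False
            time.sleep(interval)
        else:
            connection.close()
            return True
```

In the script, the final sleep could then be replaced by a loop calling `wait_for_port(instance.ip_address)` for each instance.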
