Creating Customized Grid'5000 Environments with Chef

Introduction

The goal of this practical session is to introduce Grid'5000 users to Chef, an open source configuration management tool. We will present how Chef can be used to describe and automate large distributed software installations typically used on Grid'5000.

Basic use of Chef consists of applying a set of cookbooks to nodes. A cookbook describes how to install and configure a piece of software; it is written in Ruby using resources offered by the Chef framework. A node configuration is described as a JSON document, which Chef interprets to modify the software installed on the node until it matches the required configuration.

Installing Chef

First, we reserve three nodes in deploy mode on Grid'5000:

$ oarsub -l nodes=3,walltime=2 -t deploy 'sleep 86400'

We connect to the job (replace with your own job ID):

$ oarsub -C OAR_JOB_ID

Then, we deploy a squeeze image on the first node:

$ kadeploy3 -e squeeze-x64-nfs -m `head -1 $OAR_NODE_FILE` -k

Let's connect to the first node:

$ ssh root@`head -1 $OAR_NODE_FILE`

To install Chef, we first need to install RubyGems:

# aptitude update
# aptitude install ruby ruby-dev libopenssl-ruby rdoc ri irb build-essential wget ssl-cert  # These are the required dependencies for RubyGems 
# cd /tmp/
# env http_proxy=http://proxy:3128 wget http://production.cf.rubygems.org/rubygems/rubygems-1.7.2.tgz
# tar xzf rubygems-1.7.2.tgz
# cd rubygems-1.7.2
# ruby setup.rb
# ln -s /usr/bin/gem1.8 /usr/bin/gem

Now that RubyGems is installed, we can install Chef:

# env http_proxy=http://proxy:3128 gem install chef --no-rdoc --no-ri

Finally, we configure it:

# mkdir -p /etc/chef /tmp/chef-solo

Create a /etc/chef/solo.rb file containing the following content:

cookbook_path "/tmp/chef-solo/cookbooks"
file_cache_path "/tmp/chef-solo"

Our image is now ready to be used with Chef. Let's save it. From the frontend:

$ mkdir -p ~/public/{descriptions,images}
$ ssh root@`head -1 $OAR_NODE_FILE` tgz-g5k > ~/public/images/squeeze-x64-chef.tgz

tgz-g5k creates an archive of the modified environment, which takes a couple of minutes. Next, we create the environment description and save it in ~/public/descriptions/squeeze-x64-chef.dsc, replacing $USER and $SITE with your own login and site name:

name : squeeze-x64-chef
version : 1
description : squeeze-x64-nfs with Chef
author : $USER
tarball : http://public.$SITE.grid5000.fr/~$USER/images/squeeze-x64-chef.tgz|tgz
postinstall : /grid5000/postinstalls/debian-x64-nfs-2.3-post.tgz|tgz|traitement.ash /rambin 
kernel : /vmlinuz
initrd : /initrd.img
fdisktype : 83
filesystem : ext3
environment_kind : linux
visibility : shared
demolishing_env : 0
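
The $USER and $SITE placeholders in the description can also be filled in by a script. The following is a minimal Ruby sketch, not part of the original tutorial; the fill_description helper and the sample values jdoe and sophia are hypothetical.

```ruby
# Hypothetical helper: replace the $USER and $SITE placeholders of an
# environment description with concrete values.
def fill_description(text, user, site)
  # gsub matches a String pattern literally, so "$" needs no escaping.
  text.gsub("$USER", user).gsub("$SITE", site)
end

# Two lines taken from the description above.
DESCRIPTION_EXCERPT = <<-EOS
author : $USER
tarball : http://public.$SITE.grid5000.fr/~$USER/images/squeeze-x64-chef.tgz|tgz
EOS

# "jdoe" and "sophia" are sample values; use your own login and site.
puts fill_description(DESCRIPTION_EXCERPT, "jdoe", "sophia")
```

Applying the same helper to the whole description and writing the result to the .dsc file avoids editing each line by hand.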

Now, we deploy this new environment on the two other nodes we reserved:

$ kadeploy3 -a http://public.$SITE.grid5000.fr/~$USER/descriptions/squeeze-x64-chef.dsc -m second-node.$SITE.grid5000.fr -m third-node.$SITE.grid5000.fr -k

You can let this deployment run in the background and move to the next section.

Using Chef

Now that our environment is ready, we are going to use Chef for the first time. Chef can be run in two modes: a client/server model and a stand-alone model (chef-solo). In this tutorial, we only use chef-solo.

The chef-solo program takes as arguments a node configuration in JSON format describing the list of recipes to install, and a cookbook tarball containing the recipes.

Still on the first node of our reservation, we create the JSON file containing our node configuration. Let's name it /tmp/node.json:

{
  "run_list": [ "recipe[java_sun]" ]
}

This tells Chef to install the java_sun recipe, which automates the installation of the Sun Java JDK. First, let's check that Java is not installed:

# java -version
-bash: java: command not found

Now, we can run chef-solo:

# chef-solo -j /tmp/node.json -r http://public.sophia.grid5000.fr/~priteau/cookbooks.tgz

Note that we pass our node.json file with the -j option. With -r, we pass a URL to a remote gzipped tarball of recipes, which is extracted to the cookbook cache (remember the cookbook_path in /etc/chef/solo.rb? That's where the recipes go). In this case, the cookbooks.tgz file contains a java_sun cookbook. You should see output similar to this:

[Mon, 18 Apr 2011 08:43:44 +0200] INFO: Setting the run_list to ["recipe[java_sun]"] from JSON
[Mon, 18 Apr 2011 08:43:44 +0200] INFO: Starting Chef Run (Version 0.9.16)
[Mon, 18 Apr 2011 08:43:44 +0200] INFO: Installing package[sun-java6-jdk] version 6.24-1~squeeze1
[Mon, 18 Apr 2011 08:43:44 +0200] INFO: Pre-seeding package[sun-java6-jdk] with package installation instructions.
[Mon, 18 Apr 2011 08:44:02 +0200] INFO: package[sun-java6-jdk] sending run action to execute[update-java-alternatives] (immediate)
[Mon, 18 Apr 2011 08:44:02 +0200] INFO: Ran execute[update-java-alternatives] successfully
[Mon, 18 Apr 2011 08:44:02 +0200] INFO: Chef Run complete in 17.475364 seconds
[Mon, 18 Apr 2011 08:44:02 +0200] INFO: cleaning the checksum cache
[Mon, 18 Apr 2011 08:44:02 +0200] INFO: Running report handlers
[Mon, 18 Apr 2011 08:44:02 +0200] INFO: Report handlers complete

We can see that Chef not only installs the sun-java6-jdk Debian package, but also runs update-java-alternatives to set this Java version as the default. Check that Java has been successfully installed:

# java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)

Now, run the same Chef command again:

# chef-solo -j /tmp/node.json -r http://public.sophia.grid5000.fr/~priteau/cookbooks.tgz

You should see something like this:

[Mon, 18 Apr 2011 09:18:45 +0200] INFO: Setting the run_list to ["recipe[java_sun]"] from JSON
[Mon, 18 Apr 2011 09:18:45 +0200] INFO: Starting Chef Run (Version 0.9.16)
[Mon, 18 Apr 2011 09:18:45 +0200] INFO: Chef Run complete in 0.055622 seconds
[Mon, 18 Apr 2011 09:18:45 +0200] INFO: cleaning the checksum cache
[Mon, 18 Apr 2011 09:18:45 +0200] INFO: Running report handlers
[Mon, 18 Apr 2011 09:18:45 +0200] INFO: Report handlers complete

We notice that the execution is much faster, and there is no output about the Sun Java JDK. This is because Chef detects that the package has already been installed, and doesn't perform any action.
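
This behavior is what makes repeated Chef runs safe: a resource describes a desired state, and work is done only when the current state differs from it. The plain Ruby sketch below mimics the two runs with a toy model. It is an illustration only, not Chef's actual implementation, and the converge_package and demo_two_runs helpers are hypothetical.

```ruby
require 'set'

# Toy model of a package resource: +installed+ is the set of packages
# currently present on the node.
def converge_package(installed, name)
  if installed.include?(name)
    :up_to_date            # desired state already reached, nothing to do
  else
    installed.add(name)    # stands in for the real installation
    :installed
  end
end

# Apply the same resource twice, like the two chef-solo runs above.
def demo_two_runs
  installed = Set.new
  first  = converge_package(installed, "sun-java6-jdk")
  second = converge_package(installed, "sun-java6-jdk")
  [first, second]
end

puts demo_two_runs.inspect
```

The first call performs the installation and the second one reports that nothing had to be done, which corresponds to the shorter log of the second run.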

Writing Chef cookbooks

It is now time to start writing our own Chef cookbooks! Go to the frontend of the site where you made your reservation, and create an empty repository of Chef cookbooks:

$ gem install rake
$ export PATH=$HOME/.gem/ruby/1.8/bin:$PATH
$ env http_proxy=http://proxy:3128 git clone http://github.com/opscode/chef-repo.git
$ cd chef-repo
$ rake new_cookbook COOKBOOK=hadoop

Let's study what has been generated by this step:

$ cd cookbooks/hadoop
$ ls -1F
README.rdoc
attributes/
definitions/
files/
libraries/
metadata.rb
providers/
recipes/
resources/
templates/
  • README.rdoc: a description of this cookbook
  • attributes: attributes are configuration data that can be used by recipes
  • definitions: let you create new Chef resources
  • files: any file can be stored here and used by recipes
  • libraries: arbitrary Ruby code that recipes can use
  • metadata.rb: metadata for the cookbook, describing its dependencies and supported platforms
  • providers: to support multiple platforms with a single resource
  • recipes: the code doing all system changes is stored here
  • resources: abstraction of a configuration item
  • templates: ERB templates used by recipes

Before starting to write the recipe for our Hadoop cookbook, we have to include its dependencies. Hadoop requires Java, so we are going to include in our repository the java_sun cookbook we used earlier.

$ cd ~/chef-repo/cookbooks
$ cp -R /home/priteau/cookbooks/java_sun .
$ cd hadoop

Edit metadata.rb to add the following line:

depends          "java_sun"

This tells Chef that the hadoop cookbook depends on the java_sun cookbook.

Now, we can start writing our recipe. Edit the recipes/default.rb file and add the following line:

include_recipe "java_sun"

This tells the hadoop default recipe to execute the java_sun default recipe (and thus install the Sun Java JDK if it is not installed).

Next, we describe a command execution:

execute "Install Hadoop" do
  command <<-EOH
  cd /tmp
  wget http://public.sophia.grid5000.fr/~priteau/hadoop-0.20.2.tar.gz
  tar xzf hadoop-0.20.2.tar.gz
  rm -rf /opt/hadoop
  useradd -d /opt/hadoop hadoop
  mkdir -p /opt/hadoop
  mv hadoop-0.20.2/* /opt/hadoop
  chown -R hadoop:hadoop /opt/hadoop
  sed -i 's|# export JAVA_HOME=/usr/lib/j2sdk1.5-sun|export JAVA_HOME=/usr/lib/jvm/java-6-sun|' /opt/hadoop/conf/hadoop-env.sh
  EOH
end

As you can see, we can create an execute block and provide it with a shell script, which is run as root by default. However, this offers little advantage over a regular shell script. Next, we will see how to use resources provided by Chef to make this recipe cleaner and easier to read.

User and group creation

The following creates a hadoop Unix group, and a hadoop Unix user belonging to this group and using /opt/hadoop as home directory.

group "hadoop" do
end

user "hadoop" do
  group "hadoop"
  home "/opt/hadoop"
  shell "/bin/bash"
end

Directory creation

With the following code, we create the home directory of our user. The first block recursively removes the /opt/hadoop directory. The second one creates it and sets the ownership to hadoop:hadoop.

# We clean up the install directory to start from scratch
directory "/opt/hadoop" do
  action :delete
  recursive true
end

# The default directory action is create, so this block creates a /opt/hadoop directory owned by hadoop:hadoop
directory "/opt/hadoop" do
  owner "hadoop"
  group "hadoop"
end

File fetching

The following block fetches the hadoop-0.20.2.tar.gz file from the public HTTP server in Sophia and sets its ownership to hadoop:hadoop. The file is stored in the file cache of Chef (/tmp/chef-solo, configured in the /etc/chef/solo.rb file). The checksum (a SHA-256 hash) allows Chef to avoid downloading the file if it is already present on the node.

remote_file "#{Chef::Config[:file_cache_path]}/hadoop-0.20.2.tar.gz" do
  checksum "94a08444706bb09a4f1bd124e5533fbb483e30f764ce647eb0adc399c7b9b174"
  owner "hadoop"
  group "hadoop"
  source "http://public.sophia.grid5000.fr/~priteau/hadoop-0.20.2.tar.gz"
end
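
To obtain the checksum value for a file you host yourself, you can compute its SHA-256 digest with Ruby's standard Digest library (the sha256sum command gives the same value). This is a minimal sketch: the sha256_of and demo_digest helpers are hypothetical names, and the demonstration hashes a small temporary file rather than the Hadoop tarball.

```ruby
require 'digest'
require 'tmpdir'

# Return the hexadecimal SHA-256 digest of the file at +path+.
def sha256_of(path)
  Digest::SHA256.file(path).hexdigest
end

# Demonstration on a throwaway file containing the three bytes "abc".
# For the recipe above, pass the path of hadoop-0.20.2.tar.gz instead.
def demo_digest
  Dir.mktmpdir do |dir|
    path = File.join(dir, "demo.txt")
    File.open(path, "wb") { |f| f.write("abc") }
    sha256_of(path)
  end
end

puts demo_digest
```

The resulting 64-character string is what goes into the checksum attribute of the remote_file block.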

Command execution

We can now reduce our execute block so that it only extracts the archive and moves the files:

execute "Install Hadoop" do
  cwd "/tmp"
  user "hadoop"
  group "hadoop"
  command <<-EOH
  tar xzf #{Chef::Config[:file_cache_path]}/hadoop-0.20.2.tar.gz
  mv hadoop-0.20.2/* /opt/hadoop
  EOH
end

Templates

With Chef, it is possible to generate files from templates written in ERB. ERB lets you embed Ruby code in a template; this code is run to generate the output file. From within the Chef recipe, we generate a file from a template using a block like this:

template "/opt/hadoop/conf/hadoop-env.sh" do
  source "hadoop-env.sh"
  mode 0644
  owner "hadoop"
  group "hadoop"
  variables(
    :java_home => "/usr/lib/jvm/java-6-sun/"
  )
end

/opt/hadoop/conf/hadoop-env.sh is the path to the output file. hadoop-env.sh is the source template and is stored in templates/default in the cookbook directory. variables is a Ruby hash accessible from the ERB code included in the template.

To create the template file, we start from the hadoop-env.sh file included in the Hadoop source. For convenience, it is available on the Grid'5000 NFS server:

$ cp /home/priteau/hadoop-0.20.2/conf/hadoop-env.sh templates/default/

Edit templates/default/hadoop-env.sh and replace the line:

# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

with the line:

export JAVA_HOME=<%= @java_home %>

The Ruby code between <%= and %> markers is evaluated and its value (here, passed from the Chef hadoop recipe) replaces the block.
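
To see this substitution outside of Chef, the following sketch renders the modified line with Ruby's standard ERB library. The render_template helper is hypothetical and only approximates what the template resource does: it exposes each entry of a hash as an instance variable before evaluating the template.

```ruby
require 'erb'

# Hypothetical helper: render +template+ with every key of +vars+
# available as an instance variable (@java_home, and so on).
def render_template(template, vars)
  context = Object.new
  vars.each do |name, value|
    context.instance_variable_set("@#{name}", value)
  end
  # Evaluate the template with +context+ as self so @variables resolve.
  ERB.new(template).result(context.instance_eval { binding })
end

JAVA_HOME_LINE = "export JAVA_HOME=<%= @java_home %>\n"

puts render_template(JAVA_HOME_LINE, :java_home => "/usr/lib/jvm/java-6-sun/")
```

The printed line is the one that ends up in the generated hadoop-env.sh when the recipe passes the same value through variables.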

We also need templates for the core-site.xml and mapred-site.xml files:

template "/opt/hadoop/conf/core-site.xml" do
  source "core-site.xml"
  mode 0644
  owner "hadoop"
  group "hadoop"
  variables(
    :namenode => node[:hadoop][:namenode]
  )
end

template "/opt/hadoop/conf/mapred-site.xml" do
  source "mapred-site.xml"
  mode 0644
  owner "hadoop"
  group "hadoop"
  variables(
    :jobtracker => node[:hadoop][:jobtracker]
  )
end

templates/default/core-site.xml will configure the name node (the metadata server):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>
  <name>fs.default.name</name>
  <value>hdfs://<%= @namenode %>:54310</value>
</property>

</configuration>

templates/default/mapred-site.xml will configure the job tracker:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>
  <name>mapred.job.tracker</name>
  <value><%= @jobtracker %>:54311</value>
</property>

</configuration>

Final common configuration steps

These final common steps clean up HDFS and kill any Hadoop services that are running.

# We clean up the HDFS directory
directory "/tmp/hadoop-hadoop" do
  action :delete
  recursive true
end

execute "Format HDFS" do
  user "hadoop"
  group "hadoop"
  command <<-EOH
  /opt/hadoop/bin/hadoop namenode -format
  EOH
end

execute "Stop Hadoop" do
  user "hadoop"
  group "hadoop"
  command "pkill -9 java; true"
end

Launching services

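Finally, we can run the Hadoop services. Since we don't run the same services on all nodes, we differentiate between the master node (NameNode and JobTracker) and the slave nodes (DataNode and TaskTracker). Note that the master recipe below also starts a DataNode and a TaskTracker, so the master stores data and runs tasks as well.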

We create a new recipe, recipes/master.rb:

include_recipe "hadoop"

service "NameNode" do
  start_command "su hadoop sh -c '/opt/hadoop/bin/hadoop-daemon.sh start namenode'"
  supports [ :start ]
  action [ :start ]
end

service "DataNode" do
  start_command "su hadoop sh -c '/opt/hadoop/bin/hadoop-daemon.sh start datanode'"
  supports [ :start ]
  action [ :start ]
end

service "JobTracker" do
  start_command "su hadoop sh -c '/opt/hadoop/bin/hadoop-daemon.sh start jobtracker'"
  supports [ :start ]
  action [ :start ]
end

service "TaskTracker" do
  start_command "su hadoop sh -c '/opt/hadoop/bin/hadoop-daemon.sh start tasktracker'"
  supports [ :start ]
  action [ :start ]
end

We create a recipe for the slaves, recipes/slave.rb:


include_recipe "hadoop"

service "DataNode" do
  start_command "su hadoop sh -c '/opt/hadoop/bin/hadoop-daemon.sh start datanode'"
  supports [ :start ]
  action [ :start ]
end

service "TaskTracker" do
  start_command "su hadoop sh -c '/opt/hadoop/bin/hadoop-daemon.sh start tasktracker'"
  supports [ :start ]
  action [ :start ]
end
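
The service blocks in both recipes differ only in the daemon name, so the repetition could be factored out with ordinary Ruby. The sketch below builds the start commands in a loop; the start_command helper and the two constants are hypothetical names introduced for illustration. Since recipes are Ruby code, a similar loop can wrap a service block.

```ruby
# Daemons started by each recipe, as listed in master.rb and slave.rb.
MASTER_DAEMONS = %w{namenode datanode jobtracker tasktracker}
SLAVE_DAEMONS  = %w{datanode tasktracker}

# Build the command used as start_command in the service blocks above.
def start_command(daemon)
  "su hadoop sh -c '/opt/hadoop/bin/hadoop-daemon.sh start #{daemon}'"
end

# Print the commands a slave node would run.
SLAVE_DAEMONS.each { |daemon| puts start_command(daemon) }
```

Keeping the daemon lists in one place makes it harder for the master and slave recipes to drift apart.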

Cookbook execution

To execute our Hadoop cookbook on the first node, first create its JSON description and store it in ~/master.json on the Grid'5000 frontend, replacing first-node.site.grid5000.fr with the full hostname of your first node:

{
  "hadoop": {
    "namenode": "first-node.site.grid5000.fr",
    "jobtracker": "first-node.site.grid5000.fr"
  },
  "run_list": [ "recipe[hadoop::master]" ]
}
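
Because the same hostname appears twice, the file can also be generated rather than typed by hand. The following sketch uses Ruby's json library; the master_config helper is a hypothetical name, and the hostname is the placeholder from the example above.

```ruby
require 'json'

# Hypothetical helper: build the node configuration for the master,
# using +master_host+ as both NameNode and JobTracker.
def master_config(master_host)
  {
    "hadoop" => {
      "namenode"   => master_host,
      "jobtracker" => master_host
    },
    "run_list" => [ "recipe[hadoop::master]" ]
  }
end

# Replace the placeholder with the full hostname of your first node.
puts JSON.pretty_generate(master_config("first-node.site.grid5000.fr"))
```

Redirecting the output of this script to ~/master.json produces the file described above.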

Create an archive of the cookbooks:

$ tar -C ~/chef-repo -cvzf ~/cookbooks.tgz ./cookbooks

On the first node, as root, run the following command, replacing USER with your Grid'5000 login:

# chef-solo -j /home/USER/master.json -r /home/USER/cookbooks.tgz

Creating a slave.json file to configure the slave nodes is left as an exercise for the reader.

Resources

A Can of Condensed Chef Documentation
