CLCAR 2009 OAR Tutorial
This exercise covers running jobs on a cluster. You will learn how to access a Grid'5000 cluster, install your data, run your jobs, and visualize them.
On Grid'5000, you can submit jobs only from inside a site, so you must first connect to one of the sites of the grid.
Once you know where you want to run your job, the command to connect to the site is:
(do not forget to replace login with your Grid'5000 login and site with the chosen site, for example nancy)
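As a minimal sketch of such a connection command: the login jdoe is hypothetical, and the access.site.grid5000.fr host naming is an assumption that does not come from this tutorial, so check the documentation of your site before relying on it.

```shell
# Hypothetical values: replace with your own Grid'5000 login and site.
login="jdoe"
site="nancy"

# Assumed naming scheme for the access machine of a site.
access_host="access.${site}.grid5000.fr"

# Build and show the ssh command that would reach the site.
ssh_cmd="ssh ${login}@${access_host}"
echo "$ssh_cmd"
```

Running the printed command from your own workstation would open the session on the access machine.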
Then connect to the frontend machine to make OAR reservations:
Note that on some sites, the access machine and the frontend are the same server.
Prepare the experiment environment
An experiment usually consists of programs to run and their configuration. The experiments of this practice are based on the HPLinpack benchmark. This benchmark measures the number of floating-point operations per second achieved while solving a random dense linear system.
The programs to run and their configuration are available outside of the Grid'5000 infrastructure. Thus, we need to retrieve this external data and install it on the cluster where we are currently connected.
Finally we prepare the execution environment of the experiment.
Setup data on the cluster
We retrieve our data on the frontend.
The experiment data are stored in a specific directory on the Grid'5000 frontends. They can be retrieved on the frontend using the following command:
Once done, we unpack our experiment data:
A directory called runhpl is now visible in our home directory. This directory will be available on every compute node of this site, thanks to an NFS mount of user home directories.
OAR2, the Grid'5000 batch scheduler, has an interactive mode. This mode connects the user to the first of their allocated nodes.
Submit an interactive job:
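A sketch of an interactive submission, assuming the OAR client tools are installed on the frontend. The -I flag asks oarsub for an interactive job; the guard is only there to keep the snippet harmless on a machine without OAR.

```shell
# -I requests an interactive job: OAR opens a shell on the first allocated node.
submit_cmd="oarsub -I"
echo "Submitting with: $submit_cmd"

# Run the submission only where the OAR client tools are available.
if command -v oarsub >/dev/null 2>&1; then
    $submit_cmd
fi
```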
OAR2 returns a number. This number identifies our submission on the various visualization interfaces:
You were automatically connected to the first node allocated to your job. OAR2 initializes several environment variables that reflect the current submission. Your script can use these variables to adapt to the properties of the current submission:
In particular, the list of your dedicated nodes can be viewed:
Depending on the site where you made the submission, the same node name may appear more than once in this list.
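Because a node name can be repeated, it can help to reduce the list to distinct names. The sketch below assumes the node list is the file pointed to by the OAR_NODEFILE environment variable; that variable name is an assumption here, since the tutorial text does not name it. The two helper functions are illustrative and not part of OAR.

```shell
# Print each distinct node name found in a node list file, one per line.
unique_nodes() {
    sort -u "$1"
}

# Count the distinct node names found in a node list file.
count_unique_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside a job, OAR_NODEFILE is assumed to point at the allocated node list.
if [ -n "${OAR_NODEFILE:-}" ] && [ -f "$OAR_NODEFILE" ]; then
    echo "Allocated nodes ($(count_unique_nodes "$OAR_NODEFILE") distinct):"
    unique_nodes "$OAR_NODEFILE"
fi
```

Outside a job the variable is unset and the snippet prints nothing.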
It is time to run our script:
Results are printed on the standard output. Remember that we measure the number of floating-point operations per second.
Submission is visible on the Monika web interface of the site where it was submitted:
Cluster status cannot be obtained from the command line on the node to which OAR connected you. You need another terminal connected to the oar machine of the site or cluster (do not forget to replace OAR_JOB_ID with your current submission id):
With an interactive submission, the end of the submission is not tied to the end of your script run. It depends on the connection to the main node of the submission, which OAR automatically opened for you. Thus you can run as many scripts as you want before the job deadline.
Your current submission uses a default walltime of 1 hour. If you are still connected to the submission nodes after 1 hour, OAR2 will automatically disconnect you.
Quitting the shell that OAR opened for you upon the oarsub command (i.e. the shell on the first node of your job) terminates the job. Thus, as we are finished with our job here, you can end it by typing:
Web interfaces and command-line tools reflect submission ending:
The experiment data are ready. Before running the job, we are going to analyze the cluster state, which can be visualized in many ways.
oarstat is a command-line tool to view current or planned job submissions.
View every submission:
View the details of every submission:
View the details of a specific submission:
View the status of a specified job:
View every submission from a given user:
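The views listed above can be sketched with the following oarstat invocations. The job id and login are hypothetical placeholders, and the small show helper is not part of OAR: it prints each command and runs it only when oarstat is installed, so the snippet stays harmless elsewhere.

```shell
# Hypothetical values: replace with a real job id and login.
job_id=12345
user_login="jdoe"

# Print a command, then run it only if the tool is installed on this machine.
show() {
    echo "\$ $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

show oarstat                     # every submission
show oarstat -f                  # every submission, with full details
show oarstat -f -j "$job_id"     # full details of one submission
show oarstat -s -j "$job_id"     # status of one job only
show oarstat -u "$user_login"    # submissions of a given user
```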
oarnodes is also a command-line tool. It shows cluster node properties:
Among returned information there is current node state. This state is generally
Absent. When nodes are sick, their state is
oarprint is a tool that pretty-prints the resources of a job.
The command prints a sorted output of the resources of a job with regard to a key property, with a customisable format.
On a job connection node (where $OAR_RESOURCE_PROPERTIES_FILE is defined):
On the submission frontend:
Monika is a web interface that summarizes the information given by oarnodes. It displays the current node state, and it lists current and planned submissions at the bottom of the web page.
Starting from the site list web page, click on the site or the cluster where you are currently connected to view its current state:
Starting from the site list web page, click on the site or the cluster where you are currently connected to briefly try out the web interface options:
Node load, memory usage, CPU usage and so on are available with the Ganglia web interface.
Starting from this interface home page, click on the site or the cluster where you are currently connected to quickly analyze available metrics and diagrams:
By default, Ganglia displays current metric values, but you can also view a summary of these values over the past year.
OAR2, Grid'5000 batch scheduler, has a passive mode. In this mode, a script specified at submission time is run on the main dedicated node. This script must know about dedicated nodes to split its work between them.
Submit a passive job with our script:
Environment variables, described during our interactive submission, are also initialized by OAR for passive jobs.
The submission is visible on the Monika web interface of the site where it was submitted:
You can also view the current cluster status with the command-line tools (do not forget to replace OAR_JOB_ID with your current submission id):
Sometimes it is useful to know when a passive job starts. This can be done with the "--state" or "-s" option of oarstat. This option is optimized to get the status of a specified job faster than the "-f" option, and it is especially designed for use in scripts.
The following script launches the runhpl program in a passive job and waits until the job starts.
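#!/bin/bash
# Submit runhpl as a passive job, then poll until OAR reports it as Running.
my_script="$HOME/runhpl/runhpl"
# Keep only the id from the OAR_JOB_ID=... line printed by oarsub.
oar_job_id=$(oarsub "$my_script" | grep "OAR_JOB_ID" | cut -d '=' -f2)
# File where OAR writes the standard output of the job.
oar_stdout_file="OAR.$oar_job_id.stdout"
until oarstat -s -j "$oar_job_id" | grep Running ; do
    echo "Waiting for the job to start..."
    sleep 1
done
echo "Job $oar_job_id is started!"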
OAR writes the standard output (stdout) and the standard error output (stderr) of the main node, where our script runs, to files named OAR.OAR_JOB_ID.stdname. Depending on your submission id, you should see in your home directory:
Thus you can follow your job live as it runs:
To quit job monitoring, press Ctrl + c.
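A sketch of this live monitoring, assuming it is run from the directory that holds the output files. The job id is a hypothetical placeholder, and the file names simply follow the OAR.OAR_JOB_ID.stdname pattern described above.

```shell
# Hypothetical job id: replace with the number returned by oarsub.
oar_job_id=12345

# File names follow the OAR.OAR_JOB_ID.stdname pattern.
stdout_file="OAR.${oar_job_id}.stdout"
stderr_file="OAR.${oar_job_id}.stderr"

# Follow the standard output as the job writes it (stop with Ctrl + c).
if [ -f "$stdout_file" ]; then
    tail -f "$stdout_file"
else
    echo "No output file yet: $stdout_file (errors would go to $stderr_file)"
fi
```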
You can specify the files that will store the standard output and standard error streams of the job with the -O and -E options of the oarsub command. For example, it is possible to redirect the job output to /dev/null or to another file.
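A sketch of such a redirection, assuming the runhpl script lives in the home directory as earlier in this tutorial. The error file name runhpl.stderr is a hypothetical choice, and the guard only avoids running oarsub on a machine where it is not installed.

```shell
script="$HOME/runhpl/runhpl"
stdout_target="/dev/null"              # discard the standard output
stderr_target="$HOME/runhpl.stderr"    # hypothetical file for the error stream

# Show the submission command, then run it where oarsub is available.
submit_cmd="oarsub -O $stdout_target -E $stderr_target $script"
echo "$submit_cmd"
if command -v oarsub >/dev/null 2>&1; then
    oarsub -O "$stdout_target" -E "$stderr_target" "$script"
fi
```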
An OAR submission automatically ends when its script ends. You can see this on the Monika web interface of the site where the script was submitted. The nodes dedicated to the job have returned to their available state:
The submission left its trace on the DrawGantt web interface of its host site:
The submission has also disappeared from the command-line status:
Our job's results are available at the end of standard output file:
Node number specification
Until now our submissions used the default node number: 1 node.
We are now going to submit an interactive job on 2 nodes:
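A sketch of that submission, assuming the OAR client tools are installed on the frontend. The -l option carries the resource request, here a number of nodes, and -I keeps the job interactive.

```shell
# Number of nodes to request.
node_count=2

# -I: interactive job; -l nodes=N: ask for N nodes.
submit_cmd="oarsub -I -l nodes=${node_count}"
echo "Submitting with: $submit_cmd"

# Run the submission only where the OAR client tools are available.
if command -v oarsub >/dev/null 2>&1; then
    $submit_cmd
fi
```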
We are automatically connected to one of these 2 nodes due to interactive submission. We can learn about our dedicated resources:
As you can read, the benchmark script detects the number of available nodes and can adapt to 1, 2 and 4 nodes:
We can verify that the run also occurs on the other node with another terminal (do not forget to replace OAR_JOB_ID with the current job identifier and hostname with the allocated node name):
Until now our submissions used the default start time and the default duration: immediate start and 1 hour duration. We are going to submit jobs with a specific duration and a delayed start.
Let us run the script on December 7th, 2008 at 3:30pm for a 10-minute duration:
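A sketch of such a delayed submission, assuming the runhpl script lives in the home directory as earlier in this tutorial. The -r option takes the start date and -l walltime sets the duration. The date is the one used in the text, so replace it with a future date before running this for real.

```shell
start_date="2008-12-07 15:30:00"   # December 7th, 2008 at 3:30pm
walltime="0:10:00"                 # 10 minutes, written hours:minutes:seconds
script="$HOME/runhpl/runhpl"

# Show the reservation command, then run it where oarsub is available.
echo "oarsub -r \"$start_date\" -l walltime=$walltime $script"
if command -v oarsub >/dev/null 2>&1; then
    oarsub -r "$start_date" -l "walltime=$walltime" "$script"
fi
```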
With a delayed submission, you can submit your job without specifying a script. In this case, the allocated nodes wait for activity from the job's owner.
The delayed submission appears as Scheduled on the Monika web interface of the site where it was submitted, or on the command line (do not forget to replace OAR_JOB_ID with your submission id):
When the submission start time arrives, you can connect to its main node to run the script interactively or to monitor its run:
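A sketch of that connection, assuming the job has already started. The -C option of oarsub connects to a running job; the job id below is a hypothetical placeholder.

```shell
# Hypothetical job id: replace with the number returned at submission time.
oar_job_id=12345

# -C connects to the main node of a job that is already running.
connect_cmd="oarsub -C ${oar_job_id}"
echo "$connect_cmd"
if command -v oarsub >/dev/null 2>&1; then
    $connect_cmd
fi
```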
The submission does not end when you disconnect from its main node, even if you did not specify a script to run. It ends when the specified script ends or when the requested duration expires. If you did not specify a script and you have finished your work, it is a good idea to release the allocated nodes early:
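A sketch of an early release, assuming the oardel command of the OAR client tools is the intended way to delete a job. The job id is a hypothetical placeholder.

```shell
# Hypothetical job id: replace with your submission id.
oar_job_id=12345

# oardel deletes the job and frees its nodes.
release_cmd="oardel ${oar_job_id}"
echo "$release_cmd"
if command -v oardel >/dev/null 2>&1; then
    $release_cmd
fi
```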
Note that in this case your job will end in the Error state.
Please refer to the following definitions to better understand the different OAR mechanisms: