Running MPI on Grid'5000
When attempting to run MPI on Grid'5000 you will face a number of challenges, ranging from classical setup problems for MPI software to problems specific to Grid'5000. This practical session aims to guide you through the most common use cases, which are:
- setting up and starting OpenMPI on a default environment using allow_classic_ssh
- setting up and starting OpenMPI on a default environment using oarsh
- setting up and starting OpenMPI on a kadeploy image
- setting up and starting OpenMPI to use a high performance interconnect
Pre-requisites
- Basic knowledge of MPI; if you don't know MPI, you can read: Grid_computation
- Get OpenMPI here: http://www.open-mpi.org/software/ompi/v1.4/
Overview
Currently, the default environment is not the same on every site; therefore, you don't have the same version of OpenMPI everywhere. If you want to use OpenMPI for a grid experiment, you will have to install your own MPI version. You have two options:
- install OpenMPI in your home directory (but you should recompile it and install it on all the sites you want to use! It may work by simply copying the compiled files, but this is not guaranteed; the copy approach is sketched below)
- use the same kadeploy image and deploy it on the sites you want to use
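For the first option, note that each site has its own home directory, so the compiled tree must be made available on every site. A minimal sketch of the copy approach, with nancy as a hypothetical target site (recompiling on each site remains the safe path):

# Hypothetical: copy an already-compiled OpenMPI tree from the current site
# to another site's home directory. Binary compatibility between sites
# is not guaranteed.
rsync -az $HOME/openmpi/ nancy.grid5000.fr:openmpi/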
If you are only interested in a single-site experiment, you may use the version provided by the default environment.
Using OpenMPI on a default environment
Compilation
- Make a reservation (or connect to the compilation machine, if available: compil.<node>.site.grid5000.fr) and compile it:

mkdir -p $HOME/src/mpi
oarsub -I
./configure --prefix=$HOME/openmpi/ --with-memory-manager=none
make -j4
- Install it in your home directory (in $HOME/openmpi/):
make install
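For reference, the whole sequence starting from a fresh tarball might look as follows; this is a sketch, assuming version 1.4.1 (the version used later on this page) and the usual OpenMPI download URL pattern:

cd $HOME/src/mpi
# Download and unpack the OpenMPI sources (version and URL pattern assumed)
wget http://www.open-mpi.org/software/ompi/v1.4/downloads/openmpi-1.4.1.tar.gz
tar xzf openmpi-1.4.1.tar.gz
cd openmpi-1.4.1
# oarsub -I opens a shell on a compute node; run the remaining commands inside it
oarsub -I
./configure --prefix=$HOME/openmpi/ --with-memory-manager=none
make -j4
make install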
Setting up and starting OpenMPI on a default environment using allow_classic_ssh

- Reserve some nodes with the allow_classic_ssh job type:

oarsub -I -t allow_classic_ssh -l nodes=3

- We will use a very basic MPI program to test OAR/MPI; create a file $HOME/src/mpi/tp.c and copy the following source:

mkdir -p $HOME/src/mpi
vi $HOME/src/mpi/tp.c

now the sources:
#include <stdio.h>
#include <unistd.h> /* for gethostname */
#include <mpi.h>
#include <time.h>   /* for the work function only */

int main (int argc, char *argv [])
{
  char hostname[257];
  int size, rank;
  int bcast_value = 1;

  gethostname (hostname, sizeof hostname);
  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  MPI_Comm_size (MPI_COMM_WORLD, &size);
  if (!rank) {
    bcast_value = 42;
  }
  MPI_Bcast (&bcast_value, 1, MPI_INT, 0, MPI_COMM_WORLD);
  printf ("%s\t- %d - %d - %d\n", hostname, rank, size, bcast_value);
  fflush (stdout);
  MPI_Barrier (MPI_COMM_WORLD);
  MPI_Finalize ();
  return 0;
}
- Compile your code:

$HOME/openmpi/bin/mpicc src/mpi/tp.c -o src/mpi/tp

- Use this command to launch it:

$HOME/openmpi/bin/mpirun -machinefile $OAR_NODEFILE $HOME/src/mpi/tp
Setting up and starting OpenMPI on a default environment using oarsh
- Reserve some nodes:

oarsub -I -l nodes=3

oarsh is the default connector used when you reserve a node. To be able to use this connector, you need to add the option --mca plm_rsh_agent "oarsh" to mpirun:
$HOME/openmpi/bin/mpirun --mca plm_rsh_agent "oarsh" -machinefile $OAR_NODEFILE $HOME/src/mpi/tp
multiple sites

In this practical session, we will run MPI over multiple sites with kadeploy. If you want to do this with the default environment, the following steps are required (a sketch of the whole sequence follows the list):
- recompile OpenMPI on all the sites you want to use, in the same directory ($HOME/openmpi)
- recompile your MPI application on all the sites, using your OpenMPI
- use oargridsub to reserve nodes on several sites
- build a node file using oargridstat -l
- launch mpirun from the first node of your nodefile, using this nodefile instead of $OAR_NODEFILE
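A hedged sketch of the reservation-to-launch sequence (the cluster names and the grid job id are placeholders; the exact oargridstat options may differ, check its man page):

# Reserve two nodes on each of two hypothetical clusters
oargridsub cluster1:rdef="nodes=2",cluster2:rdef="nodes=2"
# Build a node file from the grid reservation (job id is printed by oargridsub)
oargridstat -l <grid_job_id> | grep -v ^$ > $HOME/gridnodes
# Connect to the first node of the nodefile...
ssh $(head -1 $HOME/gridnodes)
# ...and from there launch mpirun, using this nodefile instead of $OAR_NODEFILE
$HOME/openmpi/bin/mpirun --mca plm_rsh_agent "oarsh" -machinefile $HOME/gridnodes $HOME/src/mpi/tp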
Setting up and starting OpenMPI on a kadeploy image
Building a kadeploy image
Note: you can skip this section and directly use the environment lenny-x64-openmpi, available at sophia.
The default OpenMPI version available in Debian-based distributions is not compiled with high performance libraries like Myrinet/MX, therefore we must recompile OpenMPI from sources. Fortunately, the default images (lenny-x64-XXX) include all the libraries for high performance interconnects, and OpenMPI will find them at compile time.
We will create a kadeploy image based on an existing one.
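The base deployment itself is not detailed on this page; a minimal sketch, assuming a deploy-type job and the lenny-x64-nfs image mentioned below (exact options may vary with the kadeploy version):

# Reserve one node in deploy mode, then deploy the base image on it
oarsub -I -t deploy -l nodes=1
kadeploy3 -e lenny-x64-nfs -f $OAR_NODEFILE -k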
Then connect to the deployed node as root, and install OpenMPI (the commands for these steps are sketched after the list):

- Unarchive OpenMPI
- Configure and compile it
- Add the BLAS library
- Create the image using tgz-g5k
- Copy the image to the frontend
- Copy the description file of lenny-x64-nfs
- Change the image name in the description file:
frontend:

perl -i -pe "s@/grid5000/images/lenny-x64-nfs-2.0.tgz@$HOME/lenny-openmpi.tgz@" $HOME/lenny-openmpi.dsc
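A hedged sketch of the steps above, as they might be run on the deployed node and then on the frontend (version numbers, the package name, the tgz-g5k invocation and the file paths are all assumptions; check your site's documentation):

# On the deployed node, as root:
tar xzf openmpi-1.4.1.tar.gz               # unarchive OpenMPI (version assumed)
cd openmpi-1.4.1
./configure && make -j4 && make install    # configure and compile
apt-get install libblas-dev                # add the BLAS library (package name assumed)
tgz-g5k /tmp/lenny-openmpi.tgz             # create the image (invocation assumed)

# On the frontend:
scp root@<node>:/tmp/lenny-openmpi.tgz $HOME/    # copy the image to the frontend
cp /grid5000/descriptions/lenny-x64-nfs-2.0.dsc $HOME/lenny-openmpi.dsc   # description file (path assumed)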
Using a kadeploy image
single site
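The commands for the single-site case are not preserved on this page; a minimal sketch, assuming a deploy job and the image description file created above:

# Reserve nodes in deploy mode and deploy the custom image on them
oarsub -I -t deploy -l nodes=3
kadeploy3 -a $HOME/lenny-openmpi.dsc -f $OAR_NODEFILE -k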
multiple sites
Choose three clusters from three different sites.
frontend:

oargridsub -t deploy cluster1:rdef="nodes=2",cluster2:rdef="nodes=2",cluster3:rdef="nodes=2"
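The image then has to be deployed on the nodes of each site; a hedged sketch (the grid job id and the per-site nodefiles are placeholders):

# List the nodes of the grid job (id printed by oargridsub), split them per site
oargridstat -l <grid_job_id>
# On each site's frontend, deploy the image on that site's nodes
kadeploy3 -a $HOME/lenny-openmpi.dsc -f <site_nodefile> -k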
Setting up and starting OpenMPI to use high performance interconnect
By default, OpenMPI tries to use any high performance interconnect it can find. This is true only if the libraries were found at compile time (when OpenMPI itself was compiled, not your application). This should be the case if you have built OpenMPI on a lenny-x64 environment.
We will use the NetPIPE tool to check whether the high performance interconnect is really used; download it from this URL: http://www.scl.ameslab.gov/netpipe/code/NetPIPE-3.7.1.tar.gz
cd $HOME/src/mpi
tar zvxf ~/dload/NetPIPE-3.7.1.tar.gz
cd NetPIPE-3.7.1
export PATH=~/openmpi/1.4.1/bin:$PATH
make mpi
Myrinet hardware: MPICH-MX
To reserve one core on two nodes with a Myrinet interconnect:
- oarsub -I -l /nodes=2/core=1 -p "myri2g='YES'"
or
- oarsub -I -l /nodes=2/core=1 -p "myri10g='YES'"
cd $HOME/src/mpi/NetPIPE-3.7.1
$HOME/openmpi/bin/mpirun --mca plm_rsh_agent "oarsh" -machinefile $OAR_NODEFILE NPmpi
You should see output like this:
  0:       1 bytes   4080 times -->    0.31 Mbps in    24.40 usec
  1:       2 bytes   4097 times -->    0.63 Mbps in    24.36 usec
...
122: 8388608 bytes      3 times -->  896.14 Mbps in 71417.13 usec
123: 8388611 bytes      3 times -->  896.17 Mbps in 71414.83 usec
The minimum latency is given by the last column of the first line (for a 1-byte message); the maximum throughput is given by the last line, 896.17 Mbps in this case.
Infiniband hardware: MVAPICH

To reserve one core on two nodes with an Infiniband interconnect:
- oarsub -I -l /nodes=2/core=1 -p "ib10g='YES'"
or
- oarsub -I -l /nodes=2/core=1 -p "ib20g='YES'"
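The benchmark is then launched exactly as in the Myrinet case above (NPmpi is the binary built by make mpi):

cd $HOME/src/mpi/NetPIPE-3.7.1
$HOME/openmpi/bin/mpirun --mca plm_rsh_agent "oarsh" -machinefile $OAR_NODEFILE NPmpi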
Setting up OpenMPI to accept private networks as routable between hosts

See the OpenMPI site for more details: http://www.open-mpi.org/
- Connect to one site and reserve a node:

ssh nancy.grid5000.fr
oarsub -I

- Connect to another site and reserve a node:

ssh rennes.grid5000.fr
oarsub -I
- Try to launch the code that you previously created between the two reserved nodes; a sketch of the required mpirun option follows.
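On Grid'5000 the nodes have private IP addresses that are nevertheless routable between sites, whereas OpenMPI by default treats private addresses as non-routable. A minimal sketch of a cross-site launch (the opal_net_private_ipv4 MCA parameter controls which networks OpenMPI considers private; setting it to an empty value here is an assumption, as is the gridnodes file listing the nodes reserved on both sites):

# Tell OpenMPI to treat no IPv4 range as private/non-routable (assumption),
# and launch over a nodefile that lists the nodes reserved on both sites
$HOME/openmpi/bin/mpirun --mca plm_rsh_agent "oarsh" \
    --mca opal_net_private_ipv4 "" \
    -machinefile $HOME/gridnodes $HOME/src/mpi/tp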