MPI Tutorial

Connect to Grid'5000

  • you should connect to the frontend node of the site you were granted access to:
outside:
ssh login@access.site.grid5000.fr
  • if needed, you can then jump internally to the frontend of another site, depending on where you want to deploy:
frontend:
ssh another_site.grid5000.fr

What you need to know before starting

You need to know how to program in C.

You need a cluster with MPI installed. To check, reserve some nodes with OAR (using the option -t allow_classic_ssh) and type:

node:
mpicc -v
node:
mpirun -V
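
For example, an interactive reservation from the frontend could look like this (the number of nodes and the walltime below are only an example, adapt them to your needs):

frontend:
oarsub -I -l nodes=4,walltime=2 -t allow_classic_ssh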

Start tutorial

Go to: https://computing.llnl.gov/tutorials/mpi/

Exercise 1

  • Read up to the first example.
  • Code this example: example.c (a minimal sketch is given at the end of this exercise).
  • Compile it:
node:
mpicc example.c -o example
  • Create a file with all the nodes/cores of your reservation (one IP/machine name per line). For example:

grelon-5.nancy.grid5000.fr
grelon-22.nancy.grid5000.fr
grelon-110.nancy.grid5000.fr
grelon-59.nancy.grid5000.fr

You can do this using the $OAR_NODEFILE environment variable (it contains one line per reserved core):

node:
cat $OAR_NODEFILE > nodefile

or, to keep only one entry per node:

node:
cat $OAR_NODEFILE | uniq > nodefile
  • Launch the program. For OpenMPI:
node:
mpirun -np 4 -machinefile nodefile example

This launches the example executable on the first 4 machines listed in the file nodefile.

  • From one execution to the next, you should see that the order of the output changes.
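
If you do not have the tutorial at hand, here is a minimal sketch of what example.c could look like (the actual first example of the LLNL tutorial may differ slightly): each process initializes MPI, queries the total number of tasks and its own rank, prints them and terminates.

#include "mpi.h"
#include <stdio.h>

int main(int argc,char *argv[]){
  int numtasks,rank,rc;

  /* Initialize the MPI environment. */
  rc = MPI_Init(&argc,&argv);
  if (rc != MPI_SUCCESS) {
    printf("Error starting MPI program. Terminating.\n");
    MPI_Abort(MPI_COMM_WORLD, rc);
  }

  /* Get the total number of processes and the rank of this one. */
  MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  printf("Number of tasks= %d My rank= %d\n",numtasks,rank);

  /* Shut MPI down. */
  MPI_Finalize();
  return 0;
}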

Exercise 2: token circulation

  • Read up to the Blocking Message Passing Routines section, and run its example.
  • Make a token pass through all MPI processes in order.
  • Process 0 initializes the algorithm by passing the token to process 1, i.e. it sends a message to process 1.
  • The other processes wait for the token: each one waits for a message from process rank-1, then sends the token to process rank+1, except the last process, which sends it back to process 0.
  • Use MPI_Send and MPI_Recv to send and receive messages.
  • Solution here:
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc,char *argv[]){
  int token=0;
  int numtasks,rank,rc;
  MPI_Status stat;

  rc = MPI_Init(&argc,&argv);
  if (rc != MPI_SUCCESS) {
    printf ("Error starting MPI program. Terminating.\n");
    MPI_Abort(MPI_COMM_WORLD, rc);
  }


  MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);


  if(rank==0){
    /* Process 0 starts the circulation and gets the token back at the end. */
    MPI_Send(&token,1,MPI_INT,1,1,MPI_COMM_WORLD);
    printf("[%d]: %d\n",rank,token);
    MPI_Recv(&token,1,MPI_INT,numtasks-1,1,MPI_COMM_WORLD,&stat);
  }else{
    /* The others wait for the token from their predecessor, increment it
       and forward it to their successor (process 0 for the last one). */
    MPI_Recv(&token,1,MPI_INT,rank-1,1,MPI_COMM_WORLD,&stat);
    token++;
    printf("[%d]: %d\n",rank,token);
    MPI_Send(&token,1,MPI_INT,(rank+1)%numtasks,1,MPI_COMM_WORLD);
  }

  MPI_Finalize();
  return 0;
}
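
To try it, the solution can be compiled and launched in the same way as in exercise 1 (the file names token.c, token and nodefile are only suggestions, adapt them to your own):

node:
mpicc token.c -o token
node:
mpirun -np 4 -machinefile nodefile token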

Exercise 3: message exchange

  • Allocate two buffers, r_buffer and s_buffer, of n ints each, n being passed through argv.
  • Each process posts a non-blocking receive (MPI_Irecv) into r_buffer for a message from process numtasks-rank-1.
  • Each process sends s_buffer to process numtasks-rank-1 and waits (MPI_Wait) for its own receive to complete.
  • To pass arguments to your program, just put them at the end of the mpirun line:
node:
mpirun -np 10 -machinefile nodefile example3 1000
  • Solution:
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc,char *argv[]){
  int  numtasks, rank, rc; 
  int *r_buffer,*s_buffer,size;
  MPI_Request req;
  MPI_Status stats;
  

  if(argc!=2){
    printf("usage: %s <nb int>\n",argv[0]);
    exit(-1);
  }

  size=atoi(argv[1]);

  rc = MPI_Init(&argc,&argv);
  if (rc != MPI_SUCCESS) {
    printf ("Error starting MPI program. Terminating.\n");
    MPI_Abort(MPI_COMM_WORLD, rc);
  }
       
  MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  
  s_buffer=(int*)malloc(sizeof(int)*size);
  r_buffer=(int*)malloc(sizeof(int)*size);

  /* Post the non-blocking receive first, then send; the wait completes
     once the message from the partner process has arrived. */
  MPI_Irecv(r_buffer,size,MPI_INT,numtasks-rank-1,1,MPI_COMM_WORLD,&req);

  MPI_Send(s_buffer,size,MPI_INT,numtasks-rank-1,1,MPI_COMM_WORLD);
  MPI_Wait(&req,&stats);
  printf("[%d]: done!\n",rank);
  MPI_Finalize();
  return 0;
}
  • Read the part about collective communication, especially MPI_Barrier and MPI_Reduce.
  • Put a barrier between the MPI_Irecv and the MPI_Send. What does this do exactly?
  • Use MPI_Wtime to time what happens from just after the barrier up to the end of the MPI_Wait.
  • Solution:
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc,char *argv[]){
  int  numtasks, rank, rc; 
  int *r_buffer,*s_buffer,size;
  MPI_Request req;
  MPI_Status stats;
  double start,time,max_time,min_time,sum_time;

  if(argc!=2){
    printf("usage: %s <nb int>\n",argv[0]);
    exit(-1);
  }

  size=atoi(argv[1]);

  rc = MPI_Init(&argc,&argv);
  if (rc != MPI_SUCCESS) {
    printf ("Error starting MPI program. Terminating.\n");
    MPI_Abort(MPI_COMM_WORLD, rc);
  }
       
  MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  
  s_buffer=(int*)malloc(sizeof(int)*size);
  r_buffer=(int*)malloc(sizeof(int)*size);


  MPI_Irecv(r_buffer,size,MPI_INT,numtasks-rank-1,1,MPI_COMM_WORLD,&req);

  MPI_Barrier(MPI_COMM_WORLD);
  start=MPI_Wtime();
  MPI_Send(s_buffer,size,MPI_INT,numtasks-rank-1,1,MPI_COMM_WORLD);
  MPI_Wait(&req,&stats);
  time=MPI_Wtime()-start;
  printf("[%d]: duration=%f\n",rank,time);
  MPI_Finalize();
  return 0;
}
  • Use MPI_Reduce to gather, on process 0, the maximum, the minimum and the average of all the processes' timings. Have process 0 display these statistics.
  • Solution:
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc,char *argv[]){
  int  numtasks, rank, rc; 
  int *r_buffer,*s_buffer,size;
  MPI_Request req;
  MPI_Status stats;
  double start,time,max_time,min_time,sum_time;

  if(argc!=2){
    printf("usage: %s <nb int>\n",argv[0]);
    exit(-1);
  }

  size=atoi(argv[1]);

  rc = MPI_Init(&argc,&argv);
  if (rc != MPI_SUCCESS) {
    printf ("Error starting MPI program. Terminating.\n");
    MPI_Abort(MPI_COMM_WORLD, rc);
  }
       
  MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  
  s_buffer=(int*)malloc(sizeof(int)*size);
  r_buffer=(int*)malloc(sizeof(int)*size);


  MPI_Irecv(r_buffer,size,MPI_INT,numtasks-rank-1,1,MPI_COMM_WORLD,&req);

  MPI_Barrier(MPI_COMM_WORLD);
  start=MPI_Wtime();
  MPI_Send(s_buffer,size,MPI_INT,numtasks-rank-1,1,MPI_COMM_WORLD);
  MPI_Wait(&req,&stats);
  time=MPI_Wtime()-start;
    
  MPI_Reduce(&time, &max_time, 1, MPI_DOUBLE, MPI_MAX, 0,MPI_COMM_WORLD);
  MPI_Reduce(&time, &min_time, 1, MPI_DOUBLE, MPI_MIN, 0,MPI_COMM_WORLD);
  MPI_Reduce(&time, &sum_time, 1, MPI_DOUBLE, MPI_SUM, 0,MPI_COMM_WORLD);
  if(rank==0){
    printf("Max=%f\tMin=%f\tAvg=%f\n",max_time,min_time,sum_time/numtasks);
  }

  MPI_Finalize();
  return 0;
}

Exercise 4: distributed max

  • On process 0, create a random array of n*p ints, p being the number of processes and n being passed through argv.
  • Have process 0 split this array into p subarrays of size n and distribute one piece to each process. This is done with a single MPI_Scatter call. You have to allocate the subarray beforehand.
  • Each process computes the maximum of its local subarray.
  • Use MPI_Reduce to gather the maximum of the array on process 0.
  • Solution:
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int *random_tab(int n){
  int *tab,i;
  tab = (int*)malloc(sizeof(int)*n);

  for(i = 0; i < n; i++)
    //    tab[i]=random();
    tab[i]=i;

  return tab;
}


int max_tab(int *tab,int n){
  int res,i;
  res=tab[0];
  for(i=1;i<n;i++)
    if(tab[i]>res)
      res=tab[i];
 
  return res;
}

int main(int argc,char *argv[]){
  int  numtasks, rank, rc;
  int local_max,max;
  int n,*tab=NULL,*local_tab;

  if(argc!=2){
    printf("usage: %s <nb int>\n",argv[0]);
    exit(-1);
  }

  n=atoi(argv[1]);

  rc = MPI_Init(&argc,&argv);
  if (rc != MPI_SUCCESS) {
    printf ("Error starting MPI program. Terminating.\n");
    MPI_Abort(MPI_COMM_WORLD, rc);
  }
       
  MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);

  if(rank==0)
    tab=random_tab(n*numtasks);
  
  local_tab=(int*)malloc(sizeof(int)*n);

  MPI_Scatter(tab,n,MPI_INT,local_tab,n,MPI_INT,0,MPI_COMM_WORLD);

  local_max=max_tab(local_tab,n);

  printf("local_max[%d]=%d\n",rank,local_max);
  /*Global reduce of the local_max*/
  MPI_Reduce(&local_max,&max,1,MPI_INT,MPI_MAX,0,MPI_COMM_WORLD);
  
  /* result!!! */
  if(rank==0)
    printf("         max=%d\n",max);

  MPI_Finalize();
  return 0;
}

Exercise 5: parallel sort

  • On process 0, build an array of n random ints.
  • Sample the array and build p buckets such that all data in bucket i are smaller than all data in bucket i+1.
  • Send bucket i to process i.
  • Each process sorts its bucket locally (with qsort from <stdlib.h>).
  • Gather the sorted buckets into one sorted buffer with a single MPI_Gatherv call.
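
No solution is given on this page for this last exercise. Below is a minimal sketch of one possible approach, assuming the values come from rand() and the buckets are simply the p equal value ranges of [0, RAND_MAX], so that every value in bucket i is smaller than every value in bucket i+1. The buckets are distributed with MPI_Scatterv rather than individual sends; that is a design choice, not the only way to follow the statement.

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

/* Comparison function for qsort. */
int cmp_int(const void *a,const void *b){
  int x=*(const int*)a, y=*(const int*)b;
  return (x>y)-(x<y);
}

/* Bucket index of value v when [0,RAND_MAX] is cut into p equal ranges. */
int bucket_of(int v,int p){
  return (int)(((long long)v*p)/((long long)RAND_MAX+1));
}

int main(int argc,char *argv[]){
  int numtasks,rank,n,i;
  int *tab=NULL,*buckets=NULL,*sendcounts=NULL,*displs=NULL,*offset=NULL;
  int *local_tab,local_n;

  if(argc!=2){
    printf("usage: %s <nb int>\n",argv[0]);
    exit(-1);
  }
  n=atoi(argv[1]);

  MPI_Init(&argc,&argv);
  MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);

  if(rank==0){
    /* Build the random array and reorder it bucket by bucket. */
    tab=(int*)malloc(sizeof(int)*n);
    buckets=(int*)malloc(sizeof(int)*n);
    sendcounts=(int*)calloc(numtasks,sizeof(int));
    displs=(int*)malloc(sizeof(int)*numtasks);
    offset=(int*)malloc(sizeof(int)*numtasks);

    for(i=0;i<n;i++)
      tab[i]=rand();

    /* Size of each bucket, then its starting offset in the reordered array. */
    for(i=0;i<n;i++)
      sendcounts[bucket_of(tab[i],numtasks)]++;
    displs[0]=0;
    for(i=1;i<numtasks;i++)
      displs[i]=displs[i-1]+sendcounts[i-1];
    for(i=0;i<numtasks;i++)
      offset[i]=displs[i];
    for(i=0;i<n;i++)
      buckets[offset[bucket_of(tab[i],numtasks)]++]=tab[i];
  }

  /* Each process learns the size of its bucket, then receives it. */
  MPI_Scatter(sendcounts,1,MPI_INT,&local_n,1,MPI_INT,0,MPI_COMM_WORLD);
  local_tab=(int*)malloc(sizeof(int)*local_n);
  MPI_Scatterv(buckets,sendcounts,displs,MPI_INT,
               local_tab,local_n,MPI_INT,0,MPI_COMM_WORLD);

  /* Local sort of the bucket. */
  qsort(local_tab,local_n,sizeof(int),cmp_int);

  /* Since bucket i only holds values smaller than those of bucket i+1,
     gathering the sorted buckets back in rank order yields a sorted array. */
  MPI_Gatherv(local_tab,local_n,MPI_INT,
              tab,sendcounts,displs,MPI_INT,0,MPI_COMM_WORLD);

  if(rank==0)
    printf("first=%d last=%d\n",tab[0],tab[n-1]);

  MPI_Finalize();
  return 0;
}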