PDSH
From Grid5000
PDSH (Parallel Distributed SHell) issues commands to groups of hosts in parallel. It is a rewrite of IBM's dsh implementation done at LLNL.
Installing PDSH
Debian-way
PDSH is part of the official Debian package repository
Red Hat-way
RPM packages for PDSH are available on its SourceForge project page
Other ways
Tarballs are available on the PDSH SourceForge project page.
Using PDSH
PDSH: issuing commands
Basics
To perform a shell command:
pdsh [options]... command
To abort a running pdsh instance, send it two SIGINT signals (for example, Ctrl-C twice) within one second:
pdsh@frontale: interrupt (one more within 1 sec to abort)
pdsh@frontale: node-1: command in progress
pdsh@frontale: node-2: command in progress
sending SIGTERM to ssh node-1 pid 8298
sending SIGTERM to ssh node-2 pid 8299
pdsh@frontale: interrupt, aborting.
When no command is given, pdsh runs interactively and prompts for commands:
pdsh [options]...
pdsh> command
RCMD Modules
pdsh knows several methods to run commands on remote hosts. These methods are implemented as dynamically loadable modules, called RCMD modules.
- Get available RCMD modules list:
pdsh -L
- Run command via SSH:
pdsh -R ssh [options]... command
- Run command via RSH:
pdsh -R rsh [options]... command
- Set SSH as the default module (this can be done in /etc/environment to apply to all users):
PDSH_RCMD_TYPE=ssh
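For a single user or session, the same variable can be exported from the shell rather than set system-wide. The following is a minimal sketch, assuming a POSIX shell; it only sets the variable and does not call pdsh itself.

```shell
# Select ssh as the default RCMD module for this shell session only.
PDSH_RCMD_TYPE=ssh
export PDSH_RCMD_TYPE

# Show the value that child processes such as pdsh will inherit.
echo "PDSH_RCMD_TYPE=$PDSH_RCMD_TYPE"
```

Adding these lines to a personal shell startup file would make the setting persistent for that user only.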
Targeting nodes
Include
With the -w argument, a list of nodes can be specified to pdsh. This argument accepts a comma-separated list:
pdsh -w node-1,node-3 command
The node list may contain hostlist expressions of the form 'node-[1-5,7]'.
Exclude
With the -x argument, a list of nodes can be excluded from the included list. Like -w, the -x argument accepts a comma-separated list:
pdsh -w node-[1-5,7] -x node-2,node-4 command
The node list may also contain hostlist expressions of the form 'node-[2-3,5]'.
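The effect of combining -w and -x can be pictured as a set difference. The sketch below is only an illustration of that semantics on explicit comma-separated lists; `subtract_hosts` is a hypothetical helper, not part of pdsh.

```shell
# subtract_hosts INCLUDED EXCLUDED
# Both arguments are explicit comma-separated host lists.
# Prints the included hosts that are not in the excluded list.
subtract_hosts() {
    included=$1
    excluded=$2
    result=
    old_ifs=$IFS
    IFS=,
    for host in $included; do
        case ",$excluded," in
            *",$host,"*) ;;                        # excluded: skip it
            *) result=${result:+$result,}$host ;;  # kept: append it
        esac
    done
    IFS=$old_ifs
    echo "$result"
}

# Same hosts as: pdsh -w node-[1-5,7] -x node-2,node-4 command
subtract_hosts node-1,node-2,node-3,node-4,node-5,node-7 node-2,node-4
# prints node-1,node-3,node-5,node-7
```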
Hostlist expressions
As a convenience on clusters with a prefixXXX naming convention, pdsh accepts lists of hosts of the general form prefix[a-b,c-d,e,...]. This is only an alternative to explicit lists of hosts. Some usage examples follow:
- Run command on node-1,node-2,...,node-5:
pdsh -w node-[1-5] command
- Run command on node-7,node-9,node-10:
pdsh -w node-[7,9-10] command
- Run command on node-1,node-2,node-6,...,node-9:
pdsh -w node-[1-9] -x node-[3-5] command
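To make the expansion rule concrete, here is a minimal sketch of how a prefix[a-b,c,...] expression maps to explicit host names. `expand_hostlist` is a hypothetical helper written for illustration, not pdsh's own parser; it assumes a single bracket group at the end of the name and plain decimal numbers without zero padding.

```shell
# expand_hostlist 'prefix[a-b,c,...]' prints one host name per line.
expand_hostlist() {
    expr_in=$1
    case $expr_in in
        *\[*\])
            prefix=${expr_in%%\[*}   # text before '['
            ranges=${expr_in#*\[}    # text after '['
            ranges=${ranges%\]}      # drop the trailing ']'
            ;;
        *)
            echo "$expr_in"          # no brackets: a single host
            return 0
            ;;
    esac
    old_ifs=$IFS
    IFS=,
    for item in $ranges; do
        case $item in
            *-*) first=${item%-*}; last=${item#*-} ;;  # a range a-b
            *)   first=$item;      last=$item ;;       # a single number
        esac
        i=$first
        while [ "$i" -le "$last" ]; do
            echo "$prefix$i"
            i=$((i + 1))
        done
    done
    IFS=$old_ifs
}

expand_hostlist 'node-[7,9-10]'
# prints node-7, node-9 and node-10, one per line
```

The expression is passed in single quotes so that the shell does not try to interpret the brackets itself.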
| Warning | |
|---|---|
Some shells, like | |
Machines file
pdsh can read the list of target nodes from a machines file:
pdsh -a -F $OAR_NODEFILE command
| Warning | |
|---|---|
PDSH versions prior to 2.12 cannot handle fully qualified domain names (FQDN): host names are shortened to their first component. These PDSH versions therefore cannot launch commands across the grid. | |
If no machines file is specified, pdsh uses the default /etc/genders:
pdsh -a command
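A machines file can also be turned into an explicit list for the -w argument. The sketch below assumes a file with one host name per line, possibly with repeated entries (an assumption about the file's layout, not something stated above); `nodes_to_list` and the sample file are hypothetical.

```shell
# nodes_to_list FILE
# Prints the distinct host names in FILE as one comma-separated list.
nodes_to_list() {
    sort -u "$1" | paste -sd, -
}

# Hypothetical stand-in for a machines file with repeated entries.
sample=$(mktemp)
printf '%s\n' node-2 node-1 node-2 node-1 > "$sample"
nodes_to_list "$sample"      # prints node-1,node-2
rm -f "$sample"
```

The result could then be passed to pdsh, for example as pdsh -w "$(nodes_to_list "$OAR_NODEFILE")" command.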
DSHBAK: formatting output
dshbak formats the output of a pdsh command. It prints a header containing the node's name before each node's output. On demand, dshbak can also merge identical output from several nodes; the header then contains a hostlist expression covering all the nodes that produced that output.
- Format output of the root filesystem location:
$ pdsh -w node-2[1-4] "mount | grep 'on / '" | dshbak
----------------
node-24
----------------
/dev/sda7 on / type ext3 (rw,errors=remount-ro)
----------------
node-22
----------------
/dev/sda2 on / type ext3 (rw,errors=remount-ro)
----------------
node-23
----------------
/dev/sda2 on / type ext3 (rw,errors=remount-ro)
----------------
node-21
----------------
/dev/sda2 on / type ext3 (rw,errors=remount-ro)
- Compress output of the root filesystem location:
$ pdsh -w node-2[1-4] "mount | grep 'on / '" | dshbak -c
----------------
node-24
----------------
/dev/sda7 on / type ext3 (rw,errors=remount-ro)
----------------
node-[21-23]
----------------
/dev/sda2 on / type ext3 (rw,errors=remount-ro)
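The idea behind merging identical output can be sketched with awk. This is not dshbak's implementation: `group_by_output` is a hypothetical helper that assumes each input line has the form "host: text", handles only one line of output per host, and prints a plain comma-separated host list where dshbak -c would print a hostlist expression.

```shell
# group_by_output
# Reads "host: text" lines on stdin and prints one header per distinct
# text, listing every host that produced it.
group_by_output() {
    awk -F': ' '
        {
            host = $1
            text = substr($0, length(host) + 3)   # part after "host: "
            if (!(text in hosts)) {
                order[++n] = text                 # remember first-seen order
                hosts[text] = host
            } else {
                hosts[text] = hosts[text] "," host
            }
        }
        END {
            for (i = 1; i <= n; i++) {
                print "----------------"
                print hosts[order[i]]
                print "----------------"
                print order[i]
            }
        }
    '
}

# Sample input in the "host: text" form, reusing the mount lines above.
printf '%s\n' \
    'node-24: /dev/sda7 on / type ext3 (rw,errors=remount-ro)' \
    'node-22: /dev/sda2 on / type ext3 (rw,errors=remount-ro)' \
    'node-23: /dev/sda2 on / type ext3 (rw,errors=remount-ro)' |
    group_by_output
```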
PDCP: copying files
Basics
To perform file copy:
pdcp [options]... src [src2...] dest
As with pdsh, aborting a running pdcp instance requires two SIGINT signals within one second:
pdcp@frontale: interrupt (one more within 1 sec to abort)
pdcp@frontale: node-1: command in progress
pdcp@frontale: node-2: command in progress
sending SIGTERM to ssh node-1 pid 24510
sending SIGTERM to ssh node-2 pid 24511
pdcp@frontale: interrupt, aborting.
RCMD Modules
Like pdsh, pdcp can use a specific RCMD module to connect to remote hosts.
- Copy files via SSH:
pdcp -R ssh [options]... src [src2...] dest
- Copy files via RSH:
pdcp -R rsh [options]... src [src2...] dest
Targeting nodes
pdcp targets nodes with the same inclusion, exclusion and hostlist expression mechanisms as pdsh.
Limitations
pdcp copies files to multiple remote hosts in parallel, with two limitations:
- Source files must be on the local host (i.e. where pdcp runs).
- Each destination node listed must have pdcp installed for the copy to succeed.
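Since the source files must exist on the local host, a small check before calling pdcp can catch mistakes early. The following is a minimal sketch; `check_local_sources` is a hypothetical helper, and the temporary file merely stands in for a real source file.

```shell
# check_local_sources FILE...
# Succeeds only if every given source exists on the local host.
check_local_sources() {
    for src in "$@"; do
        if [ ! -e "$src" ]; then
            echo "missing local source: $src" >&2
            return 1
        fi
    done
    return 0
}

# Demonstration with a temporary file standing in for a real source.
tmpfile=$(mktemp)
if check_local_sources "$tmpfile"; then
    echo "all sources are present locally"
fi
rm -f "$tmpfile"
```

In practice the check could guard the copy itself, for example check_local_sources src && pdcp -w node-[1-5] src dest.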
