Syncing data

Grid'5000 does not provide a backup facility, but each site has its own file server, independent of those at other sites. A user can therefore keep backups by synchronizing data between sites.

On Grid'5000, a user has 9 home directories, one per site. It is the user's responsibility to synchronize any data that is needed all over the grid.

Sync a directory from site to site

To avoid confusion between what is synced and what is not, synced data is kept inside a directory named synced. To synchronize a directory, rsync is used:

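rsync --dry-run --delete -avz ~/synced site:

To actually perform the synchronization, remove the --dry-run argument and replace site with a real site name. The trailing colon tells rsync that site is a remote host, so ~/synced is copied into the home directory there, and --delete removes remote files that no longer exist locally. This assumes the site name resolves as a host name from the machine where the command is run.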

Note: if a password prompt appears during execution, read about the SSH public key connection method before going any further.
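
A quick way to find out whether a password prompt would appear is to ask ssh to fail rather than prompt. The sketch below relies on the standard OpenSSH options BatchMode and ConnectTimeout; the function name can_login is hypothetical, and site stands for a real site name that is assumed to resolve as a host name.

```shell
#!/bin/sh
# Return success if key-based login to the given host works without a prompt.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
can_login() {
  ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Example (site is a placeholder for a real site name):
#   can_login site && echo "key-based login works" || echo "set up an SSH key first"
```

If the check fails, rsync would also stop to ask for a password, which is a problem for unattended runs.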

By adding a loop, we can easily synchronize our favorite sites:

for site in bordeaux lyon toulouse; do
  rsync --delete -avz ~/synced "${site}:"
done

Note: be careful not to synchronize the source site with itself. Leave the site you are running on out of the list.
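
One way to guard against this is to filter the local site out of the list before looping. This is only a sketch: the function name other_sites is hypothetical, and the local site name is passed in by hand rather than detected automatically.

```shell
#!/bin/sh
# Print every site in the list except the local one, one per line.
# Usage: other_sites LOCAL_SITE SITE...
other_sites() {
  local_site=$1
  shift
  for site in "$@"; do
    # Only keep sites that differ from the one we are running on.
    if [ "$site" != "$local_site" ]; then
      printf '%s\n' "$site"
    fi
  done
}

# Example use when running on lyon (assumes site names resolve as host names):
#   for site in $(other_sites lyon bordeaux lyon toulouse); do
#     rsync --delete -avz ~/synced "${site}:"
#   done
```

The same site list can then be shared between all sites, with only the first argument changing from one site to the next.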

Schedule synchronization every night

To run the synchronization periodically, cron is used. We have to edit our personal crontab, the table that drives cron:

crontab -e

Then schedule the synchronization to run every day at 4:00 am:

00 4    *   *   *  for site in bordeaux lyon toulouse; do rsync --delete -avz ~/synced "${site}:"; done

Note: only one line is allowed per cron schedule definition
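
Because of this one-line limit, a longer job can be easier to maintain as a small script that cron calls. The following is a minimal sketch, not an official Grid'5000 tool: the script path ~/bin/sync-sites.sh, the function name sync_sites and the RSYNC override variable are all hypothetical, and site names are assumed to resolve as host names.

```shell
#!/bin/sh
# ~/bin/sync-sites.sh (hypothetical path): push ~/synced to each site given
# on the command line.
# RSYNC can be overridden from the environment, for example for a dry run:
#   RSYNC="rsync --dry-run" ~/bin/sync-sites.sh bordeaux toulouse
sync_sites() {
  for site in "$@"; do
    # ${RSYNC:-rsync} is left unquoted on purpose so extra options are split.
    ${RSYNC:-rsync} --delete -avz "$HOME/synced" "${site}:" \
      || echo "sync to ${site} failed" >&2
  done
}

# The crontab line then only names the script and the target sites:
#   00 4 * * * $HOME/bin/sync-sites.sh bordeaux lyon toulouse
sync_sites "$@"
```

A failure for one site is reported on standard error and does not stop the remaining sites from being synchronized.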

