Storage5k

From Grid5000
Revision as of 11:49, 5 April 2018 by Pneyron (talk | contribs)

Storage5k is a tool for reserving large storage volumes, providing much more space than is available in a user's /home. Unlike other resource reservations in Grid'5000, the duration of a storage5k reservation is not limited, so data can be used in long-term experiments.

The storage reserved by storage5k is accessible on Grid'5000 nodes through NFS mounts. Mechanisms are provided to make this storage directly available on nodes you reserve on the same site as the storage, or to mount it on deployed nodes.

Warning

Storage5k provides long-term storage for experiments, but data integrity is not guaranteed. Stored data may be lost because of an incident or for any other reason (such as an unexpected end of reservation). You MUST back up important data.

Warning

Storage5k used to export volumes through the iSCSI protocol, but this functionality is no longer available. Only NFS is provided.

Storage5k resources pool

Not all sites provide storage5k volumes. The following table lists the volumes per site:

Site    Server name                      Size  Status
Sophia  stock.sophia.grid5000.fr         2TB   OK
Rennes  srv-bigdata.rennes.grid5000.fr   6TB   OK

Usage

The Storage5k tool is available on Grid'5000 frontends. You can check if the storage5k command is installed by running:

frontend: storage5k -v

An individual Storage5k resource is called a "chunk", and represents the smallest allocatable unit of storage. A typical chunk size is 10GB, but it may vary among Grid'5000 sites. To display the chunk size, use the following command:

frontend: storage5k -a chunk_size

Reservation

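Let's say that you need 50GB of space for one day to carry out your experiment. To reserve this storage, use the following command, replacing number with the number of chunks required (5 if the chunk size is 10GB):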

frontend: storage5k -a add -l chunks=number,walltime=24
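The number of chunks follows from the requested size and the site's chunk size, rounded up to a whole chunk. A minimal sketch of that arithmetic, assuming both sizes are given in whole GB; the chunks_needed function is illustrative and not part of storage5k:

```shell
# Hypothetical helper (not part of storage5k): number of chunks needed
# to hold $1 GB when the site's chunk size is $2 GB, rounded up.
chunks_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

chunks_needed 50 10
```

A request that is not a multiple of the chunk size rounds up: 55GB with 10GB chunks needs 6 chunks.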

You can get information on your reserved storage with:

frontend: storage5k -a info

Note the Job_Id field: a Storage5k reservation is actually an OAR job. You can use any of the OAR tools, such as:

frontend: oarstat -f -j storage_job_id

Another important field is Source nfs. It displays the NFS mount point where your storage is exported. Note that your storage is already available from the frontend:

frontend: ls /data/username_storage_job_id

frontend: cp my_big_data /data/username_storage_job_id/
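In scripts, the directory can be built from the user name and the storage job ID, following the /data/username_storage_job_id pattern used above. A small sketch; the storage_dir function and the job ID 123456 are made up for illustration:

```shell
# Hypothetical helper: build the /data path of a storage reservation
# from a user name ($1) and a storage job ID ($2).
storage_dir() {
    echo "/data/${1}_${2}"
}

# 123456 is a placeholder; use the Job_Id reported by "storage5k -a info".
storage_dir "$USER" 123456
```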

Access to the storage

Nodes you reserve will automatically have access to the storage. For instance, try this:

frontend: oarsub -l nodes=3 -I

node: ls /data/username_storage_job_id/
lost+found my_big_data

node: oarsh othernode

othernode: ls /data/username_storage_job_id/
lost+found my_big_data

Access in deploy jobs

There is one exception, however: deployed nodes do not mount your reserved storage by default. Let's try:

frontend: oarsub -l nodes=3 -t deploy -I

frontend: kadeploy3 -e debian9-x64-nfs -f $OAR_NODE_FILE -k

frontend: ssh node

node: ls /data
ls: cannot access /data: No such file or directory

node: exit

For this situation, Storage5k provides a way to mount the reserved space on your nodes. The following command mounts your storage on all the nodes belonging to the job nodes_job_id:

frontend: storage5k -a mount -j nodes_job_id

frontend: ssh node

node: ls /data/username_storage_job_id/
lost+found my_big_data

You can unmount the storage with:

frontend: storage5k -a umount -j nodes_job_id
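When the mount is only needed for the duration of one command, the two calls can be paired so that the unmount always runs. A minimal sketch built on the mount and umount actions shown above; the with_storage function and the STORAGE5K override variable are assumptions of this sketch, not storage5k features:

```shell
# Sketch: mount the storage on the nodes of a deploy job, run a command,
# then unmount even if the command fails.
# STORAGE5K may be set to another program (for example "echo") for a dry
# run; this variable is an assumption of the sketch.
# Usage: with_storage nodes_job_id command [args...]
with_storage() {
    job_id=$1
    shift
    "${STORAGE5K:-storage5k}" -a mount -j "$job_id" || return 1
    status=0
    "$@" || status=$?
    "${STORAGE5K:-storage5k}" -a umount -j "$job_id"
    return "$status"
}
```

The function returns the exit status of the wrapped command, so a failing experiment step is still reported after the storage has been unmounted.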

Manual setup of the access

With the knowledge of the Source nfs entry, you can manually mount your storage as well:

frontend: ssh root@node   #As root

root@node: mount storage5k.lyon.grid5000.fr:/data/username_storage_job_id /mnt

root@node: exit

frontend: ssh node        #As normal user

node: ls /mnt/
lost+found my_big_data

node: exit
Warning

Accessing storage on deployed nodes requires the nfs-common package. It is included by default in the nfs, big and std environments.
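On a Debian-based environment, one way to confirm that the package is present before mounting is to query dpkg. A minimal sketch, assuming dpkg-query is available on the node; the has_package function name is illustrative:

```shell
# Hypothetical helper: succeed only if the named Debian package is installed.
has_package() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q "install ok installed"
}

if has_package nfs-common; then
    echo "nfs-common is installed"
else
    echo "nfs-common is missing"
fi
```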