Grid5000 Metadata Bundler

From Grid5000
Revision as of 10:27, 19 July 2021 by Lbertot (talk | contribs)
Note

This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team.

This page summarizes what you need to know about g5k-metadata-bundler.

Introduction

When running experiments on Grid'5000, users generate metadata across multiple services. This metadata is useful for reproducibility and for scientific dissemination. The g5k-metadata-bundler is a service designed to retrieve this metadata from the different services and bundle it into a single archive. The bundler only retrieves metadata generated by Grid'5000 services; collecting the data generated by the user's experiment is beyond the scope of this application.

Warning

This service is in beta and not yet feature complete.

Usage

g5k-metadata-bundler is installed on every site frontend in Grid'5000 and can only be executed from those frontends.

g5k-metadata-bundler -s SITE -j JOBID [-o OUTPUT]
  -v, --version                    Print g5k-metadata-bundler version
  -s, --job-site SITE              [MANDATORY] Grid'5000 site from which to extract
  -j, --job-id JID                 [MANDATORY] Job id of the OAR job to extract
  -o, --output OUT                 Bundle name to use for the directory/archive
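
As an illustration of the options above, the following sketch checks the two mandatory arguments and prints the resulting command line instead of running it. The function build_bundler_cmd is a hypothetical helper, not part of g5k-metadata-bundler, and the numeric check on the job id is an assumption about OAR job ids rather than documented behaviour of the bundler.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with g5k-metadata-bundler): builds the
# command line from the documented options and prints it (dry run).
build_bundler_cmd() {
    local site="${1:-}" job_id="${2:-}" output="${3:-}"

    # -s and -j are marked [MANDATORY] in the help text.
    if [ -z "$site" ] || [ -z "$job_id" ]; then
        echo "usage: build_bundler_cmd SITE JOBID [OUTPUT]" >&2
        return 1
    fi

    # Assumption: an OAR job id consists of digits only.
    case "$job_id" in
        *[!0-9]*) echo "job id must be numeric: $job_id" >&2; return 1 ;;
    esac

    # -o is optional; add it only when an output name was given.
    if [ -n "$output" ]; then
        printf 'g5k-metadata-bundler -s %s -j %s -o %s\n' "$site" "$job_id" "$output"
    else
        printf 'g5k-metadata-bundler -s %s -j %s\n' "$site" "$job_id"
    fi
}
```

Calling build_bundler_cmd nancy 3003030 prints the same command as the one used in the example further down this page.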

Users do not need to run the bundler on the frontend of the site where the job was executed. The bundler downloads all data pertaining to the queried job and bundles it in an archive named g5k-bundle-SITE-JID.tar.gz or, if an output name has been provided, OUTPUT.tar.gz. The bundle is provided as a tar.gz archive, which can be manipulated with the following commands:

  • Listing
    tar -tzf OUTPUT.tar.gz lists all files contained within the bundle
  • Extraction
    tar -xzf OUTPUT.tar.gz extracts all files to a directory with the same name as the bundle

Users on older versions of Windows might need third-party software (often 7-Zip) to unpack the bundle.
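
The naming rule and the listing command described above can be combined in a small script. This is only a sketch: bundle_archive_name and list_bundle are hypothetical helper names, and the script assumes the archive was written to the current directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reproduces the archive naming rule described above.
# OUTPUT.tar.gz if an output name was given, g5k-bundle-SITE-JID.tar.gz otherwise.
bundle_archive_name() {
    local site="${1:-}" job_id="${2:-}" output="${3:-}"
    if [ -n "$output" ]; then
        printf '%s.tar.gz\n' "$output"
    else
        printf 'g5k-bundle-%s-%s.tar.gz\n' "$site" "$job_id"
    fi
}

# Hypothetical helper: lists the bundle for SITE JOBID [OUTPUT] if the
# archive exists in the current directory (an assumption of this sketch).
list_bundle() {
    local archive
    archive="$(bundle_archive_name "$@")"
    if [ -f "$archive" ]; then
        tar -tzf "$archive"
    else
        echo "no bundle named $archive in the current directory" >&2
        return 1
    fi
}
```

For instance, list_bundle nancy 3003030 looks for g5k-bundle-nancy-3003030.tar.gz, while list_bundle nancy 3003030 myrun looks for myrun.tar.gz.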

Example usage

user@fsophia:~$ g5k-metadata-bundler -s nancy -j 3003030
 Running g5k-metadata-bundler for job 3003030 at nancy
 Downloading https://api.grid5000.fr/stable/sites/nancy/jobs/3003030
 Downloading https://api.grid5000.fr/stable/sites/nancy/clusters/graoully/nodes/graoully-1?version=7f6b81c2621c6ed3a4fac632f213436813495755
 Downloading https://api.grid5000.fr/stable/?version=7f6b81c2621c6ed3a4fac632f213436813495755&deep=true
 Downloading https://api.grid5000.fr/stable/sites/nancy/metrics?job_id=3003030&nodes=graoully-1
 Generating README
 Compressing bundle
 Bundle created at g5k-bundle-nancy-3003030.tar.gz
user@fsophia:~$ ls -lh g5k-bundle-nancy-3003030.tar.gz
 -rw-r--r-- 1 user g5k-users 456K Jul 19 09:50 g5k-bundle-nancy-3003030.tar.gz
user@fsophia:~$ tar -tzf g5k-bundle-nancy-3003030.tar.gz
 g5k-bundle-nancy-3003030/
 g5k-bundle-nancy-3003030/g5k-oarjob-nancy-3003030.json
 g5k-bundle-nancy-3003030/README
 g5k-bundle-nancy-3003030/g5k-resource-nancy-graoully-1-7f6b81c2621c6ed3a4fac632f213436813495755.json
 g5k-bundle-nancy-3003030/g5k-monitoring-nancy-graoully-1-3003030.json
 g5k-bundle-nancy-3003030/g5k-refapi-7f6b81c2621c6ed3a4fac632f213436813495755.json
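
The listing above suggests that each file name starts with a prefix indicating which service the metadata came from. The sketch below annotates bundle entries on that basis. The function describe_bundle_entry is a hypothetical helper, and the descriptions are inferred only from the file names and download URLs in this example, so they should be read as assumptions rather than as an official description of the bundle format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps a bundle entry to a short description, based
# only on the file name prefixes visible in the example listing.
describe_bundle_entry() {
    local name="${1##*/}"   # drop the leading bundle directory
    case "$name" in
        g5k-oarjob-*.json)     echo "OAR job description" ;;
        g5k-resource-*.json)   echo "node description" ;;
        g5k-monitoring-*.json) echo "monitoring metrics" ;;
        g5k-refapi-*.json)     echo "reference API snapshot" ;;
        README)                echo "generated README" ;;
        '')                    echo "bundle directory" ;;
        *)                     echo "unknown" ;;
    esac
}

# Usage: annotate every line of a listing.
# tar -tzf g5k-bundle-nancy-3003030.tar.gz | while IFS= read -r entry; do
#     printf '%s\t%s\n' "$entry" "$(describe_bundle_entry "$entry")"
# done
```

Entries that do not match any known prefix are reported as unknown rather than guessed, since the service is in beta and the set of files may change.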

Bundle contents