Ubuntu-ARM64-L4T environment

From Grid5000
Revision as of 16:57, 21 November 2023 by Pneyron (talk | contribs)

This tutorial describes the build of a kadeploy environment for the Nvidia AGX Xavier SoC nodes of the estats cluster in Toulouse. The operating system for this environment is Nvidia's Ubuntu 20.04 with Linux for Tegra.

Deploying Nvidia's supported Ubuntu 20.04 with L4T allows one to easily install Nvidia's tools such as the Tegra-specific version of Cuda (which is not provided in the default Debian environment).

An Ubuntu ARM64 system including Nvidia's L4T

The build of that environment is described in a kameleon recipe, which follows closely what Nvidia documents (although not using Nvidia's sdkmanager, which is cumbersome and in the end optional), with only minimal tricks.

The build, as described in the recipe, does the following:

  • start a Docker container running Ubuntu 20.04 (x86)
  • install Nvidia's Linux for Tegra inside it, plus the CTI Board Support Package required by the estats hardware
  • install the Grid'5000 Ubuntu 20.04 kadeploy environment for aarch64 with NFS support in the Linux for Tegra rootfs directory, and run the Linux for Tegra tools to transform that rootfs into a rootfs suitable for the SoC
  • install additional packages, such as the jetpack or the jetson-stats tool
  • finally: export the rootfs as a new kadeploy environment.

More details in the recipe itself: https://gitlab.inria.fr/grid5000/environments-recipes/-/blob/l4t/from_grid5000_environment/ubuntu2004-arm64-nfsl4t.yaml

Prebuilt environments

Two kadeploy environments for Ubuntu 20.04 ARM64 with L4T are prebuilt and directly usable on the Toulouse site:

  • ubuntu2004-nfsl4t, which is a minimal system (few tools installed) but with Grid'5000 user accounts and NFS supported.
  • ubuntu2004-nfsl4tfull, which adds to ubuntu2004-nfsl4t Nvidia's whole jetpack (Tegra's Cuda and many more) and the jetson-stats tool (jtop).

(see the output of the kaenv3 command in Toulouse for more details)

Example of a kadeploy command to deploy a prebuilt environment on reserved estats nodes:

pneyron@ftoulouse:~$ oarsub -q testing -t exotic -t deploy -p estats -l nodes=2 -I
...
pneyron@ftoulouse:~$ kadeploy3 ubuntu2004-arm64-nfsl4tfull -f $OAR_NODE_FILE
...

Build a customized environment

The following sections of this document explain how to extend the ubuntu2004-nfsl4t kadeploy environment recipe, in order to build a kadeploy environment customized for specific needs. It uses the kameleon tool.

For more information about Kameleon, see:

Prepare the kameleon tool

First, get a job on any x86 node

The build runs on an x86 node, even though the target system is an aarch64 node (Nvidia's L4T installation scripts use qemu for that purpose).

pneyron@ftoulouse:~$ oarsub -q testing -p montcalm -I
# Filtering out exotic resources (estats).
# Set walltime to default (3600 s).
OAR_JOB_ID=439704
# Interactive mode: waiting...
# Starting...
pneyron@montcalm-7:~$

Add the kameleon templates repository for the Grid'5000 environment recipes

If not already added, do as follows:

pneyron@montcalm-7:~$ kameleon repo add grid5000 https://gitlab.inria.fr/grid5000/environments-recipes
Cloning into '/home/pneyron/.kameleon.d/repos/grid5000'...
warning: redirecting to https://gitlab.inria.fr/grid5000/environments-recipes.git/
remote: Enumerating objects: 23035, done.
remote: Counting objects: 100% (2467/2467), done.
remote: Compressing objects: 100% (434/434), done.
remote: Total 23035 (delta 2004), reused 2431 (delta 1987), pack-reused 20568
Receiving objects: 100% (23035/23035), 2.83 MiB | 21.79 MiB/s, done.
Resolving deltas: 100% (14477/14477), done.

Else, just update it:

pneyron@montcalm-7:~$ kameleon repo update grid5000
...

Create the environment recipe

Create the my-ubuntu2004-arm64-nfsl4t recipe, which extends grid5000/from_grid5000_environment/ubuntu2004-arm64-nfsl4t

pneyron@montcalm-7:~$ mkdir my-ubuntu2004-arm64-nfsl4t
pneyron@montcalm-7:~$ cd my-ubuntu2004-arm64-nfsl4t
pneyron@montcalm-7:~/my-ubuntu2004-arm64-nfsl4t$ kameleon new my-ubuntu2004-arm64-nfsl4t grid5000/from_grid5000_environment/ubuntu2004-arm64-nfsl4t
      create  grid5000/from_grid5000_environment/ubuntu2004-arm64-nfsl4t.yaml
      create  grid5000/from_grid5000_environment/base.yaml
      create  grid5000/steps/backend/docker.yaml
      create  grid5000/steps/backend/chroot.yaml
      create  grid5000/steps/aliases/defaults.yaml
      create  grid5000/steps/checkpoints/docker.yaml
      create  grid5000/steps/bootstrap/prepare_docker.yaml
      create  grid5000/steps/bootstrap/start_docker.yaml
      create  grid5000/steps/bootstrap/download_upstream_tarball.yaml
      create  grid5000/steps/bootstrap/extract_upstream_tarball_in_rootfs.yaml
      create  grid5000/steps/disable_checkpoint.yaml
      create  grid5000/steps/export/export_docker_rootfs.yaml
      create  grid5000/steps/export/export_docker_image.yaml
      create  grid5000/steps/export/export_modified_g5k_env.yaml
      create  grid5000/steps/data/helpers/kaenv-customize.py
      create  grid5000/steps/env/bashrc
      create  grid5000/steps/env/functions.sh
      create  my-ubuntu2004-arm64-nfsl4t.yaml

The grid5000/from_grid5000_environment/ubuntu2004-arm64-nfsl4t recipe template does all the hard work and tricks to build an Ubuntu + L4T system. You may look at it. The just-created recipe extends it, and exposes only what's needed for some final customizations.

Adapt the recipe to your needs

Edit the generated recipe file: my-ubuntu2004-arm64-nfsl4t.yaml. Required changes are highlighted.

#==============================================================================
# vim: softtabstop=2 shiftwidth=2 expandtab fenc=utf-8 cc=81 tw=80
#==============================================================================
#
# DESCRIPTION: Recipe to build a customized ARM64 ubuntu 20.04 L4T system
#
#==============================================================================
# This recipe extends another. To look at the steps involved, run:
#   kameleon build -d my-ubuntu2004-arm64-nfsl4t.yaml
# To see the variables that you can override, use the following command:
#   kameleon info my-ubuntu2004-arm64-nfsl4t.yaml
---
extend: grid5000/from_grid5000_environment/ubuntu2004-arm64-nfsl4t.yaml

global:
  grid5000_environment_export_name: "my-ubuntu2004-arm64-nfsl4t"
  grid5000_environment_export_author: "jdoe"
  # To install additional Ubuntu packages, just uncomment and modify the following line.
  #extra_deb_packages: "nvidia-jetpack"
  # To install some Python packages from PyPI, just uncomment and modify the following line.
  #extra_pip3_packages: "jetson-stats"

bootstrap:
  # This section should not be modified
  - "@base"

setup:
  # This section does not need to be modified if you just want to install additional Ubuntu or pip packages
  # Just set the variables above.
  - "@base"
  # If other customizations are needed, additional recipe steps can be added below this line.

export:
  # This section should not be modified
  - "@base"

Uncommenting lines 19 and 21 of the recipe (the extra_deb_packages and extra_pip3_packages variables) installs Nvidia's Jetpack (all tools, including Cuda) and jetson-stats, which provides the jtop command. This is just an example.
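
For scripted builds, the two variables can also be uncommented without opening an editor. The following is only a sketch: the enable_extra_packages helper is hypothetical (it is not part of kameleon or of the Grid'5000 recipes), and it assumes the generated recipe still contains the two commented lines exactly as shown above.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the extra_deb_packages and
# extra_pip3_packages variables of a recipe file, keeping their indentation.
# Other comment lines are left untouched.
enable_extra_packages() {
  recipe=$1
  sed -E 's/^([[:space:]]*)#(extra_(deb|pip3)_packages:)/\1\2/' "$recipe" \
    > "$recipe.tmp" && mv "$recipe.tmp" "$recipe"
}

# Apply it to the recipe created earlier, if it is in the current directory.
if [ -f my-ubuntu2004-arm64-nfsl4t.yaml ]; then
  enable_extra_packages my-ubuntu2004-arm64-nfsl4t.yaml
fi
```

Review the file afterwards and adjust the package lists to your needs before building.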

Build the environment

The build requires qemu-user-static and docker, thus we first install them:

pneyron@montcalm-7:~/my-ubuntu2004-arm64-nfsl4t$ sudo-g5k apt-get install qemu-user-static
...
pneyron@montcalm-7:~/my-ubuntu2004-arm64-nfsl4t$ g5k-setup-docker -t
...
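
Before launching the build, it can help to check that the prerequisites are actually on the PATH. The snippet below is a minimal sketch: missing_tools is a hypothetical helper, and it assumes that the qemu-user-static package provides the qemu-aarch64-static binary (as it does on Debian and Ubuntu).

```shell
#!/bin/sh
# Hypothetical helper: print the name of every command in the argument list
# that cannot be found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools the build relies on: kameleon, docker, and the static qemu binary
# for aarch64 (assumed to come from the qemu-user-static package).
missing=$(missing_tools kameleon docker qemu-aarch64-static)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose, to print the names on one line.
  echo "Missing build prerequisites:" $missing >&2
else
  echo "All build prerequisites found."
fi
```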

We can now launch the build:

pneyron@montcalm-7:~/my-ubuntu2004-arm64-nfsl4t$ kameleon build my-ubuntu2004-arm64-nfsl4t.yaml 
Creating kameleon build directory : /home/pneyron/my-ubuntu2004-arm64-nfsl4t/build/my-ubuntu2004-arm64-nfsl4t
Starting build recipe 'my-ubuntu2004-arm64-nfsl4t.yaml'
Step 1 : bootstrap/_init_bootstrap/_init_0_disable_checkpoint
--> Running the step...
Starting command: "bash"
[local] The local_context has been initialized
Step _init_bootstrap took: 0 secs
Step 2 : bootstrap/prepare_docker/pull_and_tag_base_image
--> Running the step...
...

Our environment is now ready (in our public web space) to be deployed on estats nodes!

  • environment description file: $HOME/public/my-ubuntu2004-arm64-nfsl4t.dsc
  • environment image: $HOME/public/my-ubuntu2004-arm64-nfsl4t.tar.zst
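
A quick sanity check of the export can be scripted as follows. This is a sketch only: check_exported_env is a hypothetical helper, and it merely verifies that the two files listed above exist and are non-empty, not that their content is valid.

```shell
#!/bin/sh
# Hypothetical helper: check that both exported files exist and are non-empty.
# $1: environment name, $2: export directory (defaults to $HOME/public).
check_exported_env() {
  name=$1
  dir=${2:-$HOME/public}
  for f in "$dir/$name.dsc" "$dir/$name.tar.zst"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      return 1
    fi
  done
  echo "environment $name looks complete in $dir"
}

# Usage, after the build has finished:
#   check_exported_env my-ubuntu2004-arm64-nfsl4t
```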

Once we are done with the environment creation, we can quit the current job and create a new one of type deploy, in order to deploy the environment (bare-metal) on an estats node.

Deploy the environment

Reserve an estats node for 2h in a deploy job

(The estats cluster is currently in the testing queue, and is flagged as "exotic".)

pneyron@ftoulouse:~$ oarsub -I -t deploy -t exotic -q testing -p estats -l nodes=1,walltime=2
# Include exotic resources in the set of reservable resources (this does NOT exclude non-exotic resources).
OAR_JOB_ID=1125125
# Interactive mode: waiting...
# Starting...
[OAR] OAR_JOB_ID=1125125
[OAR] Your nodes are:
      estats-4.toulouse.grid5000.fr*64

pneyron@ftoulouse:~$

Open the console of the node

This is optional, but it lets you follow what is going on.

pneyron@ftoulouse:~$ kaconsole3 estats-4

No action is required in the console (any action during the boot/deploy process may jeopardize the deployment).

Deploy the environment on the node

pneyron@ftoulouse:~$ kadeploy3 -a ~/public/my-ubuntu2004-arm64-nfsl4t.dsc -m estats-4
Deployment #D-ea5e7cb0-da6a-4b46-8726-f9f41ee2c115 started
Grab the key file ...
Grab the tarball file ...
Grab the postinstall file ...
Launching a deployment on estats-4.toulouse.grid5000.fr
...
...
...
End of step Deploy[BootNewEnvHardReboot] after 196s
End of deployment for estats-4.toulouse.grid5000.fr after 526s
End of deployment on cluster estats after 526s
Deployment #D-ea5e7cb0-da6a-4b46-8726-f9f41ee2c115 done

The deployment is successful on nodes
estats-4.toulouse.grid5000.fr

We can now ssh to the estats-4 node.
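
As a first check after logging in, one can confirm that the node really runs an ARM64 system. The sketch below makes two assumptions: is_arm64 is a hypothetical helper, and the deployed environment accepts root logins over SSH with your key, which is the usual convention for kadeploy environments.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given machine type, as reported
# by "uname -m", denotes a 64-bit ARM system.
is_arm64() {
  case "$1" in
    aarch64|arm64) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage from the frontend (assumes root SSH access to the deployed node):
#   is_arm64 "$(ssh root@estats-4 uname -m)" && echo "estats-4 runs an ARM64 system"
```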