Ubuntu-ARM64-L4T environment: Difference between revisions

From Grid5000

Revision as of 18:14, 21 November 2023

This tutorial describes the build of a kadeploy environment for the Nvidia AGX Xavier SoC nodes of the estats cluster in Toulouse. The operating system installed in this environment is Nvidia's Ubuntu 20.04 with Linux for Tegra.

Deploying Nvidia's supported Ubuntu 20.04 with L4T makes it easy to install Nvidia's tools, such as the Tegra-specific version of Cuda. That version is not provided in the environment deployed by default on the estats nodes, which is a Debian system that supports the Grid'5000 stack (the oar-node service, g5k-check, sudo-g5k, etc.).

An Ubuntu ARM64 system including Nvidia's L4T

Grid'5000 provides the ubuntu2004-arm64-nfsl4t environment, which supports Grid'5000 user accounts (LDAP and NFS) in addition to the L4T-specific services and Nvidia Jetpack (Cuda and more). This environment follows the same build process as all other Grid'5000 supported environments: it uses qemu-system and builds on the pyxis cluster only, with some hacks.

Note: Only environments that include "l4t" in their name are compatible with the estats cluster. Other ARM64 environments are not supported on estats: they do not boot.

Furthermore, Grid'5000 provides the from_grid5000_environment/ubuntu2004-arm64-nfsl4tcustom.yaml kameleon recipe template that allows one to build a custom Ubuntu 20.04 for the estats cluster. That kameleon recipe follows more closely what Nvidia documents (although not using Nvidia's sdkmanager, which is cumbersome and optional), with only minimal required tricks.

The build, as described in the recipe template, must be run on an X86 machine and does the following:

  • Start a Docker container running Ubuntu 20.04 (X86).
  • Install Nvidia's Linux for Tegra inside the container, plus the CTI Board Support Package (required by the estats hardware).
  • Extract the tarball image of the Grid'5000 Ubuntu 20.04 environment for ARM64 with NFS support into the Linux for Tegra rootfs directory.
  • Run the Linux for Tegra tool that adds to the rootfs everything that is required for the SoC to function.
  • Optionally install some additional packages (deb or pip).
  • Finally: export the rootfs as a new kadeploy environment for estats (ARM64).

More details in the recipe itself: https://gitlab.inria.fr/grid5000/environments-recipes/-/blob/l4t/from_grid5000_environment/ubuntu2004-arm64-nfsl4tcustom.yaml

Deploying the prebuilt ubuntu2004-arm64-nfsl4t environment

Example of a kadeploy command to deploy the prebuilt environment on a set of reserved estats nodes:

pneyron@ftoulouse:~$ oarsub -q testing -t exotic -t deploy -p estats -l nodes=3 -I
...
pneyron@ftoulouse:~$ kadeploy3 ubuntu2004-arm64-nfsl4t -f $OAR_NODE_FILE
...
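
The OAR node file usually lists each reserved node once per core, so a deduplicated list is handy for follow-up commands. The following is a minimal sketch, not part of the Grid'5000 tooling: the helper name reserved_nodes is hypothetical, and the SSH loop in the comment assumes root access on the deployed nodes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each reserved node once.
# Reads the file given as first argument, or $OAR_NODE_FILE by default.
reserved_nodes() {
  sort -u "${1:-$OAR_NODE_FILE}"
}

# Possible usage inside the deploy job (assumes root SSH access after deployment):
# for node in $(reserved_nodes); do
#   ssh "root@$node" uname -m
# done
```

Keeping the file name as an optional argument makes the helper usable outside a job, for instance on a saved copy of the node file.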

Build a customized environment

The following sections of this document explain how to extend the from_grid5000_environment/ubuntu2004-arm64-nfsl4tcustom recipe template in order to build a kadeploy environment customized for a specific need, using the kameleon tool.

For more information about Kameleon, see:

Prepare the kameleon tool

First, get a job on any X86 node

The build will run on an X86 node, even though the target system is an ARM64 (aarch64) node (Nvidia's L4T installation scripts use qemu-user for that purpose).

pneyron@ftoulouse:~$ oarsub -q testing -p montcalm -I
# Filtering out exotic resources (estats).
# Set walltime to default (3600 s).
OAR_JOB_ID=439704
# Interactive mode: waiting...
# Starting...
pneyron@montcalm-7:~$
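
Since the build must run on an X86 machine, a quick pre-flight check on the node can catch a wrong reservation early. This is a sketch only: the function name is_x86_build_host is hypothetical, and it assumes that uname -m reports x86_64 on suitable hosts.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: the recipe must be built on an X86 host.
# Takes a machine name as printed by `uname -m`.
is_x86_build_host() {
  [ "$1" = "x86_64" ]
}

arch="$(uname -m)"
if is_x86_build_host "$arch"; then
  echo "OK: $arch host, the build can run here"
else
  echo "Warning: $arch host, the recipe expects an X86 machine" >&2
fi
```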

Add the kameleon templates repository for the Grid'5000 environment recipes

If not already added, do as follows:

pneyron@montcalm-7:~$ kameleon repo add grid5000 https://gitlab.inria.fr/grid5000/environments-recipes
Cloning into '/home/pneyron/.kameleon.d/repos/grid5000'...
warning: redirecting to https://gitlab.inria.fr/grid5000/environments-recipes.git/
remote: Enumerating objects: 23035, done.
remote: Counting objects: 100% (2467/2467), done.
remote: Compressing objects: 100% (434/434), done.
remote: Total 23035 (delta 2004), reused 2431 (delta 1987), pack-reused 20568
Receiving objects: 100% (23035/23035), 2.83 MiB | 21.79 MiB/s, done.
Resolving deltas: 100% (14477/14477), done.

Else, just update it:

pneyron@montcalm-7:~$ kameleon repo update grid5000
...
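
The choice between adding and updating the repository can be scripted. The sketch below is an illustration under assumptions: the helper name kameleon_repo_action is hypothetical, and the default path is taken from the "Cloning into" message shown above, so it may differ on other setups.

```shell
#!/usr/bin/env bash
# Sketch: pick "add" or "update" depending on whether the repository
# directory already exists. The default path is the one printed by the
# "Cloning into" message of `kameleon repo add`.
kameleon_repo_action() {
  local repo_dir="${1:-$HOME/.kameleon.d/repos/grid5000}"
  if [ -d "$repo_dir" ]; then
    echo "update"
  else
    echo "add"
  fi
}

# Possible usage on the build node (commented out to avoid side effects):
# case "$(kameleon_repo_action)" in
#   add)    kameleon repo add grid5000 https://gitlab.inria.fr/grid5000/environments-recipes ;;
#   update) kameleon repo update grid5000 ;;
# esac
```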

Create the environment recipe

Create the my-ubuntu-for-estats recipe, which extends the from_grid5000_environment/ubuntu2004-arm64-nfsl4tcustom template

pneyron@montcalm-7:~$ mkdir -p my_kameleon_recipes && cd my_kameleon_recipes
pneyron@montcalm-7:~/my_kameleon_recipes$ kameleon new my-ubuntu-for-estats grid5000/from_grid5000_environment/ubuntu2004-arm64-nfsl4tcustom
      create  grid5000/from_grid5000_environment/ubuntu2004-arm64-nfsl4tcustom.yaml
      create  grid5000/from_grid5000_environment/base.yaml
      create  grid5000/steps/backend/docker.yaml
      create  grid5000/steps/backend/chroot.yaml
      create  grid5000/steps/aliases/defaults.yaml
      create  grid5000/steps/checkpoints/docker.yaml
      create  grid5000/steps/bootstrap/prepare_docker.yaml
      create  grid5000/steps/bootstrap/start_docker.yaml
      create  grid5000/steps/bootstrap/download_upstream_tarball.yaml
      create  grid5000/steps/bootstrap/extract_upstream_tarball_in_rootfs.yaml
      create  grid5000/steps/disable_checkpoint.yaml
      create  grid5000/steps/export/export_docker_rootfs.yaml
      create  grid5000/steps/export/export_docker_image.yaml
      create  grid5000/steps/export/export_modified_g5k_env.yaml
      create  grid5000/steps/data/helpers/kaenv-customize.py
      create  grid5000/steps/env/bashrc
      create  grid5000/steps/env/functions.sh
      create  my-ubuntu-for-estats.yaml

The grid5000/from_grid5000_environment/ubuntu2004-arm64-nfsl4tcustom recipe template does all the hard work and tricks to build an Ubuntu + L4T system. You may look at it. The just-created recipe extends it, and exposes only what's needed for some final customizations.

Adapt the recipe to your needs

Edit the generated recipe file: my-ubuntu-for-estats.yaml. Required changes are highlighted.

#==============================================================================
#
# DESCRIPTION: Recipe to build a customized ARM64 ubuntu 20.04 L4T system
#
#==============================================================================
# This recipe extends another. To look at the step involved, run:
#   kameleon build -d my-ubuntu-for-estats.yaml
# To see the variables that you can override, use the following command:
#   kameleon info my-ubuntu-for-estats.yaml
---
extend: grid5000/from_grid5000_environment/ubuntu2004-arm64-nfsl4tcustom.yaml

global:
  grid5000_environment_export_name: "my-ubuntu-for-estats"
  grid5000_environment_export_author: "jdoe"
  # To install additional Ubuntu packages, just uncomment and modify the following line.
  #extra_deb_packages: "nvidia-jetpack"
  # To install some Python packages from PyPI, just uncomment and modify the following line.
  #extra_pip3_packages: "jetson-stats"

bootstrap:
  # This section should not be modified
  - "@base"

setup:
  # This section does not need to be modified if you just want to install additional Ubuntu or pip packages
  # Just set the variables above.
  - "@base"
  # If other customizations are needed, additional recipe steps can be added below this line.

export:
  # This section should not be modified
  - "@base"

Uncommenting the extra_deb_packages and extra_pip3_packages lines will install Nvidia's Jetpack (all tools, including Cuda) and jetson-stats, which provides the jtop command. This is just an example.
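
The two package variables can also be enabled from the command line instead of a text editor. The following is a minimal sketch, assuming GNU sed and the variable names shown in the generated recipe; the helper name enable_recipe_extras is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: uncomment the extra_deb_packages and
# extra_pip3_packages variables in a generated recipe, in place.
# Only lines starting with "#extra_..." are touched; ordinary
# comments ("# ...") are left alone. Assumes GNU sed (-i, -E).
enable_recipe_extras() {
  sed -i -E 's/^([[:space:]]*)#(extra_(deb|pip3)_packages:)/\1\2/' "$1"
}

# Possible usage:
# enable_recipe_extras my-ubuntu-for-estats.yaml
```

The package lists themselves still need to be adapted by hand if something other than the example values is wanted.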

Build the environment

The build requires qemu-user-static and docker, thus we first install them:

pneyron@montcalm-7:~/my_kameleon_recipes$ sudo-g5k apt-get install qemu-user-static
...
pneyron@montcalm-7:~/my_kameleon_recipes$ g5k-setup-docker -t
...

We can now launch the build:

pneyron@montcalm-7:~/my_kameleon_recipes$ kameleon build my-ubuntu-for-estats.yaml 
Creating kameleon build directory : /home/pneyron/my_kameleon_recipes/build/my-ubuntu-for-estats
Starting build recipe 'my-ubuntu-for-estats.yaml'
Step 1 : bootstrap/_init_bootstrap/_init_0_disable_checkpoint
--> Running the step...
Starting command: "bash"
[local] The local_context has been initialized
Step _init_bootstrap took: 0 secs
Step 2 : bootstrap/prepare_docker/pull_and_tag_base_image
--> Running the step...
...

Our environment is now ready (in our public web space) to be deployed on estats nodes!

  • environment description file: $HOME/public/my-ubuntu-for-estats.dsc
  • environment image: $HOME/public/my-ubuntu-for-estats.tar.zst
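
Before leaving the build job, it can be worth confirming that both exported files are present. This is a sketch under assumptions: the helper name check_env_files is hypothetical, and it only tests that the files exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# Hypothetical check: verify that the description file and the image
# of an exported environment both exist.
# Arguments: environment name, optional directory (default: $HOME/public).
check_env_files() {
  local name="$1" dir="${2:-$HOME/public}" missing=0 f
  for f in "$dir/$name.dsc" "$dir/$name.tar.zst"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage:
# check_env_files my-ubuntu-for-estats && echo "ready to deploy"
```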

Once we are done with the environment creation, we can quit the current job and create a new one of type deploy, in order to deploy the environment (bare-metal) on an estats node.

Deploy the environment

Reserve an estats node for 2h in a deploy job

(The estats cluster is currently in the testing queue, and is flagged as "exotic".)

pneyron@ftoulouse:~$ oarsub -I -t deploy -t exotic -q testing -p estats -l nodes=1,walltime=2
# Include exotic resources in the set of reservable resources (this does NOT exclude non-exotic resources).
OAR_JOB_ID=1125125
# Interactive mode: waiting...
# Starting...
[OAR] OAR_JOB_ID=1125125
[OAR] Your nodes are:
      estats-4.toulouse.grid5000.fr*64

pneyron@ftoulouse:~$

Open the console of the node

This is optional, but it lets you follow what is going on.

pneyron@ftoulouse:~$ kaconsole3 estats-4

No action is required in the console (any action during the boot/deploy process may jeopardize the deployment).

Deploy the environment on the node

pneyron@ftoulouse:~$ kadeploy3 -a ~/public/my-ubuntu-for-estats.dsc -m estats-4
Deployment #D-ea5e7cb0-da6a-4b46-8726-f9f41ee2c115 started
Grab the key file ...
Grab the tarball file ...
Grab the postinstall file ...
Launching a deployment on estats-4.toulouse.grid5000.fr
...
...
...
End of step Deploy[BootNewEnvHardReboot] after 196s
End of deployment for estats-4.toulouse.grid5000.fr after 526s
End of deployment on cluster estats after 526s
Deployment #D-ea5e7cb0-da6a-4b46-8726-f9f41ee2c115 done

The deployment is successful on nodes
estats-4.toulouse.grid5000.fr

We can now ssh to the estats-4 node.
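
Once logged in, one possible sanity check is to confirm the architecture and the L4T release. The sketch below assumes that the deployed system carries Nvidia's usual /etc/nv_tegra_release file and that root SSH access is available; the helper name l4t_version is hypothetical, and the sample line is illustrative, not captured from an estats node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the first line of /etc/nv_tegra_release on
# standard input and print a dotted L4T version (major.revision).
l4t_version() {
  sed -n 's/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\),.*/\1.\2/p'
}

# Illustrative sample line in the usual format of that file.
sample='# R35 (release), REVISION: 3.1, GCID: 32827747, BOARD: t186ref, EABI: aarch64, DATE: Sun Mar 19 15:19:21 UTC 2023'
printf '%s\n' "$sample" | l4t_version

# Possible usage against a deployed node (assumes root SSH access):
# ssh root@estats-4 'uname -m; head -n 1 /etc/nv_tegra_release'
```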