Advanced Kadeploy: Difference between revisions

From Grid5000
(47 intermediate revisions by 10 users not shown)
Line 2: Line 2:
{{Portal|Tutorial}}
{{Portal|Tutorial}}
{{TutorialHeader}}
{{TutorialHeader}}
{{Warning|text=Please also read the [[Environments creation using Kameleon and Puppet]] tutorial, which describes automated mechanisms for building Kadeploy environments.}}
= What you need to know before starting =
= What you need to know before starting =
The first thing to understand is that by using kadeploy3, you will be running a command that attempts to remotely reboot nodes, and boot them using configuration files hosted on a server, many nodes at a time. On some clusters, this operation has a non-zero failure rate. You might therefore experience failures on some operations during this tutorial. In this case, retry. The system doesn't retry for you because that would mean waiting for long timeouts in all cases, even those where a 90% success rate is sufficient.
The first thing to understand is that by using kadeploy3, you will be running a command that attempts to remotely reboot many nodes at the same time, and boot them using configuration files hosted on a server. On some clusters, this operation has a non-zero failure rate. You might therefore experience failures on some operations during this tutorial. In this case, retry. The system doesn't retry for you because that would mean waiting for long timeouts in all cases, even those where a 90% success rate is sufficient.
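Because retries are left to the user, a small wrapper can automate them. The sketch below is only an illustration: the <code>retry</code> function and the <code>RETRY_DELAY</code> variable are hypothetical names, not part of Kadeploy3, and it assumes the wrapped command reports failure through a non-zero exit status, which should be checked for <code class="command">kadeploy3</code> before relying on it.

```shell
# retry MAX COMMAND [ARGS...]
# Run COMMAND up to MAX times and stop at the first success.
# RETRY_DELAY (hypothetical, in seconds, default 0) is the pause between attempts.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt/$max failed: $*" >&2
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-0}"
    done
    return 1
}

# Possible usage (not run here):
#   retry 3 kadeploy3 -e debian10-x64-base -f "$OAR_FILE_NODES" -k
```

A fixed number of attempts keeps the total waiting time bounded, which matches the trade-off described above.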


== What is an Environment? ==
== What is an Environment? ==
Line 11: Line 14:
file describing a fully functional Operating System.
file describing a fully functional Operating System.
To be able to set up an Operating System, <code class="command">kadeploy3</code>
To be able to set up an Operating System, <code class="command">kadeploy3</code>
needs at least 4 files in the most common cases:
needs at least 4 files in the most common cases
# An '''image'''
# An '''image'''
#* An image is a file containing all the Operating System files. It can be a compressed archive (i.e. a tgz file) or a dump of a device (i.e. a dd file). In this tutorial, you will learn to build new images for Kadeploy3.
#* An image is a file containing all the Operating System files. It can be a compressed archive (i.e. a tgz file) or a dump of a device (i.e. a dd file). In this tutorial, you will learn to build new images for Kadeploy3.
Line 30: Line 33:
== Search an environment ==
== Search an environment ==


Grid'5000 maintains several reference environments directly available on any site. These environments are based on various versions of debian. And for each debian version you will find different variants of reference environments.<br>
Grid'5000 maintains several reference environments directly available on all sites. These environments are based on Debian, Ubuntu and CentOS.
They are called ''reference'' environments because they can be used to generate customized environments. You will find different variants of reference environments, depending on which version of debian they are based on.<br>
 
The complete list all available environments, with their different variants, and the sites where they are available are listed on the wiki page:
For Debian, different variants of reference environments are offered. For Ubuntu and CentOS, only environments with a minimal system are offered.
{{Link|[[:Category:Portal:Environment|Sidebar > Users Portal > Portal:Environment or Kadeploy environments]]}}
From that page there is a link for each variant of each reference environments to another page which gives a thorough description of  the environment content, how it was build and how to use it with kadeploy3. An example in the next link :
{{Link|[[Wheezy-x64-base-1.4|Sidebar > Users Portal > Kadeploy environments directory > Linux environments > Wheezy-x64-base-1.4]]}}


An environment library is maintained on each site in the <code class="dir">/grid5000</code> directory of the <code class="host">frontend</code> node. So all environments available on each site are stored in that directory.
They are called ''reference'' environments because they can be used to generate customized environments.


To deploy a registered environment, you must know its name as registered in the Kadeploy database. It is the first information on the environment description page. This tutorial uses the <code class="env">wheezy-x64-base</code> environment.
; The description of the reference environments can be found here : {{Link|[[Getting_Started#Deploying_nodes_with_Kadeploy]]}}
 
An environment registry is maintained in each site (see <code class="command">kaenv3</code>), with the associated filesystem images stored in the <code class="dir">/grid5000</code> directory of the <code class="host">frontend</code>.
 
To deploy a registered environment, you must know its name as registered in the Kadeploy database. It is the first information on the environment description page. This tutorial uses the <code class="env">debian10-x64-base</code> environment.


You can also list all available environments on a site by using the <code class="command">kaenv3</code> command :
You can also list all available environments on a site by using the <code class="command">kaenv3</code> command :
Line 49: Line 53:
* ''public'': All users can see those environments. Only administrators can tag them this way.
* ''public'': All users can see those environments. Only administrators can tag them this way.


* ''shared'': Every user can see the environment provided they use the -u option to specify the user the environment belong to.
* ''shared'': Every user can see the environment provided they use the -u option to specify the user the environment belongs to.


* ''private'': The environment is only visible to the user the environment belong to.
* ''private'': The environment is only visible to the user the environment belongs to.


For example, a shared environment added by user <code class="replace">user</code> is listed this way :
For example, a shared environment added by user <code class="replace">user</code> is listed this way :
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> -l -u <code class="replace">user</code>}}
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> -l -u <code class="replace">user</code>}}
You can also look for a specific version with the <code>--env-version</code> option. All the versions of the environments can be found in <code>/grid5000/images</code>. The version number is the last part of the tgz file name.
For example, <code>debian10-x64-min-2019100414.tgz</code> is version <code>2019100414</code> of the <code>debian10-x64-min</code> reference environment.
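The naming convention can be exploited in scripts. The following sketch splits a tarball file name into environment name and version using plain shell parameter expansion. The function names <code>env_name</code> and <code>env_version</code> are hypothetical helpers, and the code assumes that every file follows the <code>name-version.tgz</code> pattern shown in the example.

```shell
# Print the version, i.e. the text after the last "-" of the file name
# once the ".tgz" suffix is removed. The body runs in a subshell so
# that no variable leaks into the calling shell.
env_version() (
    base=$(basename "$1" .tgz)
    printf '%s\n' "${base##*-}"
)

# Print the environment name, i.e. everything before the last "-".
env_name() (
    base=$(basename "$1" .tgz)
    printf '%s\n' "${base%-*}"
)

env_name    /grid5000/images/debian10-x64-min-2019100414.tgz
env_version /grid5000/images/debian10-x64-min-2019100414.tgz
```

The extracted version could then be passed to the <code>--env-version</code> option when a specific version of an environment is needed.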


Being able to reproduce experiments is desirable, so you should always try to control the environment the experiment runs in as much as possible. We will therefore check that the environment chosen in the environment directory is the one available on a given cluster. On the cluster you would like to deploy on, type the following command to print information about an environment :
Being able to reproduce experiments is desirable, so you should always try to control the environment the experiment runs in as much as possible. We will therefore check that the environment chosen in the environment directory is the one available on a given cluster. On the cluster you would like to deploy on, type the following command to print information about an environment :
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> <code>-p</code> <code class="env">wheezy-x64-base</code> <code>-u deploy</code>}}
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> <code>-p</code> <code class="env">debian10-x64-base</code> <code>-u deploy</code>}}


You must specify the user option. In our case, all public environments belong to user <code>deploy</code>.
You must specify the user option. In our case, all public environments belong to user <code>deploy</code>.


Check that the <code>tarball</code> file is the expected one by checking its name and its checksum which you should find on the [[Wheezy-x64-base-1.4#Identification_sheet|identification sheet]]:
In theory, you should also check the post-install script. A post-install script adapts an environment to the site it is deployed on.  
{{Term|location=frontend|cmd=<code class="command">md5sum</code> <code class="file">/grid5000/images/wheezy-x64-base-1.4.tgz</code>}}


In theory, you should also check the post-install script. A post-install script adapts an environment to the site it is deployed on. In the same way as for environments, you should be able to find a description of the post-install script on pages such as [[Wheezy-x64-base-1.4#Recording_environment | here]]. Post-install scripts is an evolving matter, so don't be too worried if you don't find things as described here.
If everything seems ok, please proceed to the next step.
If everything seems ok, please proceed to the next step.


== Make a job on a deployable node ==
== Make a job on a deployable node ==
By default, Grid'5000 nodes are running on the ''production'' environment. Which already contains most of the important features and can be used to run experiments. But you will not have administrative privileges (root privileges) on these nodes. So you will not be able to customize these environments at will. In fact, only reference environments can be customized at will. But to have the right to deploy a reference environment on a node, you must supply the option <code>-t deploy</code> when submitting your job.
By default, Grid'5000 nodes are running on the ''production'' environment, which already contains most of the important features and can be used to run experiments. But you will not have administrative privileges (root privileges) on these nodes. So you will not be able to customize these environments at will. In fact, only reference environments can be customized at will. But to have the right to deploy a reference environment on a node, you must supply the option <code>-t deploy</code> when submitting your job.


For this part of the tutorial, the job will be interactive (<code>-I</code>), of the deploy type (<code>-t deploy</code>), on only one machine (<code>-l nodes=1</code>) to do environment customization (we will give ourselves 3 hours with <code>-l walltime=3</code>). This gives us the following command, which will open a new shell session on the frontend node:
For this part of the tutorial, the job will be interactive (<code>-I</code>), of the deploy type (<code>-t deploy</code>), on only one machine (<code>-l nodes=1</code>) to do environment customization (we will give ourselves 3 hours with <code>-l walltime=3</code>). This gives us the following command, which will open a new shell session on the frontend node:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -I -t deploy -l nodes=1,walltime=3}}
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -I -t deploy -l nodes=1,walltime=3}}


Since not all Grid'5000 nodes necessarily have console access, it is recommended in the context of this tutorial to add the option <code>rconsole="YES"</code> in your reservation command.
Since not all Grid'5000 nodes necessarily have console access, it is recommended in the context of this tutorial to add the option <code>rconsole="YES"</code> to your reservation command.
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -I -t deploy -l '{rconsole="YES"}/nodes=1,walltime=3'}}
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -I -t deploy -l {"rconsole='YES'"}/nodes=1,walltime=3}}


Indeed, when you submit a job of the deploy type, a new shell is opened on the frontend node and not on the first machine of the job as for standard jobs. When you exit from this shell, the job ends. The shell is populated with <code class="env">OAR_*</code> environment variables. You should look at the list of available variables to get an idea of the information you can use to script deployment later. As usual, if the job is successful, you will get the name of the machine allocated to your job with:
Indeed, when you submit a job of the deploy type, a new shell is opened on the frontend node and not on the first machine of the job as for standard jobs. When you exit from this shell, the job ends. The shell is populated with <code class="env">OAR_*</code> environment variables. You should look at the list of available variables to get an idea of the information you can use to script deployment later. As usual, if the job is successful, you will get the name of the machine allocated to your job with:
{{Term|location=frontend|cmd=<code class="command">cat</code> <code class="env">$OAR_FILE_NODES</code>}}
{{Term|location=frontend|cmd=<code class="command">cat</code> <code class="env">$OAR_FILE_NODES</code>}}
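When scripting, it can help to reduce the node file to one line per machine. The sketch below assumes that the file may list the same node several times (for instance once per reserved core), which should be verified on your own reservation. The helper name <code>unique_nodes</code> is hypothetical.

```shell
# Print each node name once, sorted.
# With no argument, fall back to the file named by $OAR_FILE_NODES.
unique_nodes() {
    sort -u "${1:-$OAR_FILE_NODES}"
}

# Possible usage inside a deploy job (not run here):
#   unique_nodes | wc -l    # number of distinct machines in the job
```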


{{Warning|text=At the end of a reservation with the options <code>-t deploy</code>, the reserved nodes will be restarted to boot on the production environment and thus be available to any other user.
{{Warning|text=At the end of a reservation with the options <code>-t deploy</code>, the reserved nodes will be restarted to boot on the standard environment and thus be available to any other user.
So you should only use this option <code>-t deploy</code> when you actually intend to deploy a reference environment on the reserved nodes.}}
So you should only use this option <code>-t deploy</code> when you actually intend to deploy a reference environment on the reserved nodes.}}


== Deploy a reference environment ==
== Deploy a reference environment ==
To deploy your environment, you must discover the nodes you were allocated by OAR. The simplest way of doing this is to look at the content of the file whose name is stored in <code class="env">$OAR_FILE_NODES</code> (this variable is also available as <code class="env">$OAR_NODE_FILE</code>) or at the messages displayed when the job was created. The variable <code class="env">$OAR_NODE_FILE</code> simply stores the path of the file containing the FQDNs of all your reserved nodes. Deployment happens when you run the following command:
To deploy your environment, you must discover the nodes you were allocated by OAR. The simplest way of doing this is to look at the content of the file whose name is stored in <code class="env">$OAR_FILE_NODES</code> (this variable is also available as <code class="env">$OAR_NODE_FILE</code>) or at the messages displayed when the job was created. The variable <code class="env">$OAR_NODE_FILE</code> simply stores the path of the file containing the FQDNs of all your reserved nodes. Deployment happens when you run the following command:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e wheezy-x64-base -m <code class="replace">node.site</code>.grid5000.fr}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e debian10-x64-base -m <code class="replace">node.site</code>.grid5000.fr}}


You can automate this to deploy on all nodes of your job with Kadeploy3's -f option:
You can automate this to deploy on all nodes of your job with the -f option:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e wheezy-x64-base -f <code class="env">$OAR_FILE_NODES</code>}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e debian10-x64-base -f <code class="env">$OAR_FILE_NODES</code>}}




If you want to be able to connect to the node as <code>root</code> without being prompted for a password, you can use the <code>-k</code> option in one of two ways :
In order to be able to connect to the node (as <code>root</code>), you must use the <code>-k</code> option in one of two ways :
* You can either specify the public key that will be copied in <code class=file>/root/.ssh/authorized_keys</code> on the deployed nodes :
* You can either specify the public key that will be copied in <code class=file>/root/.ssh/authorized_keys</code> on the deployed nodes :
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e wheezy-x64-base -f <code class="env">$OAR_FILE_NODES</code> -k ~/.ssh/my_special_key.pub}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e debian10-x64-base -f <code class="env">$OAR_FILE_NODES</code> -k ~/.ssh/my_special_key.pub}}
* Or you can supply the <code>-k</code> option without argument. This will automatically copy your <code class=file>~/.ssh/authorized_keys</code> and replace the <code class=file>/root/.ssh/authorized_keys</code> file on the deployed nodes.  
* Or you can supply the <code>-k</code> option without argument. This will automatically copy your <code class=file>~/.ssh/authorized_keys</code> and replace the <code class=file>/root/.ssh/authorized_keys</code> file on the deployed nodes.  
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e <code>wheezy-x64-base</code> -f <code class="env">$OAR_FILE_NODES</code> -k}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e <code>debian10-x64-base</code> -f <code class="env">$OAR_FILE_NODES</code> -k}}
The second case is actually the simplest way. One of its advantages is that after deployment, you will be able to connect directly from your local computer to the deployed nodes, the same way you connect to the frontend of the site where those nodes are.<br>
The second case is actually the simplest way. One of its advantages is that after deployment, you will be able to connect directly from your local computer to the deployed nodes, the same way you connect to the frontend of the site where those nodes are.<br>
Once kadeploy has run successfully, the allocated node runs the <code>wheezy-x64-base</code> environment. You can then tune this environment according to your needs.
Once kadeploy has run successfully, the allocated node runs the <code>debian10-x64-base</code> environment. You can then tune this environment according to your needs.


{{Note|text=It is not necessary here, but you can specify the destination partition with the -p option. The [[Node storage]] page gives all the information about the partition table used on G5K.}}
{{Note|text=It is not necessary here, but you can specify the destination partition with the -p option. The [[Grid5000:Node storage]] page gives all the information about the partition table used on G5K.}}


== Connect to the deployed environment and customize it ==
== Connect to the deployed environment and customize it ==
'''1. Connection'''
'''1. Connection'''


On reference environments managed by the staff, you can use <code>root</code> account for login through <code>ssh</code> (kadeploy checks that sshd is running before declaring success of a deployment). To connect to the node type :
On reference environments managed by the staff, you can use <code>root</code> account for login through <code>ssh</code> (kadeploy checks that sshd is running before declaring a deployment successful). To connect to the node type :
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class="replace">node.site</code>.grid5000.fr}}
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class="replace">node.site</code>.grid5000.fr}}
{{Note|text=If you have not deployed the nodes with a public key using option <code>-k</code>, you will be asked for a password. Default root password for all reference environments is <code>grid5000</code>. Please check the environments descriptions.}}


In case this doesn't work, please take a look at the [[FAQ#Deployment related issues|kadeploy section]] of the [[FAQ|Sidebar > FAQ]]
In case this doesn't work, please take a look at the [[FAQ#Deployment related issues|kadeploy section]] of the [[FAQ|Sidebar > FAQ]]


'''2. Customization (example with authentification parameters)'''
'''2. Adding software to an environment'''
 
Using the <code class="env">root</code> account for all your experiments is possible, but you will probably be better off creating a user account. You could even create user accounts for all the users that would be using your environment. However, if the number of users is greater than 2 or 3, you should better configure an LDAP client by tuning the post-install script or using a ''fat'' version of this script (beyond the scope of this tutorial). Two ways of doing things.
 
The simplest is to create a dedicated account (e.g. the generic user <code class="env">g5k</code>) and move in all experiment data at the beginning and back at the end of an experiment, using <code class="command">scp</code> or <code class="command">rsync</code>. A more elaborate approach is to locally recreate our usual Grid'5000 account with the same uid/gid on our deployed environment. This second approach could simplify file rights management if you need to store temporary data on shared volumes.
 
To create your local unix group on your environment, first find your primary group on the <code class="host">frontend</code> node with:
{{Term|location=frontend|cmd=<code class="command">id</code>}}
 
The output of this command is for instance of the form:
<code>uid=19002(dmargery) gid=19000(rennes) groups=9998(CT),9999(grid5000),19000(rennes)</code>
Where :
<code class="replace">userId</code> = ''19002''
<code class="replace">userLogin</code> = ''dmargery''
<code class="replace">groupId</code> = ''19000''
<code class="replace">groupName</code> = ''rennes''
 
Then, as root, on the deployed machine:
{{Term|location=node|cmd=<code class="command">addgroup</code> --gid <code class="replace">groupId</code> <code class="replace">groupName</code>}}
 
Now, to enable access to your local user account on your environment, as root, on the deployed machine:
{{Term|location=node|cmd=<code class="command">adduser</code> --uid <code class="replace">userId</code> --ingroup <code class="replace">groupName</code> <code class="replace">userLogin</code>}}
 
Finally, as root, become the newly created user and place your ssh key:
{{Term|location=node|cmd=<code class="command">su</code> - <code class="replace">userLogin</code> <br> <code class="command">mkdir</code> ~/.ssh <br> <code class="command">exit</code> <br> <code class="command">cp</code> /root/.ssh/authorized_keys /home/<code class="replace">userLogin</code>/.ssh/ <br> <code class="command">chown</code> <code class="replace">userLogin</code>:<code class="replace">groupName</code> /home/<code class="replace">userLogin</code>/.ssh/authorized_keys}}
 
Now you can login to the node with your user account
{{Term|location=frontend|cmd=<code class="command">ssh</code> <code class="replace">userLogin</code>@<code class="replace">node.site</code>.grid5000.fr}}
 
'''3. Adding software to an environment'''


''Where you learn to install software using the package repository of your distribution on Grid'5000 (using proxys)...''
''Where you learn to install software using the package repository of your distribution on Grid'5000''


You can therefore update your environment (to add missing libraries that you need, or to remove packages that you don't need, which shrinks the image and speeds up the deployment process, etc.) using:
You can therefore update your environment (to add missing libraries that you need, or to remove packages that you don't need, which shrinks the image and speeds up the deployment process, etc.) using:
{{Term|location=node|cmd=<code class="command">apt-get</code> update <br> <code class="command">apt-get</code> upgrade <br> <code class="command">apt-get</code> install <code class="replace">list of desired packages and libraries</code> <br> <code class="command">apt-get</code> --purge remove <code class="replace">list of unwanted packages</code> <br> <code class="command">apt-get</code> clean}}
{{Term|location=node|cmd=<code class="command">apt-get</code> update <br> <code class="command">apt-get</code> upgrade <br> <code class="command">apt-get</code> install <code class="replace">list of desired packages and libraries</code> <br> <code class="command">apt-get</code> --purge remove <code class="replace">list of unwanted packages</code> <br> <code class="command">apt-get</code> clean}}
{{Note|text=On reference environments, <code class="command">apt-*</code> commands are automatically configured to use the proper proxy. But if you need an outside access for the HTTP, HTTPS and FTP protocols, with another command (<code class="command">wget</code>, <code class="command">git</code>,...), you will have to configure the proxy by following the documentation on the [[Web_proxy_client]] page. }}


== Create a new environment from a customized environment ==
== Create a new environment from a customized environment ==
Line 154: Line 127:
'''1. Use the provided tools'''
'''1. Use the provided tools'''


*You can use [[TGZ-G5K]], a script installed in all reference environments. You can find all instructions on how to use it on its [[TGZ-G5K]] page.
You can use <code class=command>tgz-g5k</code> to extract a Grid'5000 environment tarball from a running node.


Examples :
{{Term|location=frontend|cmd=<code class=command>tgz-g5k -m </code><code class=replace>node</code><code class=command> -f </code><code class=replace>~/path_to_myimage.tgz</code>}}
{{Term|location=frontend|cmd=<code class=command>ssh root@</code><code class=replace>node</code><code class=command> tgz-g5k > </code> <code class=replace>path_to_myimage.tgz</code>}}
 
or
This will create a file <code class=replace>path_to_myimage.tgz</code> in your home directory on <code class=host>frontend</code>.
{{Term|location=node|cmd=<code class=command>tgz-g5k</code> <code class=replace>login</code><code class=command>@frontend:</code><code class=replace>path_to_myimage.tgz</code>}}
 
This will create a file <code class=replace>path_to_myimage.tgz</code> in your home directory on <code class=host>frontend</code>. The first example is to be preferred, as it can run without a password or passphrase and without adding your private key to the image.
{{Note|text=Please consider the following:
*If you want to extract a tarball from the Grid'5000 standard environment (i.e., a non-deployed job), you will need to add the option <code class=command>-o</code> to use oarsh/oarcp instead of ssh/scp
*If you want <code class=command>tgz-g5k</code> to access the node with a custom user id, you can use the option <code class=command>-u myCustomeId</code> (default is root)
*You can find more information on <code class=command>tgz-g5k</code> (e.g., available options, command line examples) by executing <code class=command>tgz-g5k -h</code>. Some implementation details are also available on the man page (<code class=command>man tgz-g5k</code>).}}


'''2. Describe the newly created environment for deployments'''
'''2. Describe the newly created environment for deployments'''


Kadeploy3 works using an environment description. The easiest way to create a description for your new environment is to change the description of the environment it is based on. We have based this tutorial on the <code>wheezy-x64-base</code> environment of user <code>deploy</code>. We therefore print its description to a file that will be used as a good basis:
Kadeploy3 works using an environment description. The easiest way to create a description for your new environment is to change the description of the environment it is based on. We have based this tutorial on the <code>debian10-x64-base</code> environment of user <code>deploy</code>. We therefore print its description to a file that will be used as a good basis:
{{Term|location=frontend|cmd=<code class=command>kaenv3</code> -p wheezy-x64-base -u deploy > <code class=replace>mywheezy-x64-base.env</code> }}
{{Term|location=frontend|cmd=<code class=command>kaenv3</code> -p debian10-x64-base -u deploy > <code class=replace>mydebian10-x64-base.env</code> }}


It should be edited to change the <code class='replace'>name</code>, <code class='replace'>description</code>, <code class='replace'>author</code> lines, as well as the <code class='replace'>tarball</code> line. Since the tarball is local, the path should be a simple absolute path (without a leading <code>server://</code>). The visibility line should be removed, or changed to <code>shared</code> or <code>private</code>. Once this is done, the newly created environment can be deployed using:
It should be edited to change the <code class='replace'>name</code>, <code class='replace'>description</code>, <code class='replace'>author</code> lines, as well as the <code class='replace'>tarball</code> line. Since the tarball is local, the path should be a simple absolute path (without a leading <code>server://</code>). The visibility line should be removed, or changed to <code>shared</code> or <code>private</code>. Once this is done, the newly created environment can be deployed using:
{{Term|location=frontend|cmd=<code class=command>kadeploy3</code> -f <code class=env>$OAR_NODEFILE</code> -a <code class=replace>mywheezy-x64-base.env</code> }}
{{Term|location=frontend|cmd=<code class=command>kadeploy3</code> -f <code class=env>$OAR_NODEFILE</code> -a <code class=replace>mydebian10-x64-base.env</code> }}


This kind of deployment is called ''anonymous deployment'' because the description is not recorded into the Kadeploy3 database. It is particularly useful when you perform the tuning of your environment if you have to update the environment tarball several times.
This kind of deployment is called ''anonymous deployment'' because the description is not recorded into the Kadeploy3 database. It is particularly useful when you perform the tuning of your environment if you have to update the environment tarball several times.
Line 174: Line 150:
Once your customized environment is successfully tuned, you can save it to Kadeploy3 database so that you can directly deploy it with <code class=command>kadeploy3</code>, by specifying its name:
Once your customized environment is successfully tuned, you can save it to Kadeploy3 database so that you can directly deploy it with <code class=command>kadeploy3</code>, by specifying its name:


{{Term|location=frontend|cmd=<code class=command>kaenv3</code> -a <code class=replace>mywheezy-x64-base.env</code> }}
{{Term|location=frontend|cmd=<code class=command>kaenv3</code> -a <code class=replace>mydebian10-x64-base.env</code> }}
and then (if your environment is named "mywheezy-base"):
and then (if your environment is named "mydebian10-base"):
{{Term|location=frontend|cmd=<code class=command>kadeploy3</code> -f <code class=env>$OAR_NODEFILE</code> -e <code class=replace>mywheezy-base</code>}}
{{Term|location=frontend|cmd=<code class=command>kadeploy3</code> -f <code class=env>$OAR_NODEFILE</code> -e <code class=replace>mydebian10-base</code>}}


With the <code class=command>kaenv3</code> command, you can manage your environments with ease. Please refer to its documentation for an overview of its features.
With the <code class=command>kaenv3</code> command, you can manage your environments with ease. Please refer to its documentation for an overview of its features.
Line 193: Line 169:
* Download the CD/DVD ISO of your OS installer (say <code class="replace">OS_ISO</code>) and upload it to the frontend of the target site ([[SSH#Copying_files|help here]])
* Download the CD/DVD ISO of your OS installer (say <code class="replace">OS_ISO</code>) and upload it to the frontend of the target site ([[SSH#Copying_files|help here]])
{{Term|location=local|cmd=<code class="command">scp</code> <code class="replace">OS_ISO USERNAME</code>@<code class="host">access</code>.<code class="host">grid5000.fr</code>:<code class="replace">SITE</code>/}}
{{Term|location=local|cmd=<code class="command">scp</code> <code class="replace">OS_ISO USERNAME</code>@<code class="host">access</code>.<code class="host">grid5000.fr</code>:<code class="replace">SITE</code>/}}
'''Note''': In this tutorial we will use one Ubuntu server 64bits ISO image file (available Nancy frontend: /grid5000/iso/ubuntu-11.10-server-amd64.iso)


* Make a reservation with 1 node, in both deployment mode and destructive mode (to be able to deploy on the temporary partition). Two hours should be enough.
* Make a reservation with 1 node, in both deployment mode and destructive mode (to be able to deploy on the temporary partition). Two hours should be enough.
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -I -t deploy -t destructive -l nodes=1,walltime=<code class="replace">2</code>}}
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -I -t deploy -t destructive -l nodes=1,walltime=<code class="replace">2</code>}}


* Deploy a minimal system on the node's temporary partition, use your ssh key (since Grid'5000 is more Debian friendly, let's say Debian/wheezy)
* Deploy a minimal system on the node's temporary partition, use your ssh key (since Grid'5000 is more Debian friendly, let's say Debian 10 Buster)
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e <code class="env">wheezy-x64-min</code> -p 5 -k -f $OAR_NODEFILE}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e <code class="env">debian10-x64-min</code> -p 5 -k -f $OAR_NODEFILE}}


* Connect to the node and install the needed packages
* Connect to the node and install the needed packages
Line 208: Line 183:
*: Copy the OS's ISO to the node
*: Copy the OS's ISO to the node
{{Term|location=frontend|cmd=<code class="command">scp</code> <code class="replace">OS_ISO</code> root@<code class="replace">NODE</code>:}}
{{Term|location=frontend|cmd=<code class="command">scp</code> <code class="replace">OS_ISO</code> root@<code class="replace">NODE</code>:}}
*: Connect to the node and install ''vncviewer'', ''parted'' and ''kvm'' on the system.
*: Connect to the node and install ''tigervnc-viewer'', ''parted'' and ''kvm'' on the system.
First update the packages definition
First update the packages definition
{{Term|location=NODE|cmd=<code class="command">apt-get</code> -y update}}
{{Term|location=NODE|cmd=<code class="command">apt-get</code> -y update}}
Then install the needed packages
Then install the needed packages
{{Term|location=NODE|cmd=<code class="command">apt-get</code> install -y vncviewer parted kvm}}
{{Term|location=NODE|cmd=<code class="command">apt-get</code> install -y tigervnc-viewer parted qemu-kvm}}
*: Clean the old system's partition <code class="file">/dev/sda3</code>
*: Clean the old system's partition <code class="file">/dev/sda3</code>
{{Term|location=NODE|cmd=<code class="command">echo</code><nowiki> -e 'd\n3\nn\np\n\n\nt\n3\n0\nw\n' | fdisk /dev/sda && sync && partprobe /dev/sda</nowiki>}}
{{Term|location=NODE|cmd=<code class="command">echo</code><nowiki> -e 'd\n3\nn\np\n\n\nt\n3\n0\nw\n' | fdisk /dev/sda && sync && partprobe /dev/sda</nowiki>}}
Line 223: Line 198:
(The node has to be specified by basename: ''griffon-42.nancy.grid5000.fr'' &rarr; ''griffon-42'')
(The node has to be specified by basename: ''griffon-42.nancy.grid5000.fr'' &rarr; ''griffon-42'')
* Connect to the frontend using SSH X11 forwarding and get the screen of our virtual machine using VNC
* Connect to the frontend using SSH X11 forwarding and get the screen of our virtual machine using VNC
{{Term|location=local|cmd=<code class="command">ssh</code> -Xt <code class="replace">USERNAME</code>@access.grid5000.fr 'ssh -X root@<code class="replace">NODE.SITE</code> "vncviewer :1"'}}
{{Term|location=laptop|cmd=<code class="command">ssh</code> -Y root@<code class="replace">NODE</code>.<code class="replace">SITE</code>.g5k 'vncviewer :1'}}
{{Note|text=If your OS installer changes the screen resolution, your ''vncviewer'' will be closed. Just relaunch the command to get the screen back.}}
{{Note|text=If your OS installer changes the screen resolution, your ''vncviewer'' will be closed. Just relaunch the command to get the screen back.}}
{{Warning|text=Installation process IMPORTANT instructions:
{{Warning|text=Installation process IMPORTANT instructions:
*You MUST install your system and its root on a single partition: <code class="file">/dev/sda3</code><br>
*You MUST not format the <code class="file">/dev/sda</code> disk. When you are asked about partitioning the disk, select ''manual'' (and not ''guided'').
*The system size is limited to Grid'5000's deployment partition default size (16GiB)<br>
*You MUST install your system on <code class="file">/dev/sda3</code>. Create the partition, if it does not exist. The mount point of this partition should be "/"<br>
*The system size of this partition should be limited to Grid'5000's deployment partition default size (16GiB)<br>
*You must install an SSH server<br>
*You must install an SSH server<br>
*A bootloader should be installed on the partition <code class="file">/dev/sda3</code> and not on the MBR<br>}}
*A bootloader should be installed on the partition <code class="file">/dev/sda3</code> and not on the MBR<br>}}
Line 233: Line 209:
{{Note|text=After the installation process, the virtual machine will fail to boot; this is normal. You can close ''vncviewer'' and ''kvm''.}}
{{Note|text=After the installation process, the virtual machine will fail to boot; this is normal. You can close ''vncviewer'' and ''kvm''.}}


== Create a Kadeploy3 image of the filesystem of our OS ==
== Customize the OS ==


* Create the mounting point directory
* Create the mounting point directory
Line 241: Line 217:
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class="replace">NODE</code> '<code class="command">mount</code> <code class="file">/dev/sda3 /mnt/myos</code>'}}
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class="replace">NODE</code> '<code class="command">mount</code> <code class="file">/dev/sda3 /mnt/myos</code>'}}
{{Note|text=If this command fails, you can try to use this command first: ''partprobe /dev/sda''}}
{{Note|text=If this command fails, you can try to use this command first: ''partprobe /dev/sda''}}
* If you want to add a custom script that dynamically sets up your node's hostname, you can take a look at /etc/dhcp/dhclient-enter-hooks.d/ in a reference environment.
* If you want to:
{{Warning|text=You have to disable SELinux (example for CentOS: put "SELINUX=disabled" in /mnt/myos/etc/selinux/config)}}
** add a custom script that dynamically sets up your node's hostname: take a look at /mnt/myos/etc/dhcp/dhclient-enter-hooks.d/.
{{Note|text=You can put internal DNS servers in /etc/resolv.conf <br>
** disable SELinux: for example on CentOS, put "SELINUX=disabled" in /mnt/myos/etc/selinux/config
You can add a proxy for package updates: take a look at [[Web_proxy_client]]}}
** modify the DNS configuration: edit /etc/resolv.conf
* Save the filesystem in a tgz archive. tgz-g5k documentation is available [[TGZ-G5K|here]]
* Unmount the partition
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class="replace">NODE</code> '<code class="command">tgz-g5k</code> --root /mnt/myos' > <code class="replace">IMAGE_FILE</code>.tgz}}
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class="replace">NODE</code> '<code class="command">umount</code> <code class="file">/mnt/myos</code>'}}
 
== Create a Kadeploy3 image of the filesystem of our OS ==
 
* Save the filesystem in a tgz archive with tgz-g5k
{{Term|location=frontend|cmd=<code class="command">tgz-g5k</code> -m <code class="replace">NODE</code> -r /dev/sda3 -f <code class="replace">IMAGE_FILE</code>.tgz}}
<!--*FIXME (<b>released in tgz-g5k 1.0.9</b>):
<!--*FIXME (<b>released in tgz-g5k 1.0.9</b>):
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class="replace">NODE</code> '<code class="command">wget</code> -q -O - <nowiki>http://public.nancy.grid5000.fr/~lsarzyniec/kaimg/tgz-g5k | bash -s - --root /mnt/myos</nowiki>' > <code class="replace">IMAGE_FILE</code>.tgz}}
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class="replace">NODE</code> '<code class="command">wget</code> -q -O - <nowiki>http://public.nancy.grid5000.fr/~lsarzyniec/kaimg/tgz-g5k | bash -s - --root /mnt/myos</nowiki>' > <code class="replace">IMAGE_FILE</code>.tgz}}
-->
-->
* Create an environment description file such as:
* Create an environment description file <code class="replace">IMAGE_DESC_FILE</code> such as:
<pre>
<pre>
---
---
Line 265: Line 246:
   file: /path/to/IMAGE_FILE.tgz
   file: /path/to/IMAGE_FILE.tgz
postinstalls:  
postinstalls:  
- script: traitement.ash /rambin
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  archive: server:///grid5000/postinstalls/debian-x64-min-1.0-post.tgz
   compression: gzip
   compression: gzip
  script: g5k-postinstall --net debian
boot:  
boot:  
   kernel: /path/to/the/kernel/in/the/image
   kernel: /path/to/the/kernel/in/the/image
   initrd: /path/to/the/initrd/in/the/image
   initrd: /path/to/the/initrd/in/the/image
multipart: false
multipart: false
filesystem: ext3
filesystem: ext4
partition_type: 0x83
partition_type: 0x83
</pre>
</pre>
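Before testing the description, it can help to check that the image it points to is really there. The following is only a minimal sketch, not a Kadeploy tool: the helper names <code>env_image_file</code> and <code>check_env_image</code> are hypothetical, and the sketch assumes the first <code>file:</code> key of the description is the image archive.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kadeploy): print the value of the first
# "file:" key found in an environment description file.
env_image_file() {
    sed -n 's/^[[:space:]]*file:[[:space:]]*//p' "$1" | head -n 1
}

# Hypothetical helper: succeed if the image is a local path that exists.
# Paths using a scheme (server://, http://, ...) are reported but not checked.
check_env_image() {
    img=$(env_image_file "$1")
    case "$img" in
        "")    echo "no image file in $1" >&2; return 1 ;;
        *://*) echo "skipping remote image: $img" ;;
        *)     [ -f "$img" ] || { echo "missing image: $img" >&2; return 1; }
               echo "image found: $img" ;;
    esac
}
```

For instance, <code>check_env_image IMAGE_DESC_FILE</code> could be run on the frontend before calling <code>kadeploy3 -a</code>.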
Line 278: Line 259:
{{Note|text=You can use Grid'5000 files for the postinstall}}
{{Note|text=You can use Grid'5000 files for the postinstall}}
{{Note|
{{Note|
text=For Linux systems, most of the time the path to the kernel file is <code class="file">/vmlinuz</code> and the path to the initrd is <code class="file">/initrd.img</code>. You can locate those files by connecting to <code class="replace">NODE</code> and checking the <code class="file">/mnt/myos</code> directory.}}
text=For Linux systems, most of the time the path to the kernel file is <code class="file">/vmlinuz</code> and the path to the initrd is <code class="file">/initrd.img</code>. You can locate those files by connecting to <code class="replace">NODE</code> (like in the previous section using <code class="command">mount</code>) and checking the <code class="file">/mnt/myos</code> directory.}}


* Test it !
* Test it !
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -a <code class="replace">IMAGE_DESC_FILE</code> -m <code class="replace">NODE</code> -k}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -a <code class="replace">IMAGE_DESC_FILE</code> -m <code class="replace">NODE</code> -k}}
* Add it to kadeploy (so that you can use the parameter -e <code class="replace">IMAGE_NAME</code> like the default g5k environments)
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> -a <code class="replace">IMAGE_DESC_FILE</code>}}


= Use disk(s) as I want =
= Use disk(s) as I want =
Line 292: Line 277:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=1 -p "cluster='hercule'" -I}}
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=1 -p "cluster='hercule'" -I}}
Then you can deploy on sda2 or sda5 with the <code class="command">-p [2,5]</code> option:
Then you can deploy on sda2 or sda5 with the <code class="command">-p [2,5]</code> option:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e wheezy-x64-nfs -f $OAR_NODEFILE -p <code class="replace">2</code> -k}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e debian10-x64-nfs -f $OAR_NODEFILE -p <code class="replace">2</code> -k}}


== Deploy on additional disks ==
== Deploy on additional disks ==
{{Warning|text=Currently broken}}
First, as this kind of deployment will break node standard operation, you must tell OAR that the node should be redeployed entirely after the reservation with the <code class="command">-t destructive</code> option:
First, as this kind of deployment will break node standard operation, you must tell OAR that the node should be redeployed entirely after the reservation with the <code class="command">-t destructive</code> option:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=1 -p "cluster='hercule'" -I}}
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=1 -p "cluster='hercule'" -I}}
Then you can deploy on an additional disk such as sdb with the <code class="command">-b sdb</code> option:
Then you can deploy on an additional disk such as sdb with the <code class="command">-b sdb</code> option:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e wheezy-x64-base -f $OAR_NODEFILE -b <code class="replace">sdb</code> -k}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e debian10-x64-base -f $OAR_NODEFILE -b <code class="replace">sdb</code> -k}}


== Format additional disks ==
== Format additional disks ==
In Kadeploy3, we can easily customize the deployment's automata. It's possible to add custom ''pre, post or substitute operations'' to each step. In a custom operation it's possible to: ''send'' a file, ''execute'' a command or ''run'' a script.
In Kadeploy3, we can easily customize the deployment's automata. It's possible to add custom ''pre, post or substitute operations'' to each step. In a custom operation it's possible to: ''send'' a file, ''execute'' a command or ''run'' a script.


This feature in explained in Kadeploy3's documentation (available on [https://gforge.inria.fr/frs/?group_id=2026 Kadeploy3's website]) in the section ''4.2.2, Use Case 10'' and ''4.7''.
This feature is explained in Kadeploy3's documentation (available on [https://gforge.inria.fr/frs/?group_id=2026 Kadeploy3's website]) in the section ''4.2.2, Use Case 10'' and ''4.7''.


In this example, we will add some custom operations to the deployment workflow: our nodes have two additional hard disks and we want them to be formatted during the deployment process.
In this example, we will add some custom operations to the deployment workflow: our nodes have two additional hard disks and we want them to be formatted during the deployment process.
Line 384: Line 371:


Now you can deploy your environment with this custom operation:
Now you can deploy your environment with this custom operation:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e wheezy-x64-min -f $OAR_NODE_FILE -k --custom-steps ./custom-partitioning.yml}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e debian10-x64-min -f $OAR_NODE_FILE -k --custom-steps ./custom-partitioning.yml}}
{{Warning|text=In some case you should increase the step's timeout (for some long formatting for example) see [[Advanced_Kadeploy#Adjusting timeout for some environments]] for details.}}
{{Warning|text=In some cases you should increase the step's timeout (for some long formatting for example) see [[Advanced_Kadeploy#Adjusting timeout for some environments]] for details.}}


'''Note:''' Both partitions are not mounted on boot. To mount those partitions you should do:
'''Note:''' Both partitions are not mounted on boot. To mount those partitions you should do:
Line 394: Line 381:


== Use a custom partitioning scheme ==
== Use a custom partitioning scheme ==
In Kadeploy3, we can easily customize the deployment's automata. It's possible to add custom ''pre, post or substitute operations'' to each steps. In a custom operation it's possible to: ''send'' a file, ''execute'' a command or ''run'' a script.
=== Example 1 : Deploy on the whole disk ===
In Kadeploy3, we can customize the deployment's automata. It's possible to add custom ''pre, post or substitute operations'' to each step. In a custom operation it's possible to: ''send'' a file, ''execute'' a command or ''run'' a script.
In this example, we will modify the deployment workflow to deploy the system on a single disk partition ('/' on sda1).
 
'''1. Make the reservation in destructive mode'''
 
As you will change the partitioning of the disk, you must tell OAR that it should redeploy the node entirely after the reservation with the <code class="command">-t destructive</code> parameter:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=2 -I}}
 
'''2. Describe the custom operations'''
 
After that you have to create a file that describes the custom operations you want to be performed during the deployment.
In this example we will create our custom partitioning scheme and bypass some steps that are not necessary to deploy the system on a single partition.
 
* The operation description file (let's say '''custom-partitioning.yml''') should look something like this:
<pre>
---
# Our custom steps should be performed during the SetDeploymentEnv macro-step
SetDeploymentEnvUntrusted:
  # Custom partitioning step that is substituted to the create_partition_table micro-step
  create_partition_table:
    substitute:
      # We send a file on the node
      - action: send
        file: map.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map
      # Then we execute the parted command using the previously sent file
      - action: exec
        name: partitioning
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
# Hack to disable useless steps
  format_tmp_part:
    substitute:
      - action: exec
        name: remove_format_tmp_part_step
        command: /bin/true
  format_swap_part:
    substitute:
      - action: exec
        name: remove_format_swap_part_step
        command: /bin/true
</pre>
 
* The file '''map.parted''', which will be passed to '''parted''', will look like this:
<pre>
mklabel gpt
mkpart partition-system ext4 0% 100%
toggle 1 boot
align-check optimal 1
</pre>
 
'''3. Customize the environment's postinstall.'''
In order for our new partitions to be mounted at boot time we will modify the Grid'5000 postinstall files.
 
* Create a working directory inside your public directory and go into it:
{{Term|location=frontend|cmd=mkdir public/custom-postinstall && cd public/custom-postinstall}}
* Then decompress the postinstall archive:
{{Term|location=frontend|cmd=<code class="command">tar</code> xzf <code class="replace">/grid5000/postinstalls/g5k-postinstall.tgz</code>}}
* Add your custom /etc/fstab file in this directory, named '''fstab''':
<pre>
/dev/sda1      /          ext4    defaults 1      2
</pre>
When you pass the "--fstab custom" option to the postinstall, it copies this file to /etc/fstab.
* Regenerate the postinstall archive:
{{Term|location=frontend|cmd=<code class="command">tar</code> -czvf <code class="replace">~/public/g5k-postinstall-custom.tgz</code> *}}
* Make some cleanup:
* Create the environment's description file (let's say '''custom-env.dsc''') based on the reference one:
** use kaenv3 -p debian10-x64-base to get an example of an environment description.
Your '''custom-env.dsc''' should look like this :
<pre>
---
name: custom-env
version: 1
description: Custom env based on Debian 10
author: me@domain.tld
visibility: shared
destructive: true
os: linux
image:
  file: server:///grid5000/images/debian10-x64-base-2020012812.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: http://public/~<login>/g5k-postinstall-custom.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab custom
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false
</pre>


This feature in explained in Kadeploy3's documentation (available on [https://gforge.inria.fr/frs/?group_id=2026 Kadeploy3's website]) in the section ''4.2.2, Use Case 10'' and ''4.7''.
'''4. Run the deployment'''


Finally, we deploy our custom environment with your custom operations:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -a custom-env.dsc -f $OAR_NODE_FILE -p 1 -k --custom-steps custom-partitioning.yml}}
{{Note|text=In some cases you should increase the step's timeout (for some long formatting for example) see [[Advanced_Kadeploy#Adjusting timeout for some environments]] for details.}}


In this example, we will modify the deployment workflow: a different partition will be used for each of the /, /usr, /var, /tmp and /home directories.
=== Example 2 : Deploy on multiple partitions ===


In Kadeploy3, we can customize the deployment's automata. It's possible to add custom ''pre, post or substitute operations'' to each step. In a custom operation it's possible to: ''send'' a file, ''execute'' a command or ''run'' a script.
In this example, we will modify the deployment workflow: a different partition will be used for each of the ''/'', ''/var'', ''/opt'' and ''/tmp'' directories.
Imagine that you want to make your own partitioning scheme like this:
Imagine that you want to make your own partitioning scheme like this:
* swap 2G on primary partition sda1
{| class="wikitable"
* root 18G on primary partition sda2
|-
* usr 30G on primary partition sda3
! Mount point !! Partition !! Disk space !! File System
* var 20G on extended partition sda5
|-
* home 20G on extended partition sda6
| swap || /dev/sda1 || 2G || linux-swap
* tmp ''everything else'' on extended partition sda7
|-
 
| / || /dev/sda2 || 18G || ext4
|-
| /var || /dev/sda3 || 30G || ext4
|-
| /opt || /dev/sda4 || 20G || ext4
|-
| /tmp || /dev/sda5 || ''everything else'' || ext4
|}


The four following sections describe how to perform such an operation.
The four following sections describe how to perform such an operation.
Line 426: Line 520:
# Our custom steps should be performed during the SetDeploymentEnv macro-step
# Our custom steps should be performed during the SetDeploymentEnv macro-step
SetDeploymentEnvUntrusted:
SetDeploymentEnvUntrusted:
   # Custom partitioning step that is substitued to the create_partition_table micro-step
   # Custom partitioning step that is substituted to the create_partition_table micro-step
   create_partition_table:
   create_partition_table:
     substitute:
     substitute:
Line 432: Line 526:
       - action: send
       - action: send
         file: map.parted
         file: map.parted
         # The variable $KADEPLOY_TMP_DIR will be substitued by kadeploy
         # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
         destination: $KADEPLOY_TMP_DIR  
         destination: $KADEPLOY_TMP_DIR  
         name: send_partition_map
         name: send_partition_map
Line 438: Line 532:
       - action: exec
       - action: exec
         name: partitioning
         name: partitioning
         # The variable $KADEPLOY_TMP_DIR will be substitued by kadeploy
         # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
         command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
         command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
   # Custom format step, done after the format_deploy_part micro-step
   # Custom format step, done after the format_deploy_part micro-step
Line 447: Line 541:
         name: format_partitions
         name: format_partitions
         file: format.sh
         file: format.sh
   # Custom mount step, done after thefmount_deploy_part micro-step
   # Custom mount step, done after the mount_deploy_part micro-step
   mount_deploy_part:
   mount_deploy_part:
     post-ops:
     post-ops:
Line 469: Line 563:
* The file '''map.parted''' will look something like this:
* The file '''map.parted''' will look something like this:
<pre>
<pre>
mklabel msdos
mklabel gpt
u GB mkpart primary linux-swap 0% 2
u GB mkpart partition-swap linux-swap 0% 2
u GB mkpart primary ext4 2 20
u GB mkpart partition-system ext4 2 20
u GB mkpart primary ext4 20 50
u GB mkpart partition-var ext4 20 50
u GB mkpart extended 50 100%
u GB mkpart partition-opt ext4 50 70
u GB mkpart logical ext4 50 70
u GB mkpart partition-tmp ext4 70 100%
u GB mkpart logical ext4 70 90
u GB mkpart logical ext4 90 100%
toggle 2 boot
toggle 2 boot
align-check optimal 1
align-check optimal 1
Line 483: Line 575:
align-check optimal 4
align-check optimal 4
align-check optimal 5
align-check optimal 5
align-check optimal 6
align-check optimal 7
</pre>
</pre>
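To double-check a partition map against the intended layout before deploying, the sizes can be derived from the start and end bounds. This is a minimal sketch under stated assumptions: <code>summarize_map</code> is a hypothetical helper, and it only understands lines of the form <code>u GB mkpart NAME FSTYPE START END</code> (the GPT form, where each partition has a name and a filesystem type).

```shell
#!/bin/sh
# Hypothetical helper: for each "u GB mkpart NAME FSTYPE START END" line,
# print the partition index, its name and its size in GB.
# An END given as a percentage (for example 100%) is reported as "rest".
summarize_map() {
    awk '$1 == "u" && $2 == "GB" && $3 == "mkpart" {
        n++
        start = ($6 == "0%") ? 0 : $6
        if ($7 ~ /%$/) { size = "rest" } else { size = ($7 - start) "GB" }
        printf "%d %s %s\n", n, $4, size
    }' "$1"
}
```

Running <code>summarize_map map.parted</code> prints one line per partition, which can be compared by eye with the partitioning scheme you planned.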
* The file '''format.sh''' will look something like this:
* The file '''format.sh''' will look something like this:
<pre>
<pre>
Line 492: Line 583:


mkfs_opts="sparse_super,filetype,resize_inode,dir_index"
mkfs_opts="sparse_super,filetype,resize_inode,dir_index"
ext4_blocksize="4096"


# create swap
mkswap ${KADEPLOY_BLOCK_DEVICE}1
mkswap ${KADEPLOY_BLOCK_DEVICE}1
# / will be formatted by Kadeploy since we will specify the -p 2 option
# / will be formatted by Kadeploy since we will specify the -p 2 option
# formatting /usr/
# formatting /var
mkfs -t ext4 -b 4096 -O $mkfs_opts -q ${KADEPLOY_BLOCK_DEVICE}3
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q ${KADEPLOY_BLOCK_DEVICE}3
# formatting /var/
# formatting /opt
mkfs -t ext4 -b 4096 -O $mkfs_opts -q ${KADEPLOY_BLOCK_DEVICE}5
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q ${KADEPLOY_BLOCK_DEVICE}4
# formatting /home/
# formatting /tmp
mkfs -t ext4 -b 4096 -O $mkfs_opts -q ${KADEPLOY_BLOCK_DEVICE}6
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q ${KADEPLOY_BLOCK_DEVICE}5
# formatting /tmp/
mkfs -t ext4 -b 4096 -O $mkfs_opts -q ${KADEPLOY_BLOCK_DEVICE}7
</pre>
</pre>
{{Note|text=When running a custom script, Kadeploy will export different variables, you can get a list of them by running "kadeploy -i".<br>A description of each of this variables is available in Kadeploy3's documentation ([https://gforge.inria.fr/frs/?group_id=2026 on Kadeploy3 website]) in the section ''4.4''}}
{{Note|text=When running a custom script, Kadeploy will export different variables, you can get a list of them by running "kadeploy -i".}}
* The file '''mount.sh''' will look something like this:
* The file '''mount.sh''' will look something like this:
<pre>
<pre>
Line 511: Line 602:


# / will be mounted in ${KADEPLOY_ENV_EXTRACTION_DIR} by Kadeploy
# / will be mounted in ${KADEPLOY_ENV_EXTRACTION_DIR} by Kadeploy
# mount /usr
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/usr
mount ${KADEPLOY_BLOCK_DEVICE}3 ${KADEPLOY_ENV_EXTRACTION_DIR}/usr/
# mount /var
# mount /var
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/var
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/var
mount ${KADEPLOY_BLOCK_DEVICE}5 ${KADEPLOY_ENV_EXTRACTION_DIR}/var/
mount ${KADEPLOY_BLOCK_DEVICE}3 ${KADEPLOY_ENV_EXTRACTION_DIR}/var/
# mount /home
# mount /opt
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/home
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/opt
mount ${KADEPLOY_BLOCK_DEVICE}6 ${KADEPLOY_ENV_EXTRACTION_DIR}/home/
mount ${KADEPLOY_BLOCK_DEVICE}4 ${KADEPLOY_ENV_EXTRACTION_DIR}/opt/
# mount /tmp
# mount /tmp
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp
mount ${KADEPLOY_BLOCK_DEVICE}7 ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp/
mount ${KADEPLOY_BLOCK_DEVICE}5 ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp/
</pre>
</pre>


'''3. Customize the environment's postinstall'''
'''3. Customize the environment's postinstall.'''


In order for our new partitions to be mounted at boot time we can modify the Grid'5000 postinstall files (this customization can also be done by adding another custom operation).
In order for our new partitions to be mounted at boot time we can modify the Grid'5000 postinstall files (this customization can also be done by adding another custom operation).


* First of all we need to know where the postinstall of our environment is located (the field postinstalls/[archive]):
* Create a temporary directory and go into it:
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> -p <code class="replace">wheezy-x64-base</code> -u deploy}}
* Then we decompress it in a temporary directory:
{{Term|location=frontend|cmd=<code class="command">tmpdir=</code>$(mktemp -d) && <code class="command">export</code> tmpdir && <code class="command">pushd</code> $tmpdir}}
{{Term|location=frontend|cmd=<code class="command">tmpdir=</code>$(mktemp -d) && <code class="command">export</code> tmpdir && <code class="command">pushd</code> $tmpdir}}
{{Term|location=frontend|cmd=<code class="command">tar</code> xzf <code class="replace">/grid5000/postinstalls/debian-x64-base-2.4-post.tgz</code>}}
* Then decompress the postinstall archive:
{{Note|text=We asume that the current shell is BASH, if not please replace the "export" instruction}}
{{Term|location=frontend|cmd=<code class="command">tar</code> xzf <code class="replace">/grid5000/postinstalls/g5k-postinstall.tgz</code>}}
* Modify the /etc/fstab file:
{{Note|text=We assume that the current shell is BASH, if not please replace the "export" instruction}}
{{Term|location=frontend|cmd=<code class="command">cat</code> dest/etc/fstab}}
* Add your custom /etc/fstab file in this temporary directory, named '''fstab''':
<pre>
<pre>
# /etc/fstab: static file system information.
/dev/sda1       none         swap   sw      0     0
#
/dev/sda3      /var         ext4   defaults 1     2
# <file system> <mount point>  <type> <options>  <dump>  <pass>
/dev/sda4       /opt          ext4   defaults 1     2
proc            /proc          proc  defaults  0      0
/dev/sda5       /tmp         ext4   defaults 1     2
sysfs          /sys            sysfs  defaults  0      0
devpts          /dev/pts        devpts gid=5,mode=620 0  0
tmpfs          /dev/shm        tmpfs  defaults  0       0
<[SWAP_PART]>  none           swap   sw         0       0
<[ROOT_PART]>  /              <[ROOT_FSTYPE]>  errors=remount-ro  0  0
/dev/sda3       /usr            ext4  defaults  1      2
/dev/sda5       /var           ext4   defaults   1       2
/dev/sda6       /home          ext4   defaults   1       2
/dev/sda7       /tmp           ext4   defaults   1       2
</pre>
</pre>
/ will be added by Kadeploy since we will specify the <code class="command">-p 2</code> option
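The extra fstab entries can also be generated from the partition numbers rather than typed by hand, which keeps them aligned with the partition map. The sketch below is only an illustration: <code>fstab_entries</code> is a hypothetical helper, and it assumes every non-swap partition is ext4 with the same options, dump and pass values as the entries shown above.

```shell
#!/bin/sh
# Hypothetical helper: print one fstab line per "NUMBER:MOUNTPOINT" argument.
# The first argument is the disk (for example /dev/sda).
# A mount point of "swap" produces a swap entry; anything else is assumed
# to be an ext4 partition mounted with the default options.
fstab_entries() {
    disk=$1
    shift
    for spec in "$@"; do
        num=${spec%%:*}   # text before the first colon: partition number
        mnt=${spec#*:}    # text after the first colon: mount point
        if [ "$mnt" = "swap" ]; then
            printf '%s%s none swap sw 0 0\n' "$disk" "$num"
        else
            printf '%s%s %s ext4 defaults 1 2\n' "$disk" "$num" "$mnt"
        fi
    done
}
```

For example, <code>fstab_entries /dev/sda 1:swap 3:/var 4:/opt 5:/tmp</code> prints entries for the swap, /var, /opt and /tmp partitions; the output can be appended to the fstab file placed in the postinstall directory.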
* Regenerate the postinstall archive:
* Regenerate the postinstall archive:
{{Term|location=frontend|cmd=<code class="command">tar</code> -czvf <code class="replace">~/custom-post.tgz</code> *}}
{{Term|location=frontend|cmd=<code class="command">tar</code> -czvf <code class="replace">~/g5k-postinstall-custom.tgz</code> *}}
* Make some cleanup:
* Make some cleanup:
{{Term|location=frontend|cmd=<code class="command">popd</code> && <code class="command">rm</code> -R $tmpdir}}
{{Term|location=frontend|cmd=<code class="command">popd</code> && <code class="command">rm</code> -R $tmpdir}}
* Create the environment's description file (let's say '''custom-env.yml''') based on the reference one:
* Create the environment's description file (let's say '''custom-env.yml''') based on the reference one:
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> -p wheezy-x64-base -u deploy <nowiki>|</nowiki> sed -e "s/archive:.*$/archive: <code class="replace">\/home\/${USER}\/custom-post.tgz</code>/" -e 's/public/shared/' > custom-env.yml}}
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> -p debian10-x64-base -u deploy <nowiki>|</nowiki> sed -e "s/archive:.*$/archive: <code class="replace">\/home\/${USER}\/g5k-postinstall-custom.tgz</code>/" -e 's/public/shared/' > custom-env.yml}}
or
and customize the '''custom-env.yml''' file to suit your needs (especially your archive path):
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> -p wheezy-x64-base -u deploy > custom-env.yml}}
{{Term|location=frontend|cmd=<code class="command">cat</code> custom-env.yml}}
<pre>
---
name: custom-env
version: 1
description: Custom env based on Debian 10
author: me@domain.tld
visibility: shared
destructive: true
os: linux
image:
  file: server:///grid5000/images/debian10-x64-base-2019100414.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: /home/me/g5k-postinstall-custom.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab custom
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false
</pre>
{{Warning|text=Do not forget the <code class="command">--fstab custom</code> option to g5k-postinstall.}}
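Before deploying, it can help to check that the description really points at the custom archive and passes <code>--fstab custom</code>. The following is a minimal sketch, not a Kadeploy tool: <code>check_env_description</code> is a hypothetical helper, and the sample fragment stands in for your real '''custom-env.yml'''.

```shell
#!/bin/sh
# Sanity-check an environment description before deploying it.
# check_env_description is a hypothetical helper, not part of Kadeploy.
check_env_description() {
    desc=$1      # path to the description file
    archive=$2   # expected postinstall archive name
    grep -q -- "archive: .*$archive" "$desc" || { echo "archive not set"; return 1; }
    grep -q -- '--fstab custom' "$desc" || { echo "--fstab custom missing"; return 1; }
    echo "ok"
}

# Sample fragment used for illustration only; point the helper at
# your own custom-env.yml in practice.
sample=$(mktemp)
cat > "$sample" <<'EOF'
postinstalls:
- archive: /home/me/g5k-postinstall-custom.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab custom
EOF

CHECK_RESULT=$(check_env_description "$sample" g5k-postinstall-custom.tgz)
echo "$CHECK_RESULT"
rm -f "$sample"
```

The helper only greps for two strings, so it catches a forgotten option or a stale archive path but does not validate the YAML itself.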


'''4. Run the deployment'''

Finally, we deploy our custom environment with our custom operations:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -a custom-env.yml -f $OAR_NODE_FILE -p 2 -k --custom-steps custom-partitioning.yml}}
{{Note|text=In some cases you should increase the step's timeout (for example, for a long formatting operation); see [[Advanced_Kadeploy#Adjusting timeout for some environments]] for details.}}


= Tuning the Kadeploy3 deployment workflow =
In Grid'5000, we use the following steps by default in all our clusters:
* <code class="env">SetDeploymentEnv</code> -> <code class=file>SetDeploymentEnvUntrusted</code> : use an embedded deployment environment
* <code class="env">BroadcastEnv</code> -> <code class=file>BroadcastEnvKascade</code> : use the Kascade tool to broadcast the environment
* <code class="env">BootNewEnv</code> -> <code class=file>BootNewEnvKexec</code> : the nodes use kexec to reboot (if it fails, a <code class=file>BootNewEnvClassical</code>, classical reboot, will be performed)

Each of these implementations is divided into micro-steps. You can see the names of those micro-steps by using the kadeploy3 option <code>--verbose-level 4</code>. To see what is actually executed during those micro-steps, add the kadeploy3 debug option <code>-d</code>:

{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f <code class=file>$OAR_FILE_NODES</code> -k -e debian10-x64-base --verbose-level 4 -d  &#62; <code class=file>~/kadeploy3_steps</code>}}

This command stores the kadeploy3 standard output in the file <code class=file>~/kadeploy3_steps</code>. Let's analyse its content:
# <code class=file>SetDeploymentEnvUntrusted</code>-<code class="replace">format_tmp_part</code>: Format the partition defined as tmp (by default, /dev/sda5)
# <code class=file>SetDeploymentEnvUntrusted</code>-<code class="replace">format_swap_part</code>: Format the swap partition
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">send_environment</code>: Send your environment to the node and untar it into the deployment partition.
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">manage_admin_post_install</code>: Execute post installation instructions defined by the site admins, in general to adapt to the specificities of the cluster: console baud rate, Infiniband,...
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">manage_user_post_install</code>: Execute user-defined post installation instructions to automatically configure the node depending on its cluster, site, network capabilities, disk capabilities,...
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">send_key</code>: Send the user's public ssh key(s) to the node (if the user specified it with the option <code>-k</code>).
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">install_bootloader</code>: Properly configure the bootloader
# <code class=file>BootNewEnvKexec</code>-<code class="replace">switch_pxe</code>: Configure the PXE server so that this node will boot on the partition where your environment has been installed
# <code class=file>BootNewEnvKexec</code>-<code class="replace">umount_deploy_part</code> : Unmount the deployment partition from the directory where it was mounted during step 7.


That is it. You now know all the default micro-steps used to deploy your environments.
{{Note|text=It is recommended to consult the [[Grid5000:Node storage]] page to understand which partition is used at which step.}}

== Adjusting timeout for some environments ==
All default timeouts are set in the configuration files on the kadeploy3 server. You can consult the default timeouts of each macro-step by using the command <code class="command">kastat3</code>:

{{Term|location=frontend|cmd=<code class="command">kastat3</code> -I}}
   Kadeploy server configuration:
  Custom PXE boot method: PXElinux
  Automata configuration:
    hercule:
      SetDeploymentEnv: SetDeploymentEnvUntrusted,1,600
      BroadcastEnv: BroadcastEnvKascade,0,1000
      BootNewEnv: BootNewEnvKexec,0,180; BootNewEnvHardReboot,0,900
    nova:
      SetDeploymentEnv: SetDeploymentEnvUntrusted,1,600
      BroadcastEnv: BroadcastEnvKascade,0,1000
      BootNewEnv: BootNewEnvKexec,0,150; BootNewEnvHardReboot,0,600
   ...

<code class="command">kadeploy3</code> allows users to change timeouts on the command line. In some cases, when you try to deploy an environment with a large tarball or with a post-install that lasts too long, you may get discarded nodes. This false positive behavior can be avoided by manually modifying the timeouts for each step at deployment time.

For instance, if the timeouts of each step are:
* <code class=file>SetDeploymentEnvUntrusted</code>: 143
* <code class=file>BroadcastEnvKascade</code>: 111
* <code class=file>BootNewEnvKexec</code>: 33

You can increase the timeout of the second step to 1200 seconds with the following command:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e my_big_env -f <code class="env">$OAR_FILE_NODES</code> -k --force-steps "SetDeploymentEnv&#124;SetDeploymentEnvUntrusted:1:450&BroadcastEnv&#124;BroadcastEnvKascade:1:1200&BootNewEnv&#124;BootNewEnvClassical:1:400"}}
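The <code>--force-steps</code> argument in the command above is a list of <code>MacroStep|Implementation:retries:timeout</code> entries joined with <code>&</code>. As a sketch only, the string can be assembled by a small helper so that each timeout is easy to change; <code>build_force_steps</code> is a hypothetical function name, and the script prints the resulting command instead of running it.

```shell
#!/bin/sh
# Join "MacroStep|Implementation:retries:timeout" entries with '&'.
# build_force_steps is a hypothetical helper, not a Kadeploy command.
build_force_steps() {
    out=""
    for step in "$@"; do
        if [ -z "$out" ]; then
            out="$step"
        else
            out="$out&$step"
        fi
    done
    printf '%s\n' "$out"
}

FORCE_STEPS=$(build_force_steps \
    "SetDeploymentEnv|SetDeploymentEnvUntrusted:1:450" \
    "BroadcastEnv|BroadcastEnvKascade:1:1200" \
    "BootNewEnv|BootNewEnvClassical:1:400")

# Print the command rather than executing it, so the sketch can be
# tried outside of a deploy job.
echo "kadeploy3 -e my_big_env -f \$OAR_FILE_NODES -k --force-steps \"$FORCE_STEPS\""
```

Quoting the whole string matters: both <code>|</code> and <code>&</code> are shell metacharacters and would otherwise be interpreted by the shell.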


== Set Break-Point during deployment ==
Moreover, <code class="command">kadeploy3</code> allows users to set a break-point during deployment.

{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f <code class="env">$OAR_FILE_NODES</code> -k -e debian10-x64-base --verbose-level 4  -d --breakpoint <code class=file>BroadcastEnvKascade</code>:<code class="replace">manage_user_post_install</code>}}

This command can be used for debugging purposes. It performs a deployment with verbose level 4 and stops the deployment workflow just '''''before''''' executing the ''manage_user_post_install'' micro-step of the ''BroadcastEnvKascade'' macro-step. You can then connect to the deployment environment and manually run the user post-install script to debug it.

{{Warning|text=In the current state of <code class="command">kadeploy3</code>, it is not possible to resume the deployment from the break-point step. You will therefore have to redeploy your environment from the first step. This feature will be implemented in a future version of <code class="command">kadeploy3</code>.}}
* Modify the workflow: [[Advanced_Kadeploy#Use a custom partitioning scheme]]

{{Note|text=When running a custom script, Kadeploy will export different variables; you can get a list of them by running <code class="command">kadeploy3 -I</code>.<br>A description of each of these variables is available in Kadeploy3's documentation ([https://gforge.inria.fr/frs/?group_id=2026 on Kadeploy3 website]) in section ''4.4''}}
 
= Customizing the postinstalls =
In Kadeploy3, postinstalls are scripts that are executed after the copy of the image file in order to customize site-specific or cluster-specific aspects.
Since the beginning of 2018, the same postinstall script (called g5k-postinstall) is used on Grid'5000 for all reference environments (and is thus compatible with all supported Debian versions and distributions). That script takes parameters that define its behaviour (for example, to choose the style of network configuration to use).
 
The source code for g5k-postinstall is [https://gitlab.inria.fr/grid5000/g5k-postinstall/tree/master/g5k-postinstall available on gitlab.inria.fr]. Its parameters at the time of writing are:
Usage: kadeploy3 [options]
Contact: kadeploy3-users@lists.gforge.inria.fr
Generic options:
  -h, --help                  Show this message
  -v, --version                Get the server's version
  -I, --server-info            Get information about the server's configuration
  -M, --multi-server          Activate the multi-server mode
  -H, --[no-]debug-http        Debug HTTP communications with the server (can be redirected to the fd #4)
  -S, --server STRING          Specify the Kadeploy server to use
      --[no-]dry-run          Perform a dry run
      --password [PASSWORD]    Provide a password for HTTP Basic authentication (read from STDIN if not
                              specified)
  -d, --[no-]debug-mode        Activate the debug mode (can be redirected to the fd #3)
  -f, --file MACHINELIST      A file containing a list of nodes (- means stdin)
  -m, --machine MACHINE        The node to run on
  -n FILENAME,                File that will contain the nodes on which the operation has not been
                              correctly performed
      --output-ko-nodes
  -o FILENAME,                File that will contain the nodes on which the operation has been correctly
                              performed
      --output-ok-nodes
  -s, --script FILE            Execute a script at the end of the operation
  -V, --verbose-level VALUE    Verbose level (between 0 to 5)
      --write-workflow-id FILE Write the workflow id in a file
      --[no-]wait              Wait the end of the operation, set by default
      --[no-]force            Allow to deploy even on the nodes tagged as currently used (use this only
                              if you know what you do)
      --breakpoint STEP        Set a breakpoint just before lauching the given micro-step, the syntax is
                              macrostep:microstep (use this only if you know what you do)
      --custom-steps FILE      Add some custom operations defined in a file
      --[no-]hook              Launch server's hook at the end of operation, disabled by default
General options:
  -a, --env-file ENVFILE      File containing the environment description
  -b BLOCKDEVICE,              Specify the block device to use
      --block-device
  -c, --boot-partition NUMBER  Specify the number of the partition to boot on (use 0 to boot on the MBR)
  -e, --env-name ENVNAME      Name of the recorded environment
  -k, --key [FILE]            Public key to copy in the root's authorized_keys, if no argument is
                              specified, use ~/.ssh/authorized_keys
  -p NUMBER,                  Specify the partition number to use
      --partition-number
  -r, --reformat-tmp FSTYPE    Reformat the /tmp partition with the given filesystem type (this
                              filesystem need to be supported by the deployment environment)
  -u, --env-user USERNAME      Specify the user that own the recorded environment
      --vlan VLANID            Set the VLAN
  -w, --set-pxe-profile FILE  Set the PXE profile (use with caution)
      --set-pxe-pattern FILE  Specify a file containing the substitution of a pattern for each node in
                              the PXE profile (the NODE_SINGULARITY pattern must be used in the PXE
                              profile)
  -x, --upload-pxe-files FILES Upload a list of files (file1,file2,file3) to the PXE repository. Those
                              files will then be available with the prefix FILES_PREFIX--
      --env-version NUMBER    Version number of the recorded environment
Advanced options:
      --no-kexec              Disable kexec reboots during the deployment process
      --disable-bootloader-install
                              Disable the automatic installation of a bootloader for a Linux based
                              environnment
      --disable-disk-partitioning
                              Disable the disk partitioning
      --reboot-classical-timeout VALUE
                              Overload the default timeout for classical reboots (a ruby expression can
                              be used, 'n' will be replaced by the number of nodes)
      --reboot-kexec-timeout VALUE
                              Overload the default timeout for kexec reboots (a ruby expression can be
                              used, 'n' will be replaced by the number of nodes)
      --force-steps STRING    Undocumented, for administration purpose only
      --[no-]secure            Use a secure connection to export files to the server
 
An example environment description using g5k-postinstall is:
$ kaenv3 -p debian10-x64-min
---
name: debian10-x64-min
version: 2019100414
description: debian 10 (buster) - min
author: support-staff@list.grid5000.fr
visibility: public
destructive: false
os: linux
image:
  file: server:///grid5000/images/debian10-x64-min-2019100414.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  compression: gzip
  script: g5k-postinstall --net debian
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false
 
Things that you can do are:
* Use a customized postinstall script: just point the postinstalls/archive/ field to another tar archive. (see [https://gitlab.inria.fr/grid5000/g5k-postinstall/blob/master/g5k-postinstall/README.md README.md] and [[TechTeam:Postinstalls]] for details)
* Use different parameters to change the behaviour of the postinstall. Example parameters for various situations are:
** Debian min environment with traditional NIC naming: <tt>g5k-postinstall --net debian --net traditional-names</tt>
** Debian min environment with predictable NIC naming: <tt>g5k-postinstall --net debian</tt>
** Debian NFS environment (mount /home, setup LDAP, restrict login to user who reserved the node): <tt>g5k-postinstall --net debian --fstab nfs --restrict-user current</tt>
** Debian big environment (NFS + setup HPC networks and mount site-specific directories): <tt>g5k-postinstall --net debian --net traditional-names --net hpc --fstab nfs --fstab site-specific</tt>
** RHEL/Centos style for network configuration: <tt>g5k-postinstall --net redhat --net traditional-names</tt>
** Ubuntu 1710 or later: NetPlan for network configuration: <tt>g5k-postinstall --net netplan</tt>
** Do not do any network configuration (useful for Gentoo), but force serial console settings: <tt>g5k-postinstall --net=none --inittab='s0:12345:respawn:/sbin/agetty -L SPEED TTYSX vt100'</tt>
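Changing the postinstall parameters amounts to rewriting the <code>script:</code> line of the description. The sketch below applies the NFS parameters listed above with <code>sed</code>; the sample fragment is an assumption standing in for the output of <code>kaenv3 -p debian10-x64-min -u deploy</code>, which you would pipe into the same <code>sed</code> command in practice.

```shell
#!/bin/sh
# Sample description fragment (illustration only); on a frontend this
# would come from: kaenv3 -p debian10-x64-min -u deploy
desc=$(mktemp)
cat > "$desc" <<'EOF'
name: debian10-x64-min
visibility: public
postinstalls:
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  compression: gzip
  script: g5k-postinstall --net debian
EOF

# Replace the postinstall parameters and make the environment shared.
NEW_DESC=$(sed \
    -e 's/^\(  script: g5k-postinstall\).*$/\1 --net debian --fstab nfs --restrict-user current/' \
    -e 's/^visibility: public$/visibility: shared/' \
    "$desc")

printf '%s\n' "$NEW_DESC"
rm -f "$desc"
```

Anchoring the pattern on the <code>script:</code> key keeps the rest of the description untouched, so the same command works whatever parameters the reference environment currently uses.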
 
= Multisite deployment =
 
To deploy on nodes from different sites, you just have to use the multi-server option of kadeploy3: <code>-M</code>. For example:
 kadeploy3 -M -f file_with_all_nodes -e debian10-x64-std
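The file given to <code>-f</code> has to list the nodes of every site. A minimal sketch of how such a file could be built from per-site node lists follows; the host names and per-site file names are made up for illustration, and the duplicate lines mimic node files that list a host several times.

```shell
#!/bin/sh
# Build one node file from per-site lists (all names are hypothetical).
workdir=$(mktemp -d)

# Per-site node lists; a host may appear several times in such files.
printf '%s\n' node-1.nancy.grid5000.fr node-1.nancy.grid5000.fr \
              node-2.nancy.grid5000.fr > "$workdir/nancy_nodes"
printf '%s\n' node-1.rennes.grid5000.fr node-1.rennes.grid5000.fr \
              > "$workdir/rennes_nodes"

# Merge the lists and keep each host once.
sort -u "$workdir/nancy_nodes" "$workdir/rennes_nodes" > "$workdir/file_with_all_nodes"
NODE_COUNT=$(wc -l < "$workdir/file_with_all_nodes" | tr -d ' ')

# Print the deployment command instead of running it.
echo "kadeploy3 -M -f $workdir/file_with_all_nodes -e debian10-x64-std  # $NODE_COUNT nodes"
```

The temporary directory is left in place so that the generated node file can be inspected or passed to the printed command.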

Revision as of 07:27, 12 May 2020

Note

This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team.

Warning

Please also read the Environments creation using Kameleon and Puppet tutorial, which provides automated mechanisms to build kadeploy environments.

What you need to know before starting

The first thing to understand is that by using kadeploy3, you will be running a command that attempts to remotely reboot many nodes at the same time, and boot them using configuration files hosted on a server. On some clusters, this operation has a non-zero failure rate. You might therefore experience failures on some operations during this tutorial. In this case, retry. The system doesn't retry for you as this implies waiting for long timeouts in all cases, even those where a 90% success rate is sufficient.

What is an Environment?

Where we describe what exactly an image, kernel, initrd and postinstall are

An environment in kadeploy3 is a set of files describing a fully functional Operating System. To be able to set up an Operating System, kadeploy3 needs at least 4 files in the most common cases:

  1. An image
    • An image is a file containing all the Operating System files. It can be a compressed archive (e.g. a tgz file) or a dump of a device (e.g. a dd file). In this tutorial, you will learn to build new images for Kadeploy3.
  2. A kernel file
    • For Unix-based environments, the kernel file specifies which kernel to boot. It is the full path of the kernel file.
  3. An initrd file (optional)
    • For Linux-based environments, the optional initrd file allows the use of an initial ramdisk, which will be used as the root filesystem during the boot sequence. More information: Initrd on Wikipedia
  4. A postinstall file (optional)
    • The postinstall file allows you to correctly configure the specificities of each cluster. It is not mandatory to specify it for a Kadeploy3 environment, but if you know what you are doing, feel free to define it.

Once you have this set of files, you can describe your environment to kadeploy3. This description represents an environment in the kadeploy3 sense.

How can I make my own environment?

To create our own environment there are two main ways. One way is to deploy an existing environment and the other way is to create an environment from scratch from a classical ISO installation. In both situations you can customize and save your environment in order to use it again later.

Search and deploy an existing environment

Search an environment

Grid'5000 maintains several reference environments directly available in all sites. These environments are based on Debian, Ubuntu and Centos.

For Debian, different variants of reference environments are offered. For Ubuntu and Centos, only environments with a minimal system are offered.

They are called reference environments because they can be used to generate customized environments.

The description of the reference environments can be found here: Getting_Started#Deploying_nodes_with_Kadeploy

An environment registry is maintained in each site (see kaenv3), with the associated filesystem images stored in the /grid5000 directory of the frontend.

To deploy a registered environment, you must know its name as registered in the Kadeploy database. It is the first information on the environment description page. This tutorial uses the debian10-x64-base environment.

You can also list all available environments in a site by using the kaenv3 command:

frontend:
kaenv3 -l

This command lists all public as well as your private environments.

We distinguish three levels of visibility for an environment :

  • public: All users can see those environments. Only administrators can tag them this way.
  • shared: Every user can see the environment provided they use the -u option to specify the user the environment belongs to.
  • private: The environment is only visible by the user the environment belongs to.

For example, a shared environment added by user user is listed this way :

frontend:
kaenv3 -l -u user

You can also look for a specific version with the --env-version option. All the versions of the environments can be found in /grid5000/images. The version number is the last part of the tgz file.

For example : debian10-x64-min-2019100414.tgz => it's the min debian10-x64 reference environment version 2019100414.
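The naming convention above can be sketched with plain shell parameter expansion, splitting an image file name into the environment name and its version. The variable names are illustrative only.

```shell
#!/bin/sh
# Split an image file name into environment name and version.
image="debian10-x64-min-2019100414.tgz"

base=${image%.tgz}        # drop the .tgz extension
ENV_VERSION=${base##*-}   # text after the last '-'
ENV_NAME=${base%-*}       # text before the last '-'

echo "$ENV_NAME version $ENV_VERSION"
```

The resulting version number is the value expected by the --env-version option mentioned above.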

Being able to reproduce experiments is a desirable feature. You should therefore control the environment an experiment runs in as much as possible, and check that the environment chosen in the environment directory is the one available on a given cluster. On the cluster where you would like to deploy, type the following command to print information about an environment:

frontend:
kaenv3 -p debian10-x64-base -u deploy

You must specify the user option. In our case, all public environments belong to user deploy.

In theory, you should also check the post-install script. A post-install script adapts an environment to the site it is deployed on.

If everything seems ok, please proceed to the next step.

Make a job on a deployable node

By default, Grid'5000 nodes are running on the production environment, which already contains most of the important features and can be used to run experiments. But you will not have administrative privileges (root privileges) on these nodes. So you will not be able to customize these environments at will. In fact, only reference environments can be customized at will. But to have the right to deploy a reference environment on a node, you must supply the option -t deploy when submitting your job.

For this part of the tutorial, the job will be interactive (-I), of the deploy type (-t deploy), on only one machine (-l nodes=1) to do environment customization (we will give ourselves 3 hours with -l walltime=3). This gives us the following command, which opens a new shell session on the frontend node:

frontend:
oarsub -I -t deploy -l nodes=1,walltime=3

Since not all Grid'5000 nodes necessarily have console access, it is recommended in the context of this tutorial to add the option rconsole="YES" to your reservation command.

frontend:
oarsub -I -t deploy -l {"rconsole='YES'"}/nodes=1,walltime=3

Indeed, when you submit a job of the deploy type, a new shell is opened on the frontend node and not on the first machine of the job as for standard jobs. When you exit from this shell, the job ends. The shell is populated with OAR_* environment variables. You should look at the list of available variables to get an idea of the information you can use to script deployment later. As usual, if the job is successful, you will get the name of the machine allocated to your job with:

frontend:
cat $OAR_FILE_NODES
Warning

At the end of a reservation with the options -t deploy, the reserved nodes will be restarted to boot on the standard environment and thus be available to any other user. So you should only use this option -t deploy when you actually intend to deploy a reference environment on the reserved nodes.

Deploy a reference environment

To deploy your environment, you must discover the nodes you were allocated by OAR. The simplest way of doing this is to look at the content of the file whose name is stored in $OAR_FILE_NODES (this variable is also available as $OAR_NODE_FILE) or at the messages displayed when the job was made. This variable simply stores the path of the file containing the FQDN of all your reserved nodes. Deployment happens when you run the following command:

frontend:
kadeploy3 -e debian10-x64-base -m node.site.grid5000.fr

You can automate this to deploy on all nodes of your job with the -f option:

frontend:
kadeploy3 -e debian10-x64-base -f $OAR_FILE_NODES


In order to be able to connect to the node (as root), you must use the -k option in one of two ways:

  • You can either specify the public key that will be copied to /root/.ssh/authorized_keys on the deployed nodes:
frontend:
kadeploy3 -e debian10-x64-base -f $OAR_FILE_NODES -k ~/.ssh/my_special_key.pub
  • Or you can supply the -k option without argument. This will automatically copy your ~/.ssh/authorized_keys and replace the /root/.ssh/authorized_keys file on the deployed nodes.
frontend:
kadeploy3 -e debian10-x64-base -f $OAR_FILE_NODES -k

The second case is actually the simplest way. One of its advantages is that after deployment, you will be able to connect directly from your local computer to the deployed nodes, the same way you connect to the frontend of the site where those nodes are.
Once kadeploy has run successfully, the allocated node is deployed under the debian10-x64-base environment. It will then be possible to tune this environment according to your needs.

Note

It is not necessary here, but you can specify the destination partition with the -p option. The Grid5000:Node storage page gives all the information about the partition table used on G5K.

Connect to the deployed environment and customize it

1. Connection

On reference environments managed by the staff, you can use root account for login through ssh (kadeploy checks that sshd is running before declaring a deployment successful). To connect to the node type :

frontend:
ssh root@node.site.grid5000.fr

In case this doesn't work, please take a look at the kadeploy section of the Sidebar > FAQ

2. Adding software to an environment

Where you learn to install software using the package repository of your distribution on Grid'5000

You can now update your environment, for instance to add missing libraries that you need, or to remove packages that you don't need (which shrinks the image and speeds up the deployment process), using:

Terminal.png node:
apt-get update
apt-get upgrade
apt-get install <list of desired packages and libraries>
apt-get --purge remove <list of unwanted packages>
apt-get clean

Create a new environment from a customized environment

We now need to save this customized environment so that your changes are available again each time you deploy it.
The first step to create an environment is to create an archive of the node you just customized. Because of the various implementations of the /dev filesystem tree, this can be a more or less complex operation.

1. Use the provided tools

You can use tgz-g5k to extract a Grid'5000 environment tarball from a running node.

Terminal.png frontend:
tgz-g5k -m node -f ~/path_to_myimage.tgz

This will create the file path_to_myimage.tgz in your home directory on the frontend.

Note.png Note

Please consider the following:

  • If you want to extract a tarball from the Grid'5000 standard environment (i.e., a non-deployed job), you will need to add the option -o to use oarsh/oarcp instead of ssh/scp
  • If you want tgz-g5k to access the node with a custom user id, you can use the option -u myCustomId (default is root)
  • You can find more information on tgz-g5k (e.g., available options, command line examples) by executing tgz-g5k -h. Some implementation details are also available on the man page (man tgz-g5k).

2. Describe the newly created environment for deployments

Kadeploy3 works using an environment description. The easiest way to create a description for your new environment is to change the description of the environment it is based on. We have based this tutorial on the debian10-x64-base environment of user deploy. We therefore print its description to a file that will be used as a good basis:

Terminal.png frontend:
kaenv3 -p debian10-x64-base -u deploy > mydebian10-x64-base.env

It should be edited to change the name, description, author lines, as well as the tarball line. Since the tarball is local, the path should be a simple absolute path (without a leading server://). The visibility line should be removed, or changed to shared or private. Once this is done, the newly created environment can be deployed using:

Terminal.png frontend:
kadeploy3 -f $OAR_NODEFILE -a mydebian10-x64-base.env

This kind of deployment is called an anonymous deployment because the description is not recorded in the Kadeploy3 database. It is particularly useful while you are tuning your environment and have to update the environment tarball several times.
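
The edits to the description file mentioned above can also be scripted. The following is a minimal sketch, assuming the description printed by kaenv3 uses the same YAML layout as the description files shown later on this page (top-level name, author and visibility lines, and an indented file line under image). The function name customize_env and the values in the usage example are hypothetical; the description line is left for manual editing.

```shell
# customize_env NAME AUTHOR TARBALL: read an environment description on stdin
# and print a copy with the name, author, visibility and image file rewritten.
# Hypothetical helper; the arguments must not contain "|" or "&", which are
# special in the sed expressions below.
customize_env() {
    sed -e "s|^name:.*|name: $1|" \
        -e "s|^author:.*|author: $2|" \
        -e "s|^visibility:.*|visibility: private|" \
        -e "s|^\([[:space:]]*\)file:.*|\1file: $3|"
}
```

It could be used as kaenv3 -p debian10-x64-base -u deploy | customize_env mydebian10-base me@domain.tld /home/me/path_to_myimage.tgz > mydebian10-x64-base.env, after which the resulting file should still be reviewed by hand.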

Once your customized environment is successfully tuned, you can save it to Kadeploy3 database so that you can directly deploy it with kadeploy3, by specifying its name:

Terminal.png frontend:
kaenv3 -a mydebian10-x64-base.env

and then (if your environment is named "mydebian10-base"):

Terminal.png frontend:
kadeploy3 -f $OAR_NODEFILE -e mydebian10-base

The kaenv3 command lets you manage your environments. Please refer to its documentation for an overview of its features.

Deploy an environment from a classical ISO installation

First of all, this method is OS independent, so you can create Kadeploy3 tgz (Linux-based systems) or ddgz (other systems) images for any kind of OS from a CD/DVD ISO.

This procedure uses virtualization (KVM) to boot the CD/DVD ISO of the OS installer. The system is installed on a physical partition of a node and then copied as a Kadeploy3 system image.

To be sure the installed system will be bootable, we will make the OS installer install the system on the Grid'5000 deployment partition (sda3).

To make this possible, we will deploy the hypervisor's system on the temporary partition and then install the system on the deployment partition.

Preparation

  • Download the CD/DVD ISO of your OS installer (say OS_ISO) and upload it to the frontend of the target site
Terminal.png local:
scp OS_ISO USERNAME@access.grid5000.fr:SITE/
  • Make a reservation of 1 node in deployment mode and destructive mode (to be able to deploy on the temporary partition); two hours should be enough.
Terminal.png frontend:
oarsub -I -t deploy -t destructive -l nodes=1,walltime=2
  • Deploy a minimal system on the node's temporary partition, use your ssh key (since Grid'5000 is more Debian friendly, let's say Debian 10 Buster)
Terminal.png frontend:
kadeploy3 -e debian10-x64-min -p 5 -k -f $OAR_NODEFILE
  • Connect to the node and install the needed packages
Terminal.png frontend:
ssh root@NODE

Run the CD/DVD OS installer using KVM/VNC

  • Preparation
    Copy the OS's ISO to the node
Terminal.png frontend:
scp OS_ISO root@NODE:
  • Connect to the node and install tigervnc-viewer, parted and kvm on the system.

First update the package lists

Terminal.png NODE:
apt-get -y update

Then install the needed packages

Terminal.png NODE:
apt-get install -y tigervnc-viewer parted qemu-kvm
  • Clean the old system's partition /dev/sda3
Terminal.png NODE:
echo -e 'd\n3\nn\np\n\n\nt\n3\n0\nw\n' | fdisk /dev/sda && sync && partprobe /dev/sda
  • Launch the virtual machine, booting on the OS's ISO, using VNC output
Terminal.png NODE:
kvm -drive file=/dev/sda -cpu host -m 1024 -net nic,model=e1000 -net user -k fr -vnc :1 -boot d -cdrom OS_ISO
Note.png Note

It is currently hard to build an image with KVM for nodes whose network devices need specific drivers (bnx2, ...).

To make sure your node's network device is compatible with the e1000e driver, you can check the API using:

Terminal.png frontend:
curl -s -k https://api.grid5000.fr/stable/grid5000/sites/SITE/clusters/CLUSTER/nodes/NODE

(The node has to be specified by its basename: griffon-42.nancy.grid5000.fr → griffon-42)
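
The basename, cluster and site can be derived from the node's FQDN with shell parameter expansion. The following is a minimal sketch, assuming node names of the form CLUSTER-NUMBER.SITE.grid5000.fr and the URL layout of the curl command above; the function name node_api_url is hypothetical.

```shell
# node_api_url FQDN: print the API URL of a node from its FQDN,
# e.g. griffon-42.nancy.grid5000.fr.
# Hypothetical helper; assumes names like CLUSTER-NUMBER.SITE.grid5000.fr.
node_api_url() {
    fqdn=$1
    node=${fqdn%%.*}      # basename, e.g. griffon-42
    rest=${fqdn#*.}       # e.g. nancy.grid5000.fr
    site=${rest%%.*}      # e.g. nancy
    cluster=${node%-*}    # e.g. griffon
    printf 'https://api.grid5000.fr/stable/grid5000/sites/%s/clusters/%s/nodes/%s\n' \
        "$site" "$cluster" "$node"
}
```

The result can then be passed to curl, for instance curl -s -k "$(node_api_url griffon-42.nancy.grid5000.fr)".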

  • Connect to the node using SSH with X11 forwarding and display the screen of the virtual machine using VNC
Terminal.png laptop:
ssh -Y root@NODE.SITE.g5k 'vncviewer :1'
Note.png Note

If your OS installer changes the screen resolution, vncviewer will be closed. Just relaunch the command to get the screen back.

Warning.png Warning

Installation process IMPORTANT instructions:

  • You MUST NOT format the /dev/sda disk. When you are asked about partitioning the disk, select manual (and not guided).
  • You MUST install your system on /dev/sda3. Create the partition if it does not exist. The mount point of this partition should be "/".
  • The size of the system installed on this partition should not exceed the default size of the Grid'5000 deployment partition (16GiB).
  • You must install an SSH server.
  • The bootloader should be installed on the partition /dev/sda3 and not on the MBR.

  • Install the system
Note.png Note

After the installation process, the virtual machine will fail to boot; this is normal. You can close vncviewer and kvm.

Customize the OS

  • Create the mounting point directory
Terminal.png frontend:
ssh root@NODE 'mkdir -p /mnt/myos'
  • Mount the partition
Terminal.png frontend:
ssh root@NODE 'partprobe'
Terminal.png frontend:
ssh root@NODE 'mount /dev/sda3 /mnt/myos'
Note.png Note

If this command fails, you can try to use this command first: partprobe /dev/sda

  • If you want to:
    • add a custom script that dynamically sets up your node's hostname, take a look at /mnt/myos/etc/dhcp/dhclient-enter-hooks.d/.
    • disable SELinux (example for CentOS), put "SELINUX=disabled" in /mnt/myos/etc/selinux/config.
    • modify the DNS configuration, edit /mnt/myos/etc/resolv.conf.
  • Unmount the partition
Terminal.png frontend:
ssh root@NODE 'umount /mnt/myos'

Create a Kadeploy3 image of the filesystem of our OS

  • Save the filesystem in a tgz archive with tgz-g5k
Terminal.png frontend:
tgz-g5k -m NODE -r /dev/sda3 -f IMAGE_FILE.tgz
  • Create an environment description file IMAGE_DESC_FILE such as:
---
name: IMAGE_NAME
version: 1
description: My OS Image
author: me@domain.tld
visibility: private
destructive: false
os: linux
image: 
  kind: tar
  compression: gzip
  file: /path/to/IMAGE_FILE.tgz
postinstalls: 
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  compression: gzip
  script: g5k-postinstall --net debian
boot: 
  kernel: /path/to/the/kernel/in/the/image
  initrd: /path/to/the/initrd/in/the/image
multipart: false
filesystem: ext4
partition_type: 0x83
Note.png Note

You can use Grid'5000 files for the postinstall

Note.png Note

For Linux systems, most of the time the path to the kernel file is /vmlinuz and the path to the initrd is /initrd.img. You can locate those files by connecting to NODE (mounting the partition as in the previous section) and checking the /mnt/myos directory.

  • Test it!
Terminal.png frontend:
kadeploy3 -a IMAGE_DESC_FILE -m NODE -k
  • Add it to kadeploy (so that you can use the parameter -e IMAGE_NAME like the default g5k environments)
Terminal.png frontend:
kaenv3 -a IMAGE_DESC_FILE
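
To fill in the kernel and initrd lines of the description file, the candidate files can be listed on NODE while the partition is mounted on /mnt/myos. The following is a minimal sketch; the function name find_boot_files is hypothetical, and the locations it scans (the image root and /boot) are an assumption about common Linux layouts.

```shell
# find_boot_files ROOT: list kernel and initrd candidates under a mounted
# root filesystem, as paths relative to the image root (the form used by
# the kernel: and initrd: lines of the description file).
# Hypothetical helper for illustration.
find_boot_files() {
    root=${1%/}
    for f in "$root"/vmlinuz* "$root"/initrd* "$root"/boot/vmlinuz* "$root"/boot/initr*; do
        # Unmatched globs stay literal, so keep only entries that exist
        # (including symlinks, even if their target is outside ROOT).
        if [ -e "$f" ] || [ -L "$f" ]; then
            printf '%s\n' "${f#"$root"}"
        fi
    done
}
```

On NODE, find_boot_files /mnt/myos prints paths such as /vmlinuz when these files exist in the image.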

Use disk(s) as I want

In some cases, Kadeploy's default handling of partitions is too limited and we need to use disks as we want (e.g. to deploy our environment in an optimal way). There are two main ways to do that:

  • simply deploy on another existing partition (sda2 or sda5)
  • repartition disks entirely and/or use several disks (such as sdb or sdc on hercule cluster)

Deploy on sda2 or sda5

First, as this kind of deployment will break the node's standard operation, you must tell OAR that the node should be redeployed entirely after the reservation with the -t destructive option:

Terminal.png frontend:
oarsub -t deploy -t destructive -l nodes=1,walltime=1 -p "cluster='hercule'" -I

Then you can deploy on sda2 or sda5 with the -p [2,5] option:

Terminal.png frontend:
kadeploy3 -e debian10-x64-nfs -f $OAR_NODEFILE -p 2 -k

Deploy on additional disks

Warning.png Warning

Currently broken

First, as this kind of deployment will break the node's standard operation, you must tell OAR that the node should be redeployed entirely after the reservation with the -t destructive option:

Terminal.png frontend:
oarsub -t deploy -t destructive -l nodes=1,walltime=1 -p "cluster='hercule'" -I

Then you can deploy on an additional disk such as sdb with the -b sdb option:

Terminal.png frontend:
kadeploy3 -e debian10-x64-base -f $OAR_NODEFILE -b sdb -k

Format additional disks

In Kadeploy3, we can easily customize the deployment automaton. It is possible to add custom pre, post or substitute operations to each step. A custom operation can send a file, execute a command or run a script.

This feature is explained in Kadeploy3's documentation (available on Kadeploy3's website) in the section 4.2.2, Use Case 10 and 4.7.

In this example, we will add some custom operations to the deployment workflow: our nodes have two additional hard disks and we want them to be formatted during the deployment process.

We want a new partitioning scheme such as:

  • classical grid5000 partitioning on sda
  • data1 ext4 on sdb1
  • data2 ext2 on sdc1

The three following sections describe how to perform such an operation.

1. Make the reservation in destructive mode

First of all, when you make your reservation, you must tell OAR that it should redeploy the node entirely after the reservation with the -t destructive parameter:

Terminal.png frontend:
oarsub -t deploy -t destructive -l nodes=1,walltime=2 -p "cluster='hercule'" -I

2. Describe the custom operations

After that, you have to create a file that describes the custom operations you want performed during the deployment. In our example we will first repartition the additional disks (using parted) and then format them (using the script format.sh).

  • The operation description file (let's say custom-partitioning.yml) should look something like this:
---
# Our custom steps should be performed during the SetDeploymentEnv macro-step
SetDeploymentEnvUntrusted:
  # Custom partitioning step, done after the create_partition_table micro-step
  # In this sample the step is split into 4 operations, but it could be done in 1 using a single parted command
  create_partition_table:
      post-ops:
        # We send a file on the node
        - action: send
          file: sdb.parted
          # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
          destination: $KADEPLOY_TMP_DIR 
          name: send_partition_map_sdb
        # Then we execute the parted command using the previously sent file
        - action: exec
          name: partitioning_sdb
          # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
          command: parted -a optimal /dev/sdb --script $(cat $KADEPLOY_TMP_DIR/sdb.parted)
        # Same operation for the second disk
        - action: send
          file: sdc.parted
          destination: $KADEPLOY_TMP_DIR 
          name: send_partition_map_sdc
        - action: exec
          name: partitioning_sdc
          command: parted -a optimal /dev/sdc --script $(cat $KADEPLOY_TMP_DIR/sdc.parted)
  # Custom format step, done after the format_deploy_part micro-step
  format_deploy_part:
      post-ops:
        # We run the script contained in the file 'format.sh'
        - action: run 
          name: format_disks
          file: format.sh
  • The file sdb.parted will look something like this:
mklabel msdos
u GB mkpart primary ext4 0% 100%
align-check optimal 1
  • The file sdc.parted will look something like this:
mklabel msdos
u GB mkpart primary ext2 0% 100%
align-check optimal 1
  • The file format.sh will look something like this:
#!/bin/sh
set -e
# formatting /dev/sdb
mkfs -t ext4 -b 4096 -O sparse_super,filetype,resize_inode,dir_index -q /dev/sdb1
# formatting /dev/sdc
mkfs -t ext2 -b 4096 -O sparse_super,filetype,resize_inode,dir_index -q /dev/sdc1

3. Run the deployment

Now you can deploy your environment with these custom operations:

Terminal.png frontend:
kadeploy3 -e debian10-x64-min -f $OAR_NODE_FILE -k --custom-steps ./custom-partitioning.yml
Warning.png Warning

In some cases you should increase the step's timeout (for a long formatting operation, for example); see Advanced_Kadeploy#Adjusting timeout for some environments for details.

Note: Neither partition is mounted at boot. To mount them, run:

Terminal.png NODE:
mkdir -p /media/data1
Terminal.png NODE:
mkdir /media/data2
Terminal.png NODE:
mount /dev/sdb1 /media/data1
Terminal.png NODE:
mount /dev/sdc1 /media/data2
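
If you prefer these partitions to be mounted at every boot, one option is to add entries to /etc/fstab on the deployed node. The sketch below only prints candidate entries; the helper name fstab_entry is hypothetical, and the "defaults 1 2" fields simply mirror the fstab examples used later on this page.

```shell
# fstab_entry DEVICE MOUNTPOINT FSTYPE: print one tab-separated fstab line.
# Hypothetical helper; "defaults 1 2" mirrors the fstab examples on this page.
fstab_entry() {
    printf '%s\t%s\t%s\tdefaults\t1\t2\n' "$1" "$2" "$3"
}

# Entries for the two data disks formatted during the deployment.
fstab_entry /dev/sdb1 /media/data1 ext4
fstab_entry /dev/sdc1 /media/data2 ext2
```

On NODE, the printed lines can be appended to /etc/fstab once the mount points exist; mount -a then mounts them without a reboot.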

Use a custom partitioning scheme

Example 1 : Deploy on the whole disk

In Kadeploy3, we can customize the deployment automaton. It is possible to add custom pre, post or substitute operations to each step. A custom operation can send a file, execute a command or run a script. In this example, we will modify the deployment workflow to deploy the system on a single disk partition ('/' on sda1).

1. Make the reservation in destructive mode

As you will change the partitioning of the disk, you must tell OAR that it should redeploy the node entirely after the reservation with the -t destructive parameter:

Terminal.png frontend:
oarsub -t deploy -t destructive -l nodes=1,walltime=2 -I

2. Describe the custom operations

After that, you have to create a file that describes the custom operations you want performed during the deployment. In this example we will create our custom partitioning scheme and bypass some steps that are not necessary to deploy the system on a single partition.

  • The operation description file (let's say custom-partitioning.yml) should look something like this:
---
# Our custom steps should be performed during the SetDeploymentEnv macro-step
SetDeploymentEnvUntrusted:
  # Custom partitioning step that replaces the create_partition_table micro-step
  create_partition_table:
    substitute:
      # We send a file on the node
      - action: send
        file: map.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map
      # Then we execute the parted command using the previously sent file
      - action: exec
        name: partitioning
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
# Hack to disable useless steps
  format_tmp_part:
    substitute:
      - action: exec
        name: remove_format_tmp_part_step
        command: /bin/true
  format_swap_part:
    substitute:
      - action: exec
        name: remove_format_swap_part_step
        command: /bin/true
  • The file map.parted, which will be passed to parted, will look like this:
mklabel gpt
mkpart partition-system ext4 0% 100%
toggle 1 boot
align-check optimal 1

3. Customize the environment's postinstall.

In order for our new partition to be mounted at boot time, we will modify the Grid'5000 postinstall files.

  • Create and go in your public directory:
Terminal.png frontend:
mkdir public/custom-postinstall && cd public/custom-postinstall
  • Then decompress the postinstall archive:
Terminal.png frontend:
tar xzf /grid5000/postinstalls/g5k-postinstall.tgz
  • Add your custom /etc/fstab file in this directory, named fstab:
/dev/sda1       /          ext4    defaults 1      2

When you pass the "--fstab custom" option to the postinstall, it copies this file to /etc/fstab.

  • Regenerate the postinstall archive:
Terminal.png frontend:
tar -czvf ~/public/g5k-postinstall-custom.tgz *
  • Clean up any temporary files you no longer need.
  • Create the environment's description file (let's say custom-env.dsc) based on the reference one:
    • use kaenv3 -p debian10-x64-base to have an example of environment description.

Your custom-env.dsc should look like this:

--- 
name: custom-env
version: 1
description: Custom env based on Debian 10
author: me@domain.tld
visibility: shared
destructive: true
os: linux
image:
  file: server:///grid5000/images/debian10-x64-base-2020012812.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: http://public/~<login>/g5k-postinstall-custom.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab custom
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false

4. Run the deployment

Finally, we deploy our custom environment with our custom operations:

Terminal.png frontend:
kadeploy3 -a custom-env.dsc -f $OAR_NODE_FILE -p 1 -k --custom-steps custom-partitioning.yml
Note.png Note

In some cases you should increase the step's timeout (for a long formatting operation, for example); see Advanced_Kadeploy#Adjusting timeout for some environments for details.

Example 2 : Deploy on multiple partitions

In Kadeploy3, we can customize the deployment automaton. It is possible to add custom pre, post or substitute operations to each step. A custom operation can send a file, execute a command or run a script. In this example, we will modify the deployment workflow: a different partition will be used for each of the /, /var, /opt and /tmp directories. Imagine that you want to define your own partitioning scheme like this:

Mount point   Partition   Disk space        File system
swap          /dev/sda1   2G                linux-swap
/             /dev/sda2   18G               ext4
/var          /dev/sda3   30G               ext4
/opt          /dev/sda4   20G               ext4
/tmp          /dev/sda5   everything else   ext4

The four following sections describe how to perform such an operation.

1. Make the reservation in destructive mode

First of all, when you make your reservation, you must tell OAR that it should redeploy the node entirely after the reservation with the -t destructive parameter:

Terminal.png frontend:
oarsub -t deploy -t destructive -l nodes=1,walltime=2 -I

2. Describe the custom operations

After that, you have to create a file that describes the custom operations you want performed during the deployment. In our example we will first apply our custom partitioning scheme, then format the partitions and mount them.

  • The operation description file (let's say custom-partitioning.yml) should look something like this:
---
# Our custom steps should be performed during the SetDeploymentEnv macro-step
SetDeploymentEnvUntrusted:
  # Custom partitioning step that replaces the create_partition_table micro-step
  create_partition_table:
    substitute:
      # We send a file on the node
      - action: send
        file: map.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR 
        name: send_partition_map
      # Then we execute the parted command using the previously sent file
      - action: exec
        name: partitioning
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
  # Custom format step, done after the format_deploy_part micro-step
  format_deploy_part:
    post-ops:
      # We run the script contained in the file 'format.sh'
      - action: run 
        name: format_partitions
        file: format.sh
  # Custom mount step, done after the mount_deploy_part micro-step
  mount_deploy_part:
    post-ops:
      # We run the script contained in the file 'mount.sh'
      - action: run 
        name: mount_partitions
        file: mount.sh
# Hack to disable useless steps
  format_tmp_part:
    substitute:
      - action: exec
        name: remove_format_tmp_part_step
        command: /bin/true
  format_swap_part:
    substitute:
      - action: exec
        name: remove_format_swap_part_step
        command: /bin/true
Note.png Note

In order for Kadeploy to be able to perform the installation correctly, every partition has to be mounted before the installation process, which is done in the macro-step BroadcastEnv.

  • The file map.parted will look something like this:
mklabel gpt
u GB mkpart partition-swap linux-swap 0% 2
u GB mkpart partition-system ext4 2 20
u GB mkpart partition-var ext4 20 50
u GB mkpart partition-opt ext4 50 70
u GB mkpart partition-tmp ext4 70 100%
toggle 2 boot
align-check optimal 1
align-check optimal 2
align-check optimal 3
align-check optimal 4
align-check optimal 5
  • The file format.sh will look something like this:
#!/bin/sh
set -e

mkfs_opts="sparse_super,filetype,resize_inode,dir_index"
ext4_blocksize="4096"

# create swap
mkswap ${KADEPLOY_BLOCK_DEVICE}1
# / will be formatted by Kadeploy since we will specify the -p 2 option
# formatting /var
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q ${KADEPLOY_BLOCK_DEVICE}3
# formatting /opt
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q ${KADEPLOY_BLOCK_DEVICE}4
# formatting /tmp
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q ${KADEPLOY_BLOCK_DEVICE}5
Note.png Note

When running a custom script, Kadeploy exports several variables; you can get a list of them by running "kadeploy -i".

  • The file mount.sh will look something like this:
#!/bin/sh
set -e

# / will be mounted in ${KADEPLOY_ENV_EXTRACTION_DIR} by Kadeploy
# mount /var
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/var
mount ${KADEPLOY_BLOCK_DEVICE}3 ${KADEPLOY_ENV_EXTRACTION_DIR}/var/
# mount /opt
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/opt
mount ${KADEPLOY_BLOCK_DEVICE}4 ${KADEPLOY_ENV_EXTRACTION_DIR}/opt/
# mount /tmp
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp
mount ${KADEPLOY_BLOCK_DEVICE}5 ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp/
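
The boundaries used in map.parted are the running totals of the sizes given in the partitioning table: 2, 2+18=20, 20+30=50 and 50+20=70 GB, with the last partition extending to 100%. The following is a minimal sketch of a generator that reproduces the mklabel and mkpart lines from a list of sizes; the function name parted_map and its NAME:FS:SIZE argument format are hypothetical, and the toggle and align-check lines still have to be added by hand.

```shell
# parted_map NAME:FS:SIZE_GB ...: print a parted script creating consecutive
# GPT partitions. A size of "rest" extends the partition to the end of the
# disk. Hypothetical generator mirroring the layout of map.parted.
parted_map() {
    echo "mklabel gpt"
    start=0
    for spec in "$@"; do
        name=${spec%%:*}
        tmp=${spec#*:}
        fs=${tmp%%:*}
        size=${tmp#*:}
        # The first partition starts at 0%, the others where the previous ended.
        if [ "$start" -eq 0 ]; then from="0%"; else from=$start; fi
        if [ "$size" = "rest" ]; then
            to="100%"
        else
            start=$((start + size))
            to=$start
        fi
        echo "u GB mkpart partition-$name $fs $from $to"
    done
}
```

Calling parted_map swap:linux-swap:2 system:ext4:18 var:ext4:30 opt:ext4:20 tmp:ext4:rest yields the mklabel and mkpart lines of the map.parted file of this example.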

3. Customize the environment's postinstall.

In order for our new partitions to be mounted at boot time we can modify the Grid'5000 postinstall files (this customization can also be done by adding another custom operation).

  • Create and go in a temporary directory:
Terminal.png frontend:
tmpdir=$(mktemp -d) && export tmpdir && pushd $tmpdir
  • Then decompress the postinstall archive:
Terminal.png frontend:
tar xzf /grid5000/postinstalls/g5k-postinstall.tgz
Note.png Note

We assume that the current shell is Bash; if not, please replace the "export" instruction.

  • Add your custom /etc/fstab file in this temporary directory, named fstab:
/dev/sda1       none          swap    sw       0      0
/dev/sda3       /var          ext4    defaults 1      2
/dev/sda4       /opt          ext4    defaults 1      2
/dev/sda5       /tmp          ext4    defaults 1      2

/ will be added by Kadeploy since we will specify the -p 2 option.

  • Regenerate the postinstall archive:
Terminal.png frontend:
tar -czvf ~/g5k-postinstall-custom.tgz *
  • Clean up:
Terminal.png frontend:
popd && rm -R $tmpdir
  • Create the environment's description file (let's say custom-env.yml) based on the reference one:
Terminal.png frontend:
kaenv3 -p debian10-x64-base -u deploy | sed -e "s/archive:.*$/archive: \/home\/${USER}\/g5k-postinstall-custom.tgz/" -e 's/public/shared/' > custom-env.yml

and customize the custom-env.yml file to suit your needs (especially your archive path):

--- 
name: custom-env
version: 1
description: Custom env based on Debian 10
author: me@domain.tld
visibility: shared
destructive: true
os: linux
image:
  file: server:///grid5000/images/debian10-x64-base-2019100414.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: /home/me/g5k-postinstall-custom.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab custom
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false
Warning.png Warning

Do not forget the --fstab custom option to g5k-postinstall.

4. Run the deployment

Finally, we deploy our custom environment with our custom operations:

Terminal.png frontend:
kadeploy3 -a custom-env.yml -f $OAR_NODE_FILE -p 2 -k --custom-steps custom-partitioning.yml
Note.png Note

In some cases you should increase the step's timeout (for a long formatting operation, for example); see Advanced_Kadeploy#Adjusting timeout for some environments for details.

Tuning the Kadeploy3 deployment workflow

kadeploy3 lets you fully modify the deployment workflow.

First of all you have to understand the different steps of a deployment. There are 3 macro-steps:

  1. SetDeploymentEnv: this step aims at setting up the deployment environment that contains all the required tools to perform a deployment;
  2. BroadcastEnv: this step aims at broadcasting the new environment to the nodes and writing it to disk;
  3. BootNewEnv: this step aims at rebooting the nodes on their new environment.

kadeploy3 provides several implementations for each of those 3 macro-steps. You can consult the list on the kadeploy3 page. On Grid'5000, we use the following implementations by default on all our clusters:

  • SetDeploymentEnv -> SetDeploymentEnvUntrusted : use an embedded deployment environment
  • BroadcastEnv -> BroadcastEnvKascade : use the Kascade tool to broadcast the environment
  • BootNewEnv -> BootNewEnvKexec : the nodes use kexec to reboot (if it fails, a BootNewEnvClassical, classical reboot, will be performed)

Each of these implementations is divided into micro-steps. You can see the names of those micro-steps with the kadeploy3 option --verbose-level 4. To see what is actually executed during those micro-steps, add the kadeploy3 debug option -d:

Terminal.png frontend:
kadeploy3 -f $OAR_FILE_NODES -k -e debian10-x64-base --verbose-level 4 -d > ~/kadeploy3_steps

This command stores the kadeploy3 standard output in the file ~/kadeploy3_steps. Let's analyse its content:

Terminal.png frontend:
grep "Time in" ~/kadeploy3_steps

This command will print on the terminal all the micro-steps executed during the deployment process, and the time spent for each execution. Here are the micro-steps that you should see:

  1. SetDeploymentEnvUntrusted-switch_pxe: Configures the PXE server so that the node boots on an environment that contains all the tools required to perform the deployment.
  2. SetDeploymentEnvUntrusted-reboot: Sends a reboot signal to the node.
  3. SetDeploymentEnvUntrusted-wait_reboot: Waits for the node to restart.
  4. SetDeploymentEnvUntrusted-send_key_in_deploy_env: Sends the kadeploy user's ssh public key into the node's authorized_keys to ease the following ssh connections.
  5. SetDeploymentEnvUntrusted-create_partition_table: Creates the partition table.
  6. SetDeploymentEnvUntrusted-format_deploy_part: Formats the partition where your environment will be installed. This partition is /dev/sda3 by default.
  7. SetDeploymentEnvUntrusted-mount_deploy_part: Mounts the deployment partition in a local directory.
  8. SetDeploymentEnvUntrusted-format_tmp_part: Formats the partition defined as tmp (by default, /dev/sda5).
  9. SetDeploymentEnvUntrusted-format_swap_part: Formats the swap partition.
  10. BroadcastEnvKascade-send_environment: Sends your environment to the node and untars it into the deployment partition.
  11. BroadcastEnvKascade-manage_admin_post_install: Executes the post-installation instructions defined by the site admins, in general to adapt to the specificities of the cluster: console baud rate, Infiniband, ...
  12. BroadcastEnvKascade-manage_user_post_install: Executes the user-defined post-installation instructions to automatically configure the node depending on its cluster, site, network capabilities, disk capabilities, ...
  13. BroadcastEnvKascade-send_key: Sends the user's public ssh key(s) to the node (if the user specified the -k option).
  14. BroadcastEnvKascade-install_bootloader: Properly configures the bootloader.
  15. BootNewEnvKexec-switch_pxe: Configures the PXE server so that the node boots on the partition where your environment has been installed.
  16. BootNewEnvKexec-umount_deploy_part: Unmounts the deployment partition from the directory where it was mounted during step 7.
  17. BootNewEnvKexec-mount_deploy_part: Remounts the deployment partition.
  18. BootNewEnvKexec-kexec: Performs a kexec reboot on the node.
  19. BootNewEnvKexec-set_vlan: Properly configures the node's VLAN.
  20. BootNewEnvKexec-wait_reboot: Waits for the node to be up.

That is it. You now know all the default micro-steps used to deploy your environments.

Note.png Note

It is recommended to consult the Grid5000:Node storage page to understand which partition is used at which step.

Adjusting timeout for some environments

Since kadeploy3 runs multiple macro-steps and micro-steps, it is important to detect when a step fails. This error detection is done with a timeout on each step. When a timeout is reached, the nodes that have not completed the given step are discarded from the deployment process.
The values of those timeouts vary from one cluster to another since they depend on the hardware configuration (network speed, hard disk speed, reboot speed, ...). All default timeouts are set in the configuration files on the kadeploy3 server, but you can consult the default timeout of each macro-step with the command kastat3:

Terminal.png frontend:
kastat3 -I
 Kadeploy server configuration:
 Custom PXE boot method: PXElinux
 Automata configuration:
   hercule:
     SetDeploymentEnv: SetDeploymentEnvUntrusted,1,600
     BroadcastEnv: BroadcastEnvKascade,0,1000
     BootNewEnv: BootNewEnvKexec,0,180; BootNewEnvHardReboot,0,900
   nova:
     SetDeploymentEnv: SetDeploymentEnvUntrusted,1,600
     BroadcastEnv: BroadcastEnvKascade,0,1000
     BootNewEnv: BootNewEnvKexec,0,150; BootNewEnvHardReboot,0,600
 ...
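
The timeouts of a single cluster can be extracted from this listing with awk. The following is a minimal sketch, assuming the exact layout shown above and assuming, as the --force-steps syntax suggests, that each entry reads implementation, number of retries, timeout; the function name cluster_timeouts is hypothetical.

```shell
# cluster_timeouts CLUSTER: read the server configuration listing on stdin
# and print "macro-step implementation timeout" lines for the given cluster.
# Hypothetical helper; it relies on the layout of the listing shown above.
cluster_timeouts() {
    awk -v cluster="$1" '
        # A line holding a single "name:" token opens a cluster block.
        /^[ \t]*[^ \t:]+:[ \t]*$/ {
            name = $1
            sub(/:$/, "", name)
            in_block = (name == cluster)
            next
        }
        # Inside the requested block, each macro-step line lists one or
        # more "implementation,retries,timeout" entries separated by ";".
        in_block && /,/ {
            macro = $1
            sub(/:$/, "", macro)
            line = $0
            sub(/^[^:]*:[ \t]*/, "", line)
            n = split(line, impls, /;[ \t]*/)
            for (i = 1; i <= n; i++) {
                split(impls[i], f, ",")
                print macro, f[1], f[3]
            }
        }
    '
}
```

For example, kastat3 -I | cluster_timeouts hercule prints one line per implementation, such as "BroadcastEnv BroadcastEnvKascade 1000" for the listing above.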


kadeploy3 allows users to change timeouts on the command line. In some cases, when you try to deploy an environment with a large tarball or with a post-install that takes too long, you may get discarded nodes. These false positives can be avoided by manually modifying the timeout of each step at deployment time.

For instance, in our previous example, the timeouts of each step are:

  • SetDeploymentEnvUntrusted: 143
  • BroadcastEnvKascade: 111
  • BootNewEnvKexec: 33

You can increase the timeout of the second step to 1200 seconds with the following command:

Terminal.png frontend:
kadeploy3 -e my_big_env -f $OAR_FILE_NODES -k --force-steps "SetDeploymentEnv|SetDeploymentEnvUntrusted:1:450&BroadcastEnv|BroadcastEnvKascade:1:1200&BootNewEnv|BootNewEnvClassical:1:400"
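
The --force-steps argument in the command above is a list of items of the form MacroStep|Implementation:retries:timeout joined by &. The sketch below assembles such a string piece by piece, which avoids quoting mistakes; the helper names step and force_steps are hypothetical, and the reading of the two numbers as retries and timeout is an assumption based on the kastat3 output shown earlier.

```shell
# Build one "MacroStep|Implementation:retries:timeout" item.
# Assumption: the two numeric fields are retries and timeout (in seconds).
step() {
  for n in "$3" "$4"; do
    case "$n" in
      ''|*[!0-9]*)
        echo "step: retries and timeout must be non-negative integers" >&2
        return 1 ;;
    esac
  done
  printf '%s|%s:%s:%s\n' "$1" "$2" "$3" "$4"
}

# Join all items with "&" (runs in a subshell so IFS is not changed globally).
force_steps() (
  IFS='&'
  printf '%s\n' "$*"
)

steps=$(force_steps \
  "$(step SetDeploymentEnv SetDeploymentEnvUntrusted 1 450)" \
  "$(step BroadcastEnv BroadcastEnvKascade 1 1200)" \
  "$(step BootNewEnv BootNewEnvClassical 1 400)")

# Only print the resulting command; run it yourself once it looks right.
echo "kadeploy3 -e my_big_env -f \$OAR_FILE_NODES -k --force-steps \"$steps\""
```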

Set Break-Point during deployment

As mentioned in the section above, a deployment is a succession of micro-steps that can be consulted and modified.
Moreover, kadeploy3 allows users to set a break-point during deployment.

Terminal.png frontend:
kadeploy3 -f $OAR_FILE_NODES -k -e debian10-x64-base --verbose-level 4 -d --breakpoint BroadcastEnvKascade:manage_user_post_install

This command can be used for debugging purposes. It performs a deployment with verbose level 4 and asks to stop the deployment workflow just before executing the manage_user_post_install micro-step of the BroadcastEnvKascade macro-step. You will then be able to connect to the deployment environment and manually run the user post-install script to debug it.

Warning.png Warning

In the current state of kadeploy3, it is not possible to resume the deployment from the break-point step. You will therefore have to redeploy your environment from the first step. This feature will be implemented in a future version of kadeploy3.
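
Since a deployment stopped at a break-point cannot be resumed, a mistyped break-point costs a full redeployment. The following sketch checks that the argument has the macrostep:microstep shape before the command is launched. The function name breakpoint_arg is hypothetical; only the syntax is checked, and whether the step actually exists is left to kadeploy3.

```shell
# Echo the breakpoint if it looks like "macrostep:microstep", fail otherwise.
breakpoint_arg() {
  case "$1" in
    *:*:*|:*|*:)
      echo "breakpoint_arg: expected macrostep:microstep, got '$1'" >&2
      return 1 ;;
    *:*)
      printf '%s\n' "$1" ;;
    *)
      echo "breakpoint_arg: expected macrostep:microstep, got '$1'" >&2
      return 1 ;;
  esac
}

# Print the deployment command only when the breakpoint is well formed.
bp=$(breakpoint_arg BroadcastEnvKascade:manage_user_post_install) &&
  echo "kadeploy3 -f \$OAR_FILE_NODES -k -e debian10-x64-base --verbose-level 4 -d --breakpoint $bp"
```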

Modify the deployment workflow with custom operations

In Kadeploy3, we can easily customize the deployment automaton. It is possible to add custom pre, post or substitute operations to each step. A custom operation can send a file, execute a command or run a script.

This feature is explained in Kadeploy3's documentation (available on Kadeploy3's website), in section 4.2.2, Use Case 10 and section 4.7.

You can find examples of deployment workflow tuning in the following sections:

Note.png Note

When running a custom script, Kadeploy exports several variables; you can get a list of them by running kadeploy3 -I.
A description of each of these variables is available in Kadeploy3's documentation (on Kadeploy3's website) in section 4.4.

Customizing the postinstalls

In Kadeploy3, postinstalls are scripts that are executed after the image file has been copied, in order to customize site-specific or cluster-specific aspects. Since the beginning of 2018, the same postinstall script (called g5k-postinstall) has been used on Grid'5000 for all reference environments (and is thus compatible with all supported Debian versions and distributions). That script takes parameters that define its behaviour (for example, to choose the style of network configuration to use).

The source code for g5k-postinstall is available on gitlab.inria.fr. For reference, the options accepted by kadeploy3 at the time of writing are:

Usage: kadeploy3 [options]
Contact: kadeploy3-users@lists.gforge.inria.fr

Generic options:
 -h, --help                   Show this message
 -v, --version                Get the server's version
 -I, --server-info            Get information about the server's configuration
 -M, --multi-server           Activate the multi-server mode
 -H, --[no-]debug-http        Debug HTTP communications with the server (can be redirected to the fd #4)
 -S, --server STRING          Specify the Kadeploy server to use
     --[no-]dry-run           Perform a dry run
     --password [PASSWORD]    Provide a password for HTTP Basic authentication (read from STDIN if not
                              specified)
 -d, --[no-]debug-mode        Activate the debug mode (can be redirected to the fd #3)
 -f, --file MACHINELIST       A file containing a list of nodes (- means stdin)
 -m, --machine MACHINE        The node to run on
 -n FILENAME,                 File that will contain the nodes on which the operation has not been
                              correctly performed
     --output-ko-nodes
 -o FILENAME,                 File that will contain the nodes on which the operation has been correctly
                              performed
     --output-ok-nodes
 -s, --script FILE            Execute a script at the end of the operation
 -V, --verbose-level VALUE    Verbose level (between 0 to 5)
     --write-workflow-id FILE Write the workflow id in a file
     --[no-]wait              Wait the end of the operation, set by default
     --[no-]force             Allow to deploy even on the nodes tagged as currently used (use this only
                              if you know what you do)
     --breakpoint STEP        Set a breakpoint just before lauching the given micro-step, the syntax is
                              macrostep:microstep (use this only if you know what you do)
     --custom-steps FILE      Add some custom operations defined in a file
     --[no-]hook              Launch server's hook at the end of operation, disabled by default

General options:
 -a, --env-file ENVFILE       File containing the environment description
 -b BLOCKDEVICE,              Specify the block device to use
     --block-device
 -c, --boot-partition NUMBER  Specify the number of the partition to boot on (use 0 to boot on the MBR)
 -e, --env-name ENVNAME       Name of the recorded environment
 -k, --key [FILE]             Public key to copy in the root's authorized_keys, if no argument is
                              specified, use ~/.ssh/authorized_keys
 -p NUMBER,                   Specify the partition number to use
     --partition-number
 -r, --reformat-tmp FSTYPE    Reformat the /tmp partition with the given filesystem type (this
                              filesystem need to be supported by the deployment environment)
 -u, --env-user USERNAME      Specify the user that own the recorded environment
     --vlan VLANID            Set the VLAN
 -w, --set-pxe-profile FILE   Set the PXE profile (use with caution)
     --set-pxe-pattern FILE   Specify a file containing the substitution of a pattern for each node in
                              the PXE profile (the NODE_SINGULARITY pattern must be used in the PXE
                              profile)
 -x, --upload-pxe-files FILES Upload a list of files (file1,file2,file3) to the PXE repository. Those
                              files will then be available with the prefix FILES_PREFIX--
     --env-version NUMBER     Version number of the recorded environment

Advanced options:
     --no-kexec               Disable kexec reboots during the deployment process
     --disable-bootloader-install
                              Disable the automatic installation of a bootloader for a Linux based
                              environnment
     --disable-disk-partitioning
                              Disable the disk partitioning
     --reboot-classical-timeout VALUE
                              Overload the default timeout for classical reboots (a ruby expression can
                              be used, 'n' will be replaced by the number of nodes)
     --reboot-kexec-timeout VALUE
                              Overload the default timeout for kexec reboots (a ruby expression can be
                              used, 'n' will be replaced by the number of nodes)
     --force-steps STRING     Undocumented, for administration purpose only
     --[no-]secure            Use a secure connection to export files to the server

An example environment description using g5k-postinstall is:

$ kaenv3 -p debian10-x64-min
---
name: debian10-x64-min
version: 2019100414
description: debian 10 (buster) - min
author: support-staff@list.grid5000.fr
visibility: public
destructive: false
os: linux
image:
  file: server:///grid5000/images/debian10-x64-min-2019100414.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  compression: gzip
  script: g5k-postinstall --net debian
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false

Things that you can do are:

  • Use a customized postinstall script: just point the postinstalls/archive/ field to another tar archive (see README.md and TechTeam:Postinstalls for details).
  • Use different parameters to change the behaviour of the postinstall. Example parameters for various situations are:
    • Debian min environment with traditional NIC naming: g5k-postinstall --net debian --net traditional-names
    • Debian min environment with predictable NIC naming: g5k-postinstall --net debian
    • Debian NFS environment (mount /home, setup LDAP, restrict login to user who reserved the node): g5k-postinstall --net debian --fstab nfs --restrict-user current
    • Debian big environment (NFS + setup HPC networks and mount site-specific directories): g5k-postinstall --net debian --net traditional-names --net hpc --fstab nfs --fstab site-specific
    • RHEL/CentOS style for network configuration: g5k-postinstall --net redhat --net traditional-names
    • Ubuntu 17.10 or later, with NetPlan for network configuration: g5k-postinstall --net netplan
    • Do not do any network configuration (useful for Gentoo), but force serial console settings: g5k-postinstall --net=none --inittab='s0:12345:respawn:/sbin/agetty -L SPEED TTYSX vt100'
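
To apply one of the parameter sets above, the script line of an environment description has to be changed. The sketch below does this on a saved description; it assumes the description contains a single postinstall laid out as in the example above, and the function name set_postinstall_script and the file names are hypothetical.

```shell
# Rewrite the "script:" line of an environment description and print the
# result on stdout. The original indentation of the line is preserved.
# Usage: set_postinstall_script ENV_FILE 'g5k-postinstall --net debian ...'
# Assumption: exactly one "script:" line, as in the example description.
set_postinstall_script() {
  NEW_SCRIPT="$2" awk '
    /^[ \t]*script:/ {
      indent = $0; sub(/script:.*/, "", indent)
      print indent "script: " ENVIRON["NEW_SCRIPT"]
      next
    }
    { print }
  ' "$1"
}

# Possible workflow (file names are placeholders):
#   kaenv3 -p debian10-x64-min > original.yaml
#   set_postinstall_script original.yaml \
#     'g5k-postinstall --net debian --net traditional-names' > my-env.yaml
#   kadeploy3 -a my-env.yaml -f $OAR_FILE_NODES -k
```

The new value is passed through the environment rather than with awk -v so that backslashes in the parameters are kept verbatim.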

Multisite deployment

To deploy on nodes from different sites, you just have to use the multi-server option of kadeploy3: -M. For example:

kadeploy3 -M -f file_with_all_nodes -e debian10-x64-std
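
The file given to -f must list the nodes of all sites. One way to build it, sketched below, is to merge one node list per site while dropping blank lines and duplicates. The function name merge_node_files and the per-site file names are hypothetical; how you obtain each per-site list depends on how the nodes were reserved.

```shell
# Merge several node lists into one, without blank lines or duplicates.
merge_node_files() {
  cat "$@" | grep -v '^[[:space:]]*$' | sort -u
}

# Example (placeholder file names):
#   merge_node_files nodes_site1 nodes_site2 > file_with_all_nodes
#   kadeploy3 -M -f file_with_all_nodes -e debian10-x64-std
```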