SSH and Grid'5000
SSH is a widely used network protocol for establishing a secure communication channel to a remote machine. Its most common use is to get shell access (the ability to type command-line commands) on a remote server. It can also be used to transfer files (using SCP, SFTP or rsync), to connect securely to network services on the remote network (using SSH tunnels or an SSH-based VPN), and even to run graphical applications remotely (using X forwarding). You will find many tutorials on the Internet explaining how it works and how to configure the software on your operating system of choice so that you can connect to machines using ssh. This tutorial is designed to teach you the basics as recommended for use in the Grid'5000 context.
- 1 SSH basics for Grid'5000
- 2 Generating keys for use with Grid'5000
- 3 Basic configuration for Grid'5000
- 4 Configure your SSH environment on all sites
- 5 More advanced SSH
SSH basics for Grid'5000
In Grid'5000, ssh is used with a key-based authentication mechanism to gain access to the platform, and ssh keys are the recommended authentication method even inside Grid'5000. Establishing a secure connection involves a few steps that are (badly) summarized in the following schema and commented below:
- In the first step, you instruct a client program (here ssh) to connect to a given server as a given login. When connecting to Grid'5000, you need to use the login given to you when you registered as a Grid'5000 user. This step triggers the next steps.
- In the second step, the ssh client will attempt (default behaviour) to check that the server answering the request is really the server you asked for. If it has never connected to that server, it will (by default) ask you to confirm that the host key presented by the server is the one you expect, by showing you its fingerprint. If you confirm, it will store the server name, the IP of the server and the key in an internal database. Should the server's host key change (because traffic is hijacked, or because the server was replaced without preserving its key), your ssh client will warn you that something suspicious is happening. You will need to investigate and either update the internal database (in general, the known_hosts file) or abort the connection attempt until you are certain nothing suspicious is happening.
- Once the server is authenticated, you will need to prove that you are user login. This is done by repeating the following exchange (servers are by default configured to abort after 5 attempts):
- sending (one of) your public keys. Unless you have explicitly (option -i for openssh) specified the key to use, the ssh client will look for reasonable keys to present to the server. In general, keys stored in ~/.ssh and named id_rsa or id_dsa are chosen first, then the others in the directory.
- the server will answer by sending a challenge to check you own the private part of the key
- the client will attempt to answer the challenge by either
- requesting the passphrase to decrypt the private key directly
- delegating the task of answering the challenge to an agent provided by the OS (in general ssh-agent). This agent might have loaded and deciphered the private key when you opened a session on your work machine, or might in turn request the passphrase. In general, agents keep all the information needed to answer other challenges for the key until they are restarted. They might store that information securely using OS mechanisms so it is available even after the next reboot.
- if the challenge is met successfully, proving that you are a legitimate owner of the key-pair, the server then checks whether user login is allowed access with that key-pair, that is, whether the public key is associated with the login account. The list of authorized keys for user login is in general stored in ~login/.ssh/authorized_keys. Users cannot always manage this file directly themselves; this depends on the security policy of the server.
- Finally, if the user is authenticated, the ssh session is opened. The ssh session might open a shell, run commands, give access to SFTP for file management, copy files, etc.
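The server-verification step described above can be inspected from the command line. The following is a minimal sketch, assuming OpenSSH's ssh-keygen is installed; the function names are hypothetical and access.grid5000.fr is used as an assumed example host name.

```shell
# Hypothetical helpers; they assume OpenSSH's ssh-keygen is on the PATH.

# Print "known" if host $1 has an entry in the known_hosts file $2,
# and "unknown" otherwise (ssh-keygen -F prints the matching entries).
host_status() {
    if [ -n "$(ssh-keygen -F "$1" -f "$2" 2>/dev/null)" ]; then
        echo "known"
    else
        echo "unknown"
    fi
}

# Remove the stored entry for host $1 from the known_hosts file $2.
# Only do this once you have confirmed that the server key legitimately changed.
forget_host() {
    ssh-keygen -R "$1" -f "$2"
}

# Example call (the host name is an assumption):
host_status access.grid5000.fr "$HOME/.ssh/known_hosts"
```

After forget_host, the next connection will ask you again to confirm the fingerprint, exactly as on a first connection.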
Generating keys for use with Grid'5000
Linux and Mac users
Most Linux distributions and Mac OS variants come with the ssh-keygen command-line utility. Generating a key-pair is simply a matter of running ssh-keygen and supplying a passphrase. To access Grid'5000 using ssh, you must use a key-pair whose private key is passphrase protected.
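As an illustration, key generation could look like the sketch below. It assumes OpenSSH's ssh-keygen; the wrapper name g5k_keygen, the key file name id_rsa_g5k and the choice of a 4096-bit RSA key are assumptions made for this example, not requirements stated by Grid'5000.

```shell
# Hypothetical wrapper around OpenSSH's ssh-keygen.
# $1: path of the private key to create (the public key gets a .pub suffix).
# Any remaining arguments are passed through to ssh-keygen unchanged.
g5k_keygen() {
    key_file="$1"
    shift
    ssh-keygen -t rsa -b 4096 -f "$key_file" "$@"
}

# Interactive use: ssh-keygen prompts twice for the passphrase.
# g5k_keygen "$HOME/.ssh/id_rsa_g5k"
```

The content of the resulting .pub file is what you paste into the account management form; the private key never leaves your machine.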
The basic steps for Windows users are well documented in the page Inria maintains for gforge users. You only need to remember that the key you generate does not need to be uploaded to InriaGforge, but to the account management form or the account request form.
Basic configuration for Grid'5000
After reading the explanations above, you might be wondering how servers can get a copy of your public key if you cannot connect to them to put it there. In Grid'5000, the answer depends on the type of server.
Access machine configuration
Access machines update the repository of trusted keys they use every 5 minutes from the user database. This is where keys entered through the account management form and the account request form are stored. The following command should therefore connect you to Grid'5000, provided you have access to an ssh command and that command can read the private part of the key pair whose public key you uploaded on the account management form. If not, please refer to the ssh access from windows paragraph.
You might be using more than one key, or have given a specific name to the key used for Grid'5000. In this case, you must be particularly careful about the access rights given to the key and the directory it is stored in (see .ssh access rights). Please use:
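As a rough sketch of what careful access rights means in practice, the helper below restricts the SSH directory and the private key to their owner, which is what OpenSSH expects before it agrees to use a private key. The function name secure_ssh_key, the key name id_rsa_g5k, the login and the host name access.grid5000.fr are placeholders assumed for this example.

```shell
# Hypothetical helper: restrict access to an SSH directory and a private key.
# $1: SSH directory (usually ~/.ssh), $2: private key file inside it.
secure_ssh_key() {
    chmod 700 "$1" &&   # only the owner may enter or list the directory
    chmod 600 "$2"      # only the owner may read or write the private key
}

# Typical use, followed by a connection that names the key explicitly with -i
# (login, key name and host name are placeholders):
# secure_ssh_key "$HOME/.ssh" "$HOME/.ssh/id_rsa_g5k"
# ssh -i "$HOME/.ssh/id_rsa_g5k" login@access.grid5000.fr
```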
You should be able to check the list of keys configured from your account with the following command.
Site frontend machines
Looking at the general configuration for your account on the access machine, you'll find some additional files:
The id_rsa and id_rsa.pub files are a passphrase-less key pair that was automatically configured for you when your account was created. (This has been the case since May 12th, 2014. If your account was created before that date, you'll need to replicate that behaviour yourself, as documented below in #Configure_your_SSH_environment_on_all_sites.) The public part of that key was added to the repository of trusted keys for all access machines, as you can see when running
The configuration allows you to connect to site frontends using either the key generated for you (internal key), or the key used to access the access machine (external key). The recommended configuration is to passphrase-protect the external key, but not the internal key.
When you have reserved machines, you'll be in one of the following 3 cases:
- you did not request a deploy job type: oar will have generated a keypair for your job, which will be used transparently for you if you are using oarsh, a wrapper for ssh.
- you requested a classical_ssh job type: the ssh configuration on nodes uses the .ssh directory of your home directory as the trusted source of authorized keys, and the ssh configuration is dynamically updated to allow you to connect.
- you requested a deploy job type: nothing has been done to allow you to connect to nodes. When you deploy an environment, kadeploy3 will copy the authorized_keys of your home directory on the site to the ssh configuration of the root account of the deployed nodes.
Configure your SSH environment on all sites
This step only needs to be done once. Since May 12th, 2014, this step is done automatically for freshly created accounts. Therefore, most people discovering Grid'5000 can skip this section.
Create a new SSH key with ssh-keygen
In the next two steps of this tutorial, we will use the fact that homes for all sites are available from access machines to propagate your SSH key to each site. First, we will prepare your SSH configuration for one site, then we will copy it to all other sites.
Prepare your SSH configuration on one site
We will prepare the grenoble site, then duplicate its configuration everywhere.
First, copy your new SSH keys to your .ssh directory in grenoble:
Now, add your new SSH public key to grenoble's
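The two preparation steps (copying the key pair, then trusting its public part) can be sketched as a single helper. This is an illustration under assumptions: the function name prepare_site_ssh is hypothetical, and it assumes the site home is reachable as a plain directory from the machine where you run it, with the key installed under the default names id_rsa and id_rsa.pub.

```shell
# Hypothetical helper: install a key pair into a site home and trust it.
# $1: site home directory, $2: private key (its public part is "$2.pub").
prepare_site_ssh() {
    site_ssh="$1/.ssh"
    mkdir -p "$site_ssh" && chmod 700 "$site_ssh" || return 1
    cp "$2" "$site_ssh/id_rsa" || return 1
    cp "$2.pub" "$site_ssh/id_rsa.pub" || return 1
    chmod 600 "$site_ssh/id_rsa"
    touch "$site_ssh/authorized_keys"
    # Append the public key only if that exact line is not already present.
    grep -qxF -f "$site_ssh/id_rsa.pub" "$site_ssh/authorized_keys" ||
        cat "$site_ssh/id_rsa.pub" >> "$site_ssh/authorized_keys"
    chmod 600 "$site_ssh/authorized_keys"
}

# Example, assuming the grenoble home appears as ~/grenoble on the access
# machine and the new key was created as ~/new_key:
# prepare_site_ssh "$HOME/grenoble" "$HOME/new_key"
```

Running the helper twice is harmless, because the public key is appended to authorized_keys only when it is missing.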
Push your SSH configuration to all sites
We will use a shell trick to copy your SSH configuration to all sites, and to the other access machine:
An error message about grenoble is normal.
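For illustration, a copy loop of this kind could be written as follows. This is not the original command of this tutorial: the function name push_ssh_config is hypothetical, and the sketch assumes that every site home appears as a sub-directory of one base directory on the access machine. Unlike a plain loop over all sites, it skips the source site, so it does not produce the error mentioned above.

```shell
# Hypothetical helper: copy the .ssh directory of one site home to all the
# other site homes found under a base directory.
# $1: directory containing one sub-directory per site (no trailing slash),
# $2: name of the source site.
push_ssh_config() {
    for site_home in "$1"/*/; do
        site_home="${site_home%/}"
        # Ignore an unmatched glob pattern or anything that is not a directory.
        [ -d "$site_home" ] || continue
        # Skip the source site: copying a directory onto itself is an error.
        [ "$site_home" = "$1/$2" ] && continue
        mkdir -p "$site_home/.ssh" &&
            cp -Rp "$1/$2/.ssh/." "$site_home/.ssh/" || return 1
    done
}

# Example, assuming site homes are sub-directories of your access home:
# push_ssh_config "$HOME" grenoble
```

The -p option of cp preserves the restrictive permissions set on the source files, which ssh requires on the destination as well.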
Use SSH to connect to another site
If the previous step of this tutorial was done correctly, you should be able to connect without a password.
The figure below shows how you just connected from your local machine to access, and then to the site frontend in nancy. Site frontends (named .grid5000.fr or simply .grid5000.fr) are the machines you will use to interact with Grid'5000 tools such as OAR and Kadeploy. Those machines are virtual machines, and must not be used for CPU-intensive or I/O-intensive tasks (nodes must be reserved and used instead).
More advanced SSH
Depending on your client, you will be able to configure ssh to use more advanced options. If you are using Linux, Mac OS X, or another Unix-based system, the SSH page and the FAQ will give you some recipes to tune your configuration. Not all ssh clients allow for such flexible options (PuTTY in particular does not), but most will have equivalents.
- you will be able to store the login name to use, and the key to use to access Grid'5000
- you will be able to connect to any machine inside Grid'5000 in one shot, using .g5k. See this page for details. This is also the way to go to transfer large volumes of data to a site.
- you will be able to open TCP tunnels to connect from your machine to services running inside Grid'5000
- you will be able to avoid man-in-the-middle warnings and host key checking (see FAQ#SSH_related_questions)
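The first two items of the list above can be sketched as an OpenSSH client configuration. The helper below only prints a configuration fragment; its name g5k_ssh_config, the alias g5k, the key file id_rsa_g5k and the host name access.grid5000.fr are assumptions made for this example, and the .g5k suffix is handled with a ProxyCommand that relays the connection through the access machine.

```shell
# Hypothetical helper: print an OpenSSH client configuration for Grid'5000.
# $1: your Grid'5000 login.
g5k_ssh_config() {
    cat <<EOF
Host g5k
    User $1
    Hostname access.grid5000.fr
    IdentityFile ~/.ssh/id_rsa_g5k

Host *.g5k
    User $1
    ProxyCommand ssh g5k -W "\$(basename %h .g5k):%p"
EOF
}

# Example: append the fragment to your client configuration.
# g5k_ssh_config jdoe >> "$HOME/.ssh/config"
```

With such a fragment in place, a command of the form ssh nancy.g5k would first connect to the access machine and then reach the nancy frontend in one step. The same alias works with scp or rsync for data transfers, and with the ssh -L option to open a TCP tunnel to a service inside Grid'5000.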