SSH and Grid'5000


SSH is a widely used network protocol for establishing a secure communication channel to a remote machine. Its most common use is to get shell access (the ability to type command-line commands) on a remote server. It can also be used to transfer files (using scp, sftp or rsync), to connect securely to network services on the remote network (using SSH tunnels or an SSH-based VPN), and even to run graphical applications remotely (using X forwarding). You will find many tutorials on the Internet explaining how it works and how to configure the software for your operating system of choice so that you can connect to machines using ssh. This tutorial is designed to teach you the basics as recommended for use in the Grid'5000 context.

SSH basics for Grid'5000

In Grid'5000, ssh is used with a key-based authentication mechanism to gain access to the platform, and ssh keys are the recommended authentication method even inside Grid'5000. Establishing a secure connection involves a few steps that are (roughly) summarized in the following diagram and explained below:

Grid5000 Access
  1. In the first step, you instruct a client program (here ssh) to connect to a given server as user login. When connecting to Grid'5000, you need to use the login given to you when you registered as a Grid'5000 user. This step triggers the next steps.
  2. In the second step, the ssh client will attempt (default behaviour) to check that the server answering the request is really the server you asked for. If it has never connected to that server, it will (by default) ask you to confirm that the host key presented by the server is the one you expect, by showing you its fingerprint. If you confirm, it will store the server name, the IP of the server and the host key in an internal database. Should the server's host key change (because traffic is hijacked, or because the server was changed without taking care to keep its key), your ssh client will warn you that something suspect is happening. You will need to investigate and either update the internal database (in general, the known_hosts file) or abort the connection attempt until you are certain nothing suspicious is happening.
  3. Once the server is authenticated, you will need to prove you are user login. This is done by repeating the following exchange (servers are by default configured to abort after 5 attempts):
    1. The client sends (one of) your public keys. Unless you have explicitly specified the key to use (option -i for openssh), the ssh client will look for reasonable keys to present to the server. In general, keys stored in ~/.ssh and named id_rsa or id_dsa are chosen first, then the others in the directory.
    2. The server answers by sending a challenge to check that you own the private part of the key.
    3. The client attempts to answer the challenge by either
      1. requesting the passphrase to decrypt the private key directly, or
      2. delegating the task of answering the challenge to an agent provided by the OS (in general ssh-agent). This agent might have loaded and deciphered the private key when you opened a session on your work machine, or might in turn request the passphrase. In general, agents keep all the information needed to answer other challenges for the key until they are restarted. They might store that information securely using OS mechanisms so that it is available even after the next reboot.
    4. If the challenge is met successfully, proving you are a legitimate owner of the key pair, the server checks whether user login is allowed access with that key pair by checking whether the public key is associated with the login account. The list of authorized keys for user login is in general stored in ~login/.ssh/authorized_keys. Direct management of this file by the user is not always possible; this depends on the security policy of the server.
  4. Finally, if the user is authenticated, the ssh session is opened. The ssh session might open a shell, run commands, give access to sftp for file management, copy files, etc.
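
The server check in step 2 can be tried offline. The following is a minimal sketch, assuming OpenSSH's ssh-keygen is installed. The host name demo.example.org, the ed25519 key type and the temporary directory are illustration choices and not part of the Grid'5000 setup. The sketch creates a stand-in server key, shows its fingerprint, then stores, looks up and removes the matching known_hosts entry.

```shell
# Offline sketch: everything happens in a throwaway directory.
demo_dir=$(mktemp -d)

# Stand-in for a server host key (a real server generates its own).
ssh-keygen -q -t ed25519 -N '' -f "$demo_dir/host_key"

# What the client shows on a first connection: the key fingerprint.
fingerprint=$(ssh-keygen -l -f "$demo_dir/host_key.pub" | awk '{print $2}')
echo "fingerprint: $fingerprint"

# What the client stores once you confirm: "hostname key-type key".
printf 'demo.example.org %s\n' "$(cut -d' ' -f1,2 "$demo_dir/host_key.pub")" \
    > "$demo_dir/known_hosts"

# Look up the stored entry, as you would when investigating a warning.
ssh-keygen -F demo.example.org -f "$demo_dir/known_hosts"

# Drop the entry once you are certain the server key legitimately changed.
ssh-keygen -R demo.example.org -f "$demo_dir/known_hosts"
```

Against your real database, the same -F and -R options work without -f, in which case ssh-keygen uses ~/.ssh/known_hosts.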

Generating keys for use with Grid'5000

Linux and Mac users

Most Linux distributions and Mac OS variants come with the ssh and ssh-keygen command line utilities. Generating a key pair is simply a matter of running ssh-keygen and supplying a passphrase. To access Grid'5000 using ssh, you must use a key pair whose private key is passphrase protected.

user-machine:
ssh-keygen -t rsa

If you follow the defaults, you now have a public key stored in the ~/.ssh/id_rsa.pub file. You can use it when requesting an account or to update your profile.
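
If you prefer not to rely on the defaults, the options can be spelled out. This is a sketch under assumptions: the file name id_rsa_g5k, the comment, the 4096-bit size and the passphrase change-me are placeholders chosen for illustration, and the key is written to a temporary directory so that your real ~/.ssh is left untouched. The last step checks that the private key cannot be read with an empty passphrase.

```shell
# Throwaway directory so your real ~/.ssh is not touched.
key_dir=$(mktemp -d)

# -t: key type, -b: key size in bits, -f: output file, -C: comment,
# -N: passphrase. 'change-me' is a placeholder: in real use, omit -N and
# type a strong passphrase at the prompt instead.
ssh-keygen -q -t rsa -b 4096 -C "g5k-demo" -N 'change-me' -f "$key_dir/id_rsa_g5k"

# The .pub file is the part you upload; the other file stays on your machine.
cat "$key_dir/id_rsa_g5k.pub"

# Check that the private key is passphrase protected:
# reading it with an empty passphrase must fail.
if ssh-keygen -y -P '' -f "$key_dir/id_rsa_g5k" >/dev/null 2>&1; then
    protected=no
else
    protected=yes
fi
echo "passphrase protected: $protected"
```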

Windows users

The basic steps for Windows users are well documented in the page Inria maintains for gforge users. You only need to remember that the key you generate must be uploaded not to InriaGforge, but to the account management form or the account request form.

Basic configuration for Grid'5000

After reading the explanations above, you might be wondering how servers can get a copy of your public key if you cannot connect to them to put it there. In Grid'5000, the answer depends on the type of server.

Access machine configuration

Access machines update their repository of trusted keys from the user database every 5 minutes. This database is where keys entered through the account management form and the account request form are stored. The following command should therefore connect you to Grid'5000, provided you have an ssh command available and it has access to the private part of the public key you uploaded through the account management form. If not, please refer to the ssh access from windows paragraph.

outside:
ssh login@access.grid5000.fr

You might be using more than one key, or have given a specific name to the key used for Grid'5000. In this case, you must be particularly careful about access rights given to the key and the directory it is stored in (see .ssh access rights). Please use:

outside:
ssh -i ~/path/to/your/g5k_private_key login@access.grid5000.fr
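
OpenSSH ignores a private key whose file is readable by other users, which is why access rights matter here. The sketch below shows the usual permissions on a demo directory. The directory stands in for your home, and the empty files named g5k_private_key and g5k_private_key.pub stand in for your key pair; both are assumptions made for illustration.

```shell
# Demo directory standing in for your home; use "$HOME" in real use.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/.ssh"
: > "$demo_home/.ssh/g5k_private_key"       # empty stand-in for the private key
: > "$demo_home/.ssh/g5k_private_key.pub"   # empty stand-in for the public key

# Only the owner may enter the directory or read the private key.
chmod 700 "$demo_home/.ssh"
chmod 600 "$demo_home/.ssh/g5k_private_key"
# The public key may be world-readable.
chmod 644 "$demo_home/.ssh/g5k_private_key.pub"

# Show the resulting access rights.
ls -ld "$demo_home/.ssh" "$demo_home/.ssh/"*
```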

You can check the list of keys configured for your account with the following command.

access:
cat .ssh/authorized_keys

Site frontend machines

Looking at the general configuration for your account on the access machine, you'll find some additional files:

access:
ls .ssh/

The id_rsa and id_rsa.pub files are a passphrase-less pair of keys that were automatically configured for you when your account was created (since May 12th, 2014; if your account was created before that date, you'll need to replicate that behaviour yourself, as documented below in #Configure_your_SSH_environment_on_all_sites). The public part of that key was added to the repository of trusted keys for all access machines, as you can see when running

access:
cat nancy/.ssh/authorized_keys

The configuration allows you to connect to site frontends using either the key generated for you (internal key), or the key used to access the access machine (external key). The recommended configuration is to passphrase-protect the external key, but not the internal key.

Reserved machines

When you have reserved machines, you'll be in one of the following 3 cases:

  • you did not request a classical_ssh or deploy job type: OAR will have generated a key pair for your job, which is used transparently if you connect with oarsh, a wrapper for ssh.
  • you requested a classical_ssh job type: the ssh configuration on nodes uses the .ssh directory of your home directory as the trusted source of authorized keys, and the ssh configuration is dynamically updated to allow you to connect.
  • you requested a deploy job type: nothing has been done to allow you to connect to nodes. When you deploy an environment, the -k option of kadeploy3 will copy the authorized_keys of your home directory on the site to the ssh configuration of the root account of the deployed nodes.

Configure your SSH environment on all sites

This step only needs to be done once. Since May 12th, 2014, this step is done automatically for freshly created accounts. Therefore, most people discovering Grid'5000 can skip this section.

Create a new SSH key with ssh-keygen

access:
ssh-keygen

Generating public/private rsa key pair.
Enter file in which to save the key (/home/login/.ssh/id_rsa): (press Enter)
Enter passphrase (empty for no passphrase): (press Enter)
Enter same passphrase again: (press Enter)
Your identification has been saved in /home/login/.ssh/id_rsa.
Your public key has been saved in /home/login/.ssh/id_rsa.pub.

In the next two steps of this tutorial, we will use the fact that homes for all sites are available from access machines to propagate your SSH key to each site. First, we will prepare your SSH configuration for one site, then we will copy it to all other sites.

Prepare your SSH configuration on one site

We will prepare the grenoble site, then duplicate its configuration everywhere.

First, copy your new SSH keys to your .ssh directory in grenoble:

access:
cp .ssh/id_rsa* grenoble/.ssh/

Now, add your new SSH public key to grenoble's authorized_keys file:

access:
cat .ssh/id_rsa.pub >> grenoble/.ssh/authorized_keys
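
Running the append twice would leave a duplicate line in authorized_keys. A possible guard, sketched here on a demo tree that stands in for your home on the access machine, is to append only when the exact line is missing. The add_key helper and the fake key string are hypothetical and only serve the illustration.

```shell
# Demo tree standing in for the access machine's home directory.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/.ssh" "$demo_home/grenoble/.ssh"
# Fake public key, for illustration only.
echo "ssh-rsa FAKE-KEY-FOR-DEMO login@access" > "$demo_home/.ssh/id_rsa.pub"

pub_key="$demo_home/.ssh/id_rsa.pub"
auth_keys="$demo_home/grenoble/.ssh/authorized_keys"

# Hypothetical helper: append the key only if that exact line is absent.
add_key() {
    touch "$auth_keys"
    grep -qxF -e "$(cat "$pub_key")" "$auth_keys" || cat "$pub_key" >> "$auth_keys"
    chmod 600 "$auth_keys"
}

add_key
add_key   # the second call changes nothing
wc -l < "$auth_keys"
```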

Push your SSH configuration to all sites

We will use a shell trick to copy your SSH configuration to all sites, and to the other access machine:

access:
for site in *; do cp grenoble/.ssh/* "$site/.ssh/"; done

An error message about grenoble is normal.
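
If you would rather not see that error, a more defensive variant of the loop skips grenoble itself, as well as any entry that has no .ssh directory. The sketch runs on a demo tree; the site names nancy and rennes and the extra public directory are examples, not a list of actual Grid'5000 sites.

```shell
# Demo tree standing in for the access machine's home directory;
# the site names below are only examples.
demo_home=$(mktemp -d)
for site in grenoble nancy rennes; do mkdir -p "$demo_home/$site/.ssh"; done
mkdir -p "$demo_home/public"       # a directory that is not a site home
echo "demo key" > "$demo_home/grenoble/.ssh/authorized_keys"

(
    cd "$demo_home" || exit 1
    for dir in */; do
        site=${dir%/}
        # Skip the source site and anything without a .ssh directory.
        [ "$site" = grenoble ] && continue
        [ -d "$site/.ssh" ] || continue
        # -p keeps the access rights of the copied files.
        cp -p grenoble/.ssh/* "$site/.ssh/"
        echo "copied to $site"
    done
)
```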

Use SSH to connect to another site

access:
ssh nancy

If the previous step of this tutorial was done correctly, you should be able to connect without a password.

The figure below shows how you just connected from your local machine to access, and then to the site frontend in nancy. Site frontends (named fsite.site.grid5000.fr or simply site.grid5000.fr) are the machines you will use to interact with Grid'5000 tools such as OAR and Kadeploy. Those machines are virtual machines, and must not be used for CPU or I/O intensive tasks (nodes must be reserved and used instead).

Grid5000 Access

More advanced SSH

Depending on your client, you will be able to configure ssh to use fancier options. If you are using Linux, Mac OS X, or another Unix-based system, the SSH page and the FAQ will give you some recipes to tune your configuration. Not all ssh clients allow for such flexible options (PuTTY in particular does not), but most will have equivalents.

  • you will be able to store the login name to use, and the key to use to access Grid'5000
  • you will be able to connect to any machine inside Grid'5000 in one shot, using ssh machine.g5k. See this page for details. This is also the way to go to transfer large volumes of data to a site.
  • you will be able to open TCP tunnels to connect from your machine to services running inside Grid'5000
  • you will be able to avoid man-in-the-middle warnings raised by host key checking (see FAQ#SSH_related_questions)
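
As an illustration of the first two points, here is a sketch of an OpenSSH client configuration. It is written to a temporary file for demonstration; in real use the content would go in ~/.ssh/config. The alias g5k, the login name and the key path are placeholders to adapt, and the ProxyCommand line is one possible way to obtain the machine.g5k shortcut, so check the SSH page for the currently recommended recipe.

```shell
# Written to a demo file here; in real use this goes in ~/.ssh/config.
config_file=$(mktemp)
cat > "$config_file" <<'EOF'
# Access machine: "ssh g5k" instead of "ssh login@access.grid5000.fr"
Host g5k
    User login
    HostName access.grid5000.fr
    IdentityFile ~/path/to/your/g5k_private_key

# Any machine inside Grid'5000 in one shot, e.g. "ssh nancy.g5k":
# the connection is relayed through the access machine.
Host *.g5k
    User login
    ProxyCommand ssh g5k -W "$(basename %h .g5k):%p"
EOF
chmod 600 "$config_file"
cat "$config_file"
```

With such a configuration in place, a TCP tunnel can be opened with the -L option of ssh, for example ssh -N -L 8080:localhost:80 nancy.g5k, where the ports and the target are hypothetical values.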