Advanced KaVLAN: Difference between revisions

From Grid5000
 
(98 intermediate revisions by 15 users not shown)
Line 4: Line 4:
{{Portal|Network}}
{{Portal|Network}}
{{Status|In_production}}
{{Status|In_production}}
{{See also|[[Network_isolation_on_Grid'5000| Tutorial]] | [[KaVLAN_Admin]] | [[KaVLAN| KaVLAN]] | [[Kavlan specs|Specs]] | [[Kavlan multisite|Multisite Specs]] }}
{{Pages|KaVLAN}}
{{TutorialHeader}}
__FORCETOC__
__FORCETOC__
= Overview =
= Overview =
The goal of KaVLAN is to provide network isolation for Grid'5000 users. KaVLAN allows users to manage VLANs on their Grid'5000 nodes. The benefit is complete layer 2 isolation. It can be used together with OAR and Kadeploy to run experiments on the grid.
The goal of KaVLAN is to provide network isolation for Grid'5000 users. KaVLAN allows users to manage VLANs on their Grid'5000 nodes. The benefit is complete layer 2 isolation. It can be used together with OAR and Kadeploy to run experiments on the platform.


The first step is to read the [[KaVLAN]] introduction.
The first step is to read the [[KaVLAN]] introduction to understand what kind of VLANs you can configure.


= Use Open-MX with KaVLAN =
If you want a more concrete example of what you can do with VLANs on Grid'5000, you can go through the [[Network_reconfiguration_tutorial]].


In the first part of the tutorial, we will use kadeploy and kavlan together on a single site.
= Reserve VLANs and deploy nodes inside =


[http://open-mx.gforge.inria.fr/ Open-MX] is a high-performance implementation of the Myrinet Express message-passing stack over generic Ethernet networks.
In the first part of the tutorial, we will use kadeploy and kavlan together on a single site, with a routed vlan (we could also use a local vlan).


KaVLAN lets several users use Open-MX simultaneously on a Grid'5000 site without interfering with each other. For this, we will use a routed VLAN (we could also use a local VLAN).
Once connected on a frontend, in order to obtain nodes and a VLAN you must reserve a kavlan resource with <code class="command">oarsub</code>. There are 3 kinds of resources: '''kavlan''', '''kavlan-local''', '''kavlan-global'''. Here, we will use 3 nodes and a routed VLAN, let's say in Sophia on cluster Suno:
 
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -l {"type='kavlan'"}/vlan=1,{"cluster='<code class="replace">suno</code>'"}/nodes=3 -I }}
First, you have to connect to one of the following sites: sophia, lille, lyon, nancy, rennes or toulouse (reims should be available soon).
 
To obtain nodes and a VLAN, you must reserve a kavlan resource with <code class="command">oarsub</code>. There are 3 kinds of resources: '''kavlan''', '''kavlan-local''', '''kavlan-global'''. Here, we will use 3 nodes and a routed VLAN:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -l {"type='kavlan'"}/vlan=1+/nodes=3 -I}}


A shell is now opened on the frontend (like any regular deploy job)
A shell is now opened on the frontend (like any regular deploy job)
You can get the id of your VLAN using the <code class="command">kavlan</code> command
You can get the id of your VLAN using the <code class="command">kavlan</code> command:
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V}}
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V}}


Line 31: Line 28:
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V -j <code class="replace">JOBID</code>}}
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V -j <code class="replace">JOBID</code>}}


You should get an integer in the <4-9> range for this routed VLAN (the range for local vlan is <1-3>, and there is one global VLAN per OAR server).
You should get an integer in the <4-9> range for this routed VLAN (the range for local vlan is <1-3>, and there is one global VLAN per OAR server, i.e. one per site).
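
The ranges above can be turned into a small lookup. The following is a minimal sketch, not part of the KaVLAN tooling: the function name <code>vlan_kind</code> is hypothetical, and the 10-21 range for global VLANs is an assumption taken from the range probed by the gen_dhcpd_conf.rb script further down this page.

```shell
# Hypothetical helper: classify a KaVLAN id using the ranges given above.
vlan_kind() {
  case "$1" in
    [1-3]) echo "local" ;;
    [4-9]) echo "routed" ;;
    1[0-9]|2[01]) echo "global" ;;   # assumed range 10-21
    *) echo "unknown" ;;
  esac
}

vlan_kind 4   # prints "routed"
```

On a frontend, the argument would typically be the output of <code class="command">kavlan -V</code>.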
 
For our example, let's say we got suno-2, suno-30 and suno-31, and kavlan #4.


You can get all the options of the command using --help:
You can get all the options of the command using --help:
Line 61: Line 60:
Once you have a kavlan reservation running, you can put your nodes in your VLAN (and back into the default VLAN) at any time during the lifetime of your job; we will not use this for now.
Once you have a kavlan reservation running, you can put your nodes in your VLAN (and back into the default VLAN) at any time during the lifetime of your job; we will not use this for now.


Instead we will change the vlan with kadeploy. The next step is to deploy the nodes with an Open-MX aware image.
Instead we will change the VLAN with kadeploy directly. The next step is to deploy the nodes with an environment image, for instance debian11-x64-big.




Line 73: Line 72:
== Deploy nodes and change VLAN in one step ==
== Deploy nodes and change VLAN in one step ==


{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f $OAR_NODEFILE -k -e debian11-big --vlan `kavlan -V`}}


{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f $OAR_NODEFILE -k -a http://public.sophia.grid5000.fr/~nniclausse/openmx.dsc --vlan <code class="replace">4</code> }}
Once the deployment is done, you will be able to connect on your nodes. They are now inside the VLAN, therefore they are not reachable with their default IP:


Once the deployment is done, you will be able to connect to your nodes. They are now inside the VLAN, therefore they are not reachable with their default IP.
{{Term|location=frontend|cmd=<code class="command">ping</code> <code class="replace">suno-30</code> -c1}}
You can get the list of the new hostnames of your nodes in the VLAN with <code class='command'>kavlan -l</code>:
<pre>
PING suno-30.sophia.grid5000.fr (172.16.130.30) 56(84) bytes of data.
From fsophia.sophia.grid5000.fr (172.16.143.106) icmp_seq=1 Destination Host Unreachable


Create a nodefile and copy it on the first node:
--- suno-30.sophia.grid5000.fr ping statistics ---
<pre class="brush: bash">
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
kavlan -l > nodefile
scp nodefile root@`head -1 < nodefile`:/tmp
</pre>
</pre>


The password required is here
You can get the list of new hostnames of your nodes in the VLAN with <code class='command'>kavlan -l</code>. For the next part of this tutorial, let's create a nodefile and copy it on the first node:
<pre class="brush: bash"> Password : grid5000 </pre>


== Use Open-MX ==
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -l &#124; <code class="command">tee</code> nodefile}}
 
<pre>
Once the deployment is done, you can connect to a node. The nodes are now in the VLAN, therefore they can be reached at the address:
suno-2-kavlan-4.sophia.grid5000.fr
node-XX-kavlan-<code class='replace'>vlanid</code>
suno-30-kavlan-4.sophia.grid5000.fr
 
suno-31-kavlan-4.sophia.grid5000.fr
Note on Open-MX configuration: it was compiled with a MTU of 1500 instead of the default 9000 because jumbo frames are not configured on all Grid'5000 routers and switches.
 
 
Now connect on the first node:
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@`kavlan -l &#124; head -1` }}
 
and run omx_info; you should see your 3 nodes:
<pre class="brush: bash">
suno-6-kavlan-4:~# /opt/open-mx/bin/omx_info
Open-MX version 1.3.4
build: root@suno-30-kavlan-18.sophia.grid5000.fr:/tmp/open-mx-1.3.4 Wed Mar  9 16:23:47 CET 2011
 
Found 1 boards (32 max) supporting 32 endpoints each:
suno-6-kavlan-4.sophia.grid5000.fr:0 (board #0 name eth0 addr 00:26:b9:3f:40:af)
  managed by driver 'bnx2'
  WARNING: high interrupt-coalescing
 
Peer table is ready, mapper is 00:00:00:00:00:00
================================================
  0) 00:26:b9:3f:40:af suno-6-kavlan-4.sophia.grid5000.fr:0
  1) 00:26:b9:3f:43:a1 suno-7-kavlan-4.sophia.grid5000.fr:0
  2) 00:26:b9:3f:4a:3d suno-8-kavlan-4.sophia.grid5000.fr:0
</pre>
</pre>
{{Term|location=frontend|cmd=<code class="command">scp</code> nodefile root@`head -1 < nodefile`:/tmp}}


The password for user root on Grid'5000 environments is "grid5000".


You can try to run an mpi application with mx support: netpipe
You can see that you can ping these new hostnames:
<pre class="brush: bash">
{{Term|location=frontend|cmd=<code class="command">ping</code> <code class="replace">suno-30-kavlan-4</code> -c1}}
su - mpi
<pre>
export LD_LIBRARY_PATH=/var/mpi/openmpi/lib/:/opt/open-mx/lib/
PING suno-30-eth0-kavlan-4.sophia.grid5000.fr (10.32.3.30) 56(84) bytes of data.
cd NetPIPE-3.7.1
64 bytes from suno-30-eth0-kavlan-4.sophia.grid5000.fr (10.32.3.30): icmp_seq=1 ttl=63 time=0.151 ms
mpirun -n 2 -x LD_LIBRARY_PATH  --machinefile /tmp/nodefile --mca btl self,sm,mx ./NPmpi
  0:      1 bytes  4627 times -->     0.37 Mbps in      20.36 usec
  1:      2 bytes  4910 times -->      0.75 Mbps in      20.31 usec
  2:      3 bytes  4924 times -->      1.13 Mbps in      20.33 usec
...
121: 8388605 bytes      3 times -->    911.48 Mbps in  70215.34 usec
122: 8388608 bytes     3 times -->    911.30 Mbps in  70229.01 usec
123: 8388611 bytes      3 times -->    911.20 Mbps in  70237.00 usec
</pre>


compare with the result without mx (tcp):
--- suno-30-eth0-kavlan-4.sophia.grid5000.fr ping statistics ---
<pre class="brush: bash">
1 packets transmitted, 1 received, 0% packet loss, time 0ms
mpirun -n 2 -x LD_LIBRARY_PATH  --machinefile /tmp/nodefile --mca btl self,sm,tcp ./NPmpi
rtt min/avg/max/mdev = 0.151/0.151/0.151/0.000 ms
  0:      1 bytes  2789 times -->      0.28 Mbps in      27.19 usec
  1:      2 bytes  3678 times -->      0.58 Mbps in      26.23 usec
  2:      3 bytes  3812 times -->      0.87 Mbps in      26.38 usec
...
121: 8388605 bytes      3 times -->    903.15 Mbps in  70862.85 usec
122: 8388608 bytes      3 times -->    902.94 Mbps in  70879.18 usec
123: 8388611 bytes      3 times -->    903.05 Mbps in  70870.68 usec
</pre>
</pre>
Without any tuning of the Ethernet driver, the latency is improved by 23%, and the maximum bandwidth is slightly better with Open-MX.


= Setup a DHCP server on your nodes =
= Setup a DHCP server on your nodes =
Line 154: Line 115:
Then, go back to the frontend, and download the script that will generate your DHCP configuration:
Then, go back to the frontend, and download the script that will generate your DHCP configuration:


{{Term|location=frontend|cmd=<code class="command">wget</code> http://public.sophia.grid5000.fr/~nniclausse/gen_dhcpd_conf.rb}}
Create this file (gen_dhcpd_conf.rb) on the frontend:
 
#!/usr/bin/ruby
# Author: Nicolas Niclausse
# Copyright 2010-2011: INRIA
# script specific to grid5000:
# generate dhcpd config files for kavlan
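require 'rubygems'
require 'yaml' # YAML.load_file is used below
require 'socket' # IPSocket::getaddress is used below
require 'restfully' # gem install restfully --source http://gemcutter.org
require 'ip' # gem install ruby-ip
require 'getoptlong'
require 'optparse'
require 'ostruct'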
headers = "ddns-update-style none;
option space pxelinux;
option pxelinux.magic      code 208 = string;
option pxelinux.configfile code 209 = text;
option pxelinux.pathprefix code 210 = text;
option pxelinux.reboottime code 211 = unsigned integer 32;
option vendorinfo          code 43  = string;
"
conf = File.expand_path('~/.restfully/api.grid5000.fr.yaml')
options = if File.exist?(conf) then YAML.load_file(conf) else {} end
options[:base_uri] = 'https://api.grid5000.fr/stable/grid5000'
def parseopts(args)
  options = OpenStruct.new
  options.debug = false
  options.verbose = false
  options.quiet = false
  options.nodes = []
  opts = OptionParser.new do |opts|
    opts.banner = "Usage: gen_dhcpd_conf.rb [options]"
    opts.separator ""
    opts.separator "Specific options:"
    opts.on("-s","--site SITE",  "generate only DHCP conf for site SITE") do |site|
      options.site = site
    end
    opts.on("-i","--vlan-id N", Integer , "generate only DHCP conf for vlan N") do |vlan|
      options.vlan = vlan
    end
    opts.on("-q", "--[no-]quiet", "Run quietly") do |q|
      options.quiet = q
    end
    opts.on("-v", "--[no-]verbose", "Run verbosely") do |v|
      options.verbose = v
    end
    opts.on_tail("-h", "--help", "Show this message") do
      puts opts
      exit
    end
  end
  opts.parse!(args)
  options
end
$opts = parseopts(ARGV)
Restfully::Session.new(options) do |root, session|
  options = {:query => {:version => root['version']}}
  root.sites(options).each do |site|
    mysite=site['uid']
    next if not $opts.site.nil? and mysite != $opts.site
    # optionally, read mac addresses from an external yaml file
    ref = if File.exist?(mysite+".yaml") then
            YAML.load_file(mysite+".yaml")
          else
            puts mysite +": no yaml file for macs" unless $opts.quiet
            {}
          end
    if $opts.vlan.nil? then
      vlans = (1..9).to_a
      # try to guess global vlan assigned to current site
      (10..21).each do |gvlan|
        begin
          IPSocket::getaddress("gw-kavlan-"+gvlan.to_s+"."+mysite+".grid5000.fr")
          puts "global vlan found for site %s: " % mysite unless $opts.quiet
          vlans.push(gvlan)
        rescue
          next
        end
      end
    else
      vlans = [$opts.vlan]
    end
    vlans.each do |vlan|
      filename = "dhcpd-kavlan-"+vlan.to_s+"-"+mysite+".conf"
      open(filename, 'w') do |f|
        puts "generating "+filename unless $opts.quiet
        f.puts headers
        begin
          gateway = IPSocket::getaddress("gw-kavlan-"+vlan.to_s+"."+mysite+".grid5000.fr")
        rescue
          puts "WARN: Get address error: probably no kavlan DNS setup for site " + mysite + " , skip" if $opts.verbose;
          next
        end
        # /20 for local vlans (1..3) and /18 for routed vlan (4..9)
        if vlan < 4
          ip = IP.new(gateway+"/20")
          ns = gateway
          ntp = gateway
          tftp = gateway
        else
          ip = IP.new(gateway+"/18")
          ntp = IPSocket::getaddress("ntp."+mysite+".grid5000.fr")
          ns = IPSocket::getaddress("dns."+mysite+".grid5000.fr")
          tftp = IPSocket::getaddress("kadeploy-server."+mysite+".grid5000.fr")
        end
        netmask = ip.netmask.to_addr
        broadcast = ip.broadcast.to_addr
        network = ip.network.to_addr
        f.puts "subnet %s netmask %s {" %  [network , netmask]
        f.puts "    default-lease-time 86400;
    max-lease-time 604800;"
        #f.puts "    option domain-name \"%s.grid5000.fr\"; " % mysite
        f.puts "    option domain-name-servers %s;" % ns
        f.puts "    option ntp-servers %s; " % ntp
        f.puts "    option routers %s;" % gateway
        f.puts "    option subnet-mask %s; " % netmask
        f.puts "    option broadcast-address %s;" % broadcast
        f.puts "    filename  \"pxelinux.0\";"
        f.puts "    next-server %s;" % tftp
   
        sites_for_vlan = if vlan < 10
                            [ site ]
                          else
                            root.sites(options)
                          end
        sites_for_vlan.each do |currentsite|
          currentsite.clusters(options).each do |cluster|
            cluster.nodes(options).each do |node|
              sitename=currentsite['uid']
              device = node['network_adapters'].find{|s| s['network_address'] =~ /^\w+-\d+\.\w+\.grid5000\.fr/}
              next if device.nil?
              hostname = device['network_address']
              next if hostname.nil?
              hostname_vlan = hostname.gsub(/^(\w+-\d+)(\..*)$/){$1+"-kavlan-"+vlan.to_s+$2}
              shortname_vlan = hostname_vlan.gsub(/^(\w+-\d+-\w+-\d+)(\..*)$/){$1}
              shortname = hostname.gsub(/^(\w+-\d+)(\..*)$/){$1}
              realsite = hostname.split(".")[1]
              begin
                vlan_ip = IPSocket::getaddress(hostname_vlan)
              rescue
                puts "WARN: Get address error: probably no DNS setup for vlan " +vlan.to_s+" on  site " + sitename + " , skip" if $opts.verbose;
                next
              end
              if device['mac'].nil? then
                if ref[shortname].nil? then
                  puts "WARN: mac undefined for host %s, skip" % hostname unless $opts.quiet
                  next
                else
                  mac = ref[shortname]['mac_eth0']
                end
              else
                mac = device['mac']
              end
              f.puts "  host %s {" % hostname_vlan
              f.puts "    hardware ethernet %s; " % mac
              f.puts "    option host-name \"%s\";" %  shortname_vlan
              f.puts "    option domain-name \"%s\.grid5000.fr\";" % realsite
              f.puts "    fixed-address %s;" % vlan_ip
              if vlan > 9
                # for global vlan, we need the local tftp server
                currenttftp = IPSocket::getaddress("kadeploy-server."+sitename+".grid5000.fr")
                f.puts "    next-server %s;" % currenttftp
              end
              f.puts "  }"
            end
          end
        end
        f.puts "}"
      end
    end
  end
end


(this script uses the <code>restfully</code> and <code>ruby-ip</code> gems)
(this script uses the <code>restfully</code> and <code>ruby-ip</code> gems)
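
The script relies on the ruby-ip gem to derive the netmask from the /20 (local VLANs) and /18 (routed VLANs) prefixes. The arithmetic can be checked by hand with a small shell sketch; <code>prefix_to_netmask</code> is a hypothetical name that does not appear in the script, and the code assumes a shell with 64-bit arithmetic such as bash.

```shell
# Hypothetical helper: convert a prefix length (e.g. 20) to a dotted netmask.
prefix_to_netmask() {
  prefix=$1
  # Keep the top $prefix bits of a 32-bit word set, clear the rest.
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  echo "$(( (mask >> 24) & 255 )).$(( (mask >> 16) & 255 )).$(( (mask >> 8) & 255 )).$(( mask & 255 ))"
}

prefix_to_netmask 20   # prints 255.255.240.0 (local VLANs)
prefix_to_netmask 18   # prints 255.255.192.0 (routed VLANs)
```

These values can be compared with the <code>netmask</code> line of the generated dhcpd-kavlan-*.conf file.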


Then, generate the configuration (replace <code class="replace">SITE</code> and <code class="replace">VLANID</code> with your current site and VLAN id), and copy it to the node:


Then, generate the configuration (replace <SITE> and <VLANID> with your current site and VLAN id) and copy it to the node:
{{Term|location=frontend|cmd=<code class="command">chmod</code> +x ./gen_dhcpd_conf.rb}}
<pre class="brush: bash">
{{Term|location=frontend|cmd=<code class="command">gem</code> install ruby-ip restfully --no-ri --no-rdoc --user-install}}
chmod +x ./gen_dhcpd_conf.rb
{{Term|location=frontend|cmd=<code class="command">./gen_dhcpd_conf.rb</code> --site <code class="replace">SITE</code> --vlan-id <code class="replace">VLANID</code>}}
gem install ruby-ip --no-ri --no-rdoc
{{Term|location=frontend|cmd=<code class="command">scp</code> dhcpd-kavlan-<code class="replace">VLANID</code>-<code class="replace">SITE</code>.conf root@`head -1 < nodefile`:}}
./gen_dhcpd_conf.rb --site <SITE> --vlan-id <VLANID>
scp dhcpd-kavlan-<VLANID>-<SITE>.conf root@node:/etc/dhcp3/dhcpd.conf
</pre>


As a regular user, you cannot install gems such as "ruby-ip" in the system-wide gem directory, so you need to set GEM_HOME to a directory of your own. To do so, type:
As a regular user, you cannot install gems such as "ruby-ip" in the system-wide gem directory, so you need to set GEM_HOME to a directory of your own. To do so, type:
<pre class="brush: bash">
{{Term|location=frontend|cmd=<code class="command">export</code> GEM_HOME=~/.gem/ruby/2.3.0/}}
export GEM_HOME=/home/<LOGIN>/.gem/ruby/1.8/
</pre>
where <LOGIN> is your own login.
 


You have to disable the default DHCP server of the VLAN:
You have to disable the default DHCP server of the VLAN:
On the frontend {{Term|location=frontend|cmd=<code class="command">kavlan -d</code>}}
On the frontend {{Term|location=frontend|cmd=<code class="command">kavlan -d</code>}}


Now you have to install a DHCP server on the node (we assume the node is not yet in the job VLAN, or the VLAN is routed and has access to the proxy for apt):
Now you have to install a DHCP server on the node (we assume the node is not yet in the job VLAN, or the VLAN is routed and has access to the Internet for apt):
{{Term|location=node|cmd=<code class="command">apt-get</code> install dhcp3-server}}
{{Term|location=node|cmd=<code class="command">apt-get</code> install isc-dhcp-server}}
There may be an error after the installation. This is normal: you need to tell the DHCP server on which interface to listen for DHCP requests (replace "eno1" with the name of the interface on which the server should listen):
{{Term|location=node|cmd=<code class="command">sed</code> -i s/INTERFACESv4=\"\"/INTERFACESv4=\"eno1\"/g /etc/default/isc-dhcp-server}}
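
Before editing /etc/default/isc-dhcp-server in place, the same substitution can be tried on a scratch copy. This is a sketch only: <code>set_dhcp_iface</code> is a hypothetical wrapper around the sed command above, and it assumes GNU sed (for <code>-i</code>) and an empty <code>INTERFACESv4=""</code> line in the file.

```shell
# Hypothetical wrapper: set the IPv4 listening interface in a given file.
set_dhcp_iface() {
  file=$1
  iface=$2
  sed -i "s/INTERFACESv4=\"\"/INTERFACESv4=\"$iface\"/g" "$file"
}

# Try it on a scratch file rather than on the real configuration.
scratch=$(mktemp)
printf 'INTERFACESv4=""\nINTERFACESv6=""\n' > "$scratch"
set_dhcp_iface "$scratch" eno1
cat "$scratch"
rm -f "$scratch"
```

Only the INTERFACESv4 line is modified; the INTERFACESv6 line is left untouched.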


On the node chosen as the DHCP server, start the server:
You can now copy the generated configuration file and start the DHCP server:
{{Term|location=node|cmd=<code class='command'>/etc/init.d/dhcp3-server</code> start}}
{{Term|location=node|cmd=<code class="command">cp</code> /root/dhcpd-kavlan-<code class="replace">VLANID</code>-<code class="replace">SITE</code>.conf /etc/dhcp/dhcpd.conf}}
{{Term|location=node|cmd=/etc/init.d/isc-dhcp-server restart}}


Then, in another shell, connect as root on a second node (or use kaconsole):
Then, in another shell, connect as root on a second node (or use kaconsole):
Line 187: Line 324:


And restart the network configuration:
And restart the network configuration:
<pre class="brush: bash">
 
suno-7-kavlan-4:~# /etc/init.d/networking restart
{{Term|location=node-dhcp-client|cmd=<code class='command'>systemctl restart</code> networking}}
...
 
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6
{{Term|location=node-dhcp-client|cmd=<code class='command'>systemctl status</code> networking}}
[ 5185.656817] bnx2: eth0 NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
...
[ 5185.670596] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPREQUEST of 10.32.3.7 on eno1 to 255.255.255.255 port 67
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPDISCOVER on eno1 to 255.255.255.255 port 67 interval 10
DHCPOFFER from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPREQUEST of 10.32.3.7 on eno1 to 255.255.255.255 port 67
DHCPREQUEST on eth0 to 255.255.255.255 port 67
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPOFFER of 10.32.3.7 from 10.32.3.6
DHCPACK from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPACK of 10.32.3.7 from 10.32.3.6
Stopping NTP server: ntpd.
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPOFFER of 10.32.3.7 from 10.32.3.6
Starting NTP server: ntpd.
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPACK of 10.32.3.7 from 10.32.3.6
bound to 10.32.3.7 -- renewal in 37174 seconds.
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: bound to 10.32.3.7 -- renewal in 34620 seconds.
</pre>
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: Sending network state change signal to nslcd...done.
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr systemd[1]: Started Raise network interfaces.


On the DHCP server, check the logs:
On the DHCP server, check the logs:


<pre class="brush: bash">
{{Term|location=node-dhcp-server|cmd=<code class='command'>tail</code> /var/log/daemon.log}}
azur-25-kavlan-7:~# tail /var/log/daemon.log
...
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPDISCOVER from 00:26:b9:3f:43:a1 via eth0
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPDISCOVER from 00:26:b9:3f:43:a1 via eno1
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPOFFER on 10.32.3.7 to 00:26:b9:3f:43:a1 via eth0
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPOFFER on 10.32.3.7 to 00:26:b9:3f:43:a1 via eno1
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPREQUEST for 10.32.3.7 (10.32.3.6) from 00:26:b9:3f:43:a1 via eth0
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPREQUEST for 10.32.3.7 (10.32.3.6) from 00:26:b9:3f:43:a1 via eno1
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPACK on 10.32.3.7 to 00:26:b9:3f:43:a1 via eth0
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPACK on 10.32.3.7 to 00:26:b9:3f:43:a1 via eno1
</pre>


In the last four lines, you can see that your own DHCP server has given an address to the other node.
In the last four lines, you can see that your own DHCP server has given an address to the other node.
Line 218: Line 355:
For your information, if you need to do a PXE boot, you must change the tftp server in the generated dhcpd configuration file:
For your information, if you need to do a PXE boot, you must change the tftp server in the generated dhcpd configuration file:
{{Term|location=node|cmd=IP=`hostname -i`}}
{{Term|location=node|cmd=IP=`hostname -i`}}
{{Term|location=node|cmd=<code class='command'>perl</code> -i -pe "s/next-server .*/next-server $IP;/" /etc/dhcp3/dhcpd.conf}}
{{Term|location=node|cmd=<code class='command'>perl</code> -i -pe "s/next-server .*/next-server $IP;/" /etc/dhcp/dhcpd.conf}}
(if there is no <code>next-server</code> configured, you must edit the file by hand and add a line like this:
(if there is no <code>next-server</code> configured, you must edit the file by hand and add a line like this:
  next-server XX.XX.XX.XX ;
  next-server XX.XX.XX.XX ;
Line 235: Line 372:
First, we will use taktuk to install <code class='command'>at</code> on all nodes, then the taktuk command will simply launch the network reconfiguration in one minute. Finally, we set the VLAN of all our nodes.
First, we will use taktuk to install <code class='command'>at</code> on all nodes, then the taktuk command will simply launch the network reconfiguration in one minute. Finally, we set the VLAN of all our nodes.


<pre class="brush: bash">
As we will change the network configuration of the nodes, we will use an isolated kavlan (a.k.a. [[KaVLAN#1:_Isolated_VLAN|kavlan-local]]) so as not to interfere with the rest of the Grid'5000 network.
$ uniq $OAR_NODEFILE > ./mynodes
 
$ taktuk -s -l root -f ./mynodes broadcast exec [ "apt-get update; apt-get --yes install at" ]
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -l {"type='kavlan-local'"}/vlan=1,walltime=2 -I}}
$ taktuk -s -l root -f ./mynodes broadcast exec [ "echo '/etc/init.d/networking restart'| at now + 1 minute " ]
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V &#124; <code class="command">tee</code> myvlan}}
$ kavlan -s
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -l nodes=2 -I}}
Take node list from OAR nodefile: /var/lib/oar/387465
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e debian11-x64-base -k -f $OAR_FILE_NODES}}
... node azur-25.sophia.grid5000.fr changed to vlan KAVLAN-7
{{Term|location=frontend|cmd=<code class="command">taktuk</code> -s -l root -f $OAR_FILE_NODES broadcast exec [ "apt-get update; apt-get --yes install at" ]}}
... node azur-28.sophia.grid5000.fr changed to vlan KAVLAN-7
{{Term|location=frontend|cmd=<code class="command">taktuk</code> -s -l root -f $OAR_FILE_NODES broadcast exec [ "echo '/etc/init.d/networking restart'&#124; at now + 1 minute " ]}}
... node azur-30.sophia.grid5000.fr changed to vlan KAVLAN-7
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -i `cat myvlan` -s -f $OAR_FILE_NODES}}
all nodes are configured in the vlan 7
</pre>


In one minute, your nodes will renegotiate their IP addresses and will be available inside the VLAN. To get the name of your nodes in the VLAN, use the ''-l'' option:
All nodes are configured in the vlan 2.
<pre class="brush: bash">
In one minute, your nodes will renegotiate their IP addresses and will be available inside the VLAN. You can connect to each of them using kaconsole or ssh (as we use a kavlan-local, you must connect to the gateway of that kavlan first):
$kavlan -l
azur-25-kavlan-7.sophia.grid5000.fr
azur-28-kavlan-7.sophia.grid5000.fr
azur-30-kavlan-7.sophia.grid5000.fr
</pre>


You can connect to each of them using kaconsole or ssh (first, you must connect to the gateway of the vlan):
{{Term|location=frontend|cmd=<code class="command">ssh</code> kavlan-`cat myvlan`}}
<pre class="brush: bash">
{{Term|location=kavlan-VLANID|cmd=<code class="command">ssh</code> root@<code class="replace">suno-30</code>-kavlan-`cat myvlan`}}
$VLANID=`kavlan -V`
$ssh kavlan-$VLANID
kavlan-7@sophia$ ssh root@azur-25-kavlan-7
</pre>


You can use the <code class='command'>ip neigh</code> command to see the known hosts in your LAN; you should only see IPs in the 192.168.66.0/24 subnet
You can use the <code class='command'>ip neigh</code> command to see the known hosts in your LAN; you should only see IPs in the 192.168.66.0/24 subnet
<pre class="brush: bash">
 
azur-25-kavlan-7:~$ip neigh
{{Term|location=node|cmd=<code class="command">ip</code> neigh}}
192.168.66.250 dev eth0 INCOMPLETE
 
192.168.66.254 dev eth0 lladdr 02:00:00:00:01:02 REACHABLE
  192.168.223.254 dev eno1 lladdr 02:00:00:00:01:02 REACHABLE
</pre>


You should be able to ping another of your hosts inside your VLAN:
You should be able to ping another of your hosts inside your VLAN:
<pre class="brush: bash">
{{Term|location=node|cmd=<code class="command">ping</code> -c 3 suno-42-kavlan-2}}
azur-25-kavlan-7:~# ping -c 3 azur-30-kavlan-7
64 bytes from 192.168.211.42: icmp_req=1 ttl=64 time=0.141 ms
PING azur-30-kavlan-7.sophia.grid5000.fr (192.168.66.30) 56(84) bytes of data.
64 bytes from 192.168.211.42: icmp_req=2 ttl=64 time=0.166 ms
64 bytes from azur-30.local (192.168.66.30): icmp_seq=1 ttl=64 time=0.154 ms
64 bytes from 192.168.211.42: icmp_req=3 ttl=64 time=0.165 ms
64 bytes from azur-30.local (192.168.66.30): icmp_seq=2 ttl=64 time=0.170 ms
64 bytes from azur-30.local (192.168.66.30): icmp_seq=3 ttl=64 time=0.163 ms
--- suno-42-kavlan-2.sophia.grid5000.fr ping statistics ---
 
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
--- azur-30-kavlan-7.sophia.grid5000.fr ping statistics ---
rtt min/avg/max/mdev = 0.141/0.157/0.166/0.015 ms
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.154/0.162/0.170/0.012 ms
</pre>


== Put your nodes back into the default VLAN ==
== Put your nodes back into the default VLAN ==


First, put the list of your nodes' VLAN hostnames in a file:
First, put the list of your nodes' VLAN hostnames in a file:
{{Term|location=frontend|cmd=<code class='command'>kavlan</code> -l > mynodes-vlan}}
{{Term|location=frontend|cmd=<code class='command'>uniq </code> $OAR_NODEFILE > mynodes}}
{{Term|location=frontend|cmd=<code class='command'>sed</code> "s/\([^.]*\)\(.*\)/\1-kavlan-`cat myvlan`\2/" mynodes > mynodes-vlan}}
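
The sed expression above inserts <code>-kavlan-VLANID</code> before the first dot of each hostname. As an illustration, the same substitution can be wrapped in a function and checked on a single name; <code>to_vlan_name</code> is a hypothetical helper, not a KaVLAN command.

```shell
# Hypothetical helper: derive the in-VLAN hostname from a default hostname.
to_vlan_name() {
  host=$1
  vlan=$2
  # \1 is the short node name, \2 is the domain part starting at the first dot.
  printf '%s\n' "$host" | sed "s/\([^.]*\)\(.*\)/\1-kavlan-$vlan\2/"
}

to_vlan_name suno-30.sophia.grid5000.fr 4   # prints suno-30-kavlan-4.sophia.grid5000.fr
```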


Don't forget to first start the network restarting command with taktuk:
Don't forget to schedule the network restart command first; this time, we need to run ssh from the kavlan gateway, not from the frontend:
{{Term|location=frontend|cmd=<code class='command'>taktuk</code> -s -l root -f ./mynodes-vlan broadcast exec [ "echo '/etc/init.d/networking restart' &#124;  at now + 1 minute " ]}}
{{Term|location=frontend|cmd=<code class='command'>ssh</code> kavlan-<code class="replace">VLANID</code>}}
 
{{Term|location=kavlan-VLANID|cmd=for NODE in $(cat mynodes-vlan); do <code class='command'>ssh</code> root@$NODE "echo '/etc/init.d/networking restart' &#124;  at now + 1 minute "; done;}}


Then you can put your nodes back in the default VLAN:
Then you can put your nodes back in the default VLAN:
Line 294: Line 419:


You should be able to ping your nodes:
You should be able to ping your nodes:
<pre class="brush: bash">
{{Term|location=frontend|cmd=<code class="command">for</code> i in `uniq $OAR_NODEFILE`; do ping -c 1 $i; done}}
for i in `uniq $OAR_NODEFILE`; do ping -c 1 $i; done
 
PING azur-25.sophia.grid5000.fr (138.96.20.25) 56(84) bytes of data.
 
64 bytes from azur-25.sophia.grid5000.fr (138.96.20.25): icmp_seq=1 ttl=64 time=1002 ms
Another way to put nodes back into the default VLAN is to change the VLAN and then kareboot the nodes.
 
{{Term|location=frontend|cmd=<code class='command'>kavlan</code> -s -i DEFAULT -f $OAR_NODEFILE}}
{{Term|location=frontend|cmd=<code class='command'>kareboot3</code> -f $OAR_NODEFILE -r simple}}
 
= KaVLAN VPN =
[[Image:G5K_kavlanvpn.png|400px|thumb]]


--- azur-25.sophia.grid5000.fr ping statistics ---
This feature allows users to build a Virtual Private Network (VPN) between a [[KaVLAN]] network and the outside world. Hence, it is possible to interconnect Grid'5000 nodes with any external network (from the user's laptop to the Internet), bypassing Grid'5000 network isolation.
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1002.910/1002.910/1002.910/0.000 ms
PING azur-28.sophia.grid5000.fr (138.96.20.28) 56(84) bytes of data.
64 bytes from azur-28.sophia.grid5000.fr (138.96.20.28): icmp_seq=1 ttl=64 time=1.23 ms


--- azur-28.sophia.grid5000.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.234/1.234/1.234/0.000 ms
PING azur-30.sophia.grid5000.fr (138.96.20.30) 56(84) bytes of data.
64 bytes from azur-30.sophia.grid5000.fr (138.96.20.30): icmp_seq=1 ttl=64 time=1.25 ms


--- azur-30.sophia.grid5000.fr ping statistics ---
{{Warning|text=This is an advanced feature. It requires a good understanding of [[KaVLAN]], VPNs and networking in Linux}}
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.259/1.259/1.259/0.000 ms
</pre>


Another way to put back nodes into the default VLAN is to change the vlan and then kareboot the nodes.


{{Term|location=frontend|cmd=<code class='command'>kavlan</code> -s -i DEFAULT -f $OAR_NODEFILE}}
Some information:
{{Term|location=frontend|cmd=<code class='command'>kareboot3</code> -f $OAR_NODEFILE -r simple_reboot}}
* This service currently uses SSH VPN at Layer-2 (Ethernet level). Since that VPN is built on top of TCP protocol, '''you should not expect high network performance'''.
* The VPN requires two end points (or gateway) to be interconnected. On Grid'5000 side, VPN gateways are installed on kavlan-{1,2,3}.<site>.grid5000.fr servers.
* On the user's side (outside of Grid'5000), a GNU/Linux system with root privileges is required, to act as the user's gateway.
* On Grid'5000, the user must reserve a '''non-routed local''' kavlan network (the VPN only works with this kind of kavlan). Grid'5000 nodes must be switched into that kavlan to be accessible through the VPN.




=Other usage=
The VPN is initiated from the user's gateway machine using a SSH connection to the appropriate kavlan-X server (which depends on kavlan network previously reserved). To enable VPN, SSH "-w" options must be used to connect to remote tap0 on kavlan-X server, with VPN tunnel configured in Ethernet mode. See ssh and ssh_config manpages for more information about those options.


==Using the API==


KaVLAN is also available through the API. Using the job and deploy API, you can, as with the command line tools, reserve nodes with a VLAN and deploy nodes into a VLAN. If you want to manipulate VLANs directly through the API, you can do several things:
Example, with KaVLAN network "1" at lyon :


You can get the vlans you have reserved:
* As root, create a virtual tap device that will be connected to your kavlan using SSH VPN. (Replace $USERNAME by your user name)
  <code class="host">laptop: </code>sudo ip tuntap add dev tap<code class="replace">0</code> mode tap user <code class="replace">$USERNAME</code>
* Assign an IP address to this interface.
  <code class="host">laptop: </code>sudo ifconfig tap<code class="replace">0</code> 192.168.207.253/20


GET https://api.grid5000.fr/sid/grid5000/sites/:site_uid/vlans/users/:user_uid
{{Warning|text=The IP address you choose must be inside the kavlan network, which depends on the kavlan number you are using. See [[Grid5000:Network#KaVLAN_networks]]}}


You can get all the vlan available on a site:
* Start the SSH VPN
GET https://api.grid5000.fr/sid/grid5000/sites/:site_uid/vlans/
  <code class="host">laptop: </code>ssh -o Tunnel=ethernet -w <code class="replace">0</code>:0 -N kavlan-1.lyon.g5k
If the command runs correctly, it should not output anything.
* Options description:
** ''-o Tunnel=ethernet'': Use an ethernet (layer 2) VPN
** ''-w <code class="replace">0</code>:0'': Use interface tap<code class="replace">0</code> on client side (first <code class="replace">0</code>) and tap0 on server side (second 0, mandatory)
** ''-N''                  : Do not execute a remote command
** ''kavlan-1.lyon.g5k'': Connect to lyon kavlan-1 gateway. Trailing .g5k assumes that you appropriately configured your ssh_config to connect to Grid5000 nodes using .g5k extension
* Client's tap<code class="replace">0</code> interface is now connected to the kavlan network. You should be able to ping other nodes inside this network.
  <code class="host">laptop: </code>ping 192.168.192.83
  PING 192.168.192.83 (192.168.192.83) 56(84) bytes of data.
  64 bytes from 192.168.192.83: icmp_req=1 ttl=64 time=82.7 ms
  64 bytes from 192.168.192.83: icmp_req=2 ttl=64 time=39.9 ms
  ...


You can get the VLAN of nodes on a site:
{{Warning|text=DNS hostname resolution cannot be used here, as DNS servers are inside Grid'5000 network and this command is executed from your local workstation}}
GET https://api.grid5000.fr/sid/grid5000/sites/:site_uid/vlans/nodes


You can print the VLAN of a list of nodes
=Other usage=
POST https://api.grid5000.fr/sid/grid5000/sites/:site_uid/vlans/nodes {\"nodes\": [ <list of node names>]}


You can change the VLAN of a list of nodes:
==Using the API==
POST https://api.grid5000.fr/sid/grid5000/sites/:site_uid/vlans/:vlan_uid/ {\"nodes\": [ <list of node names>]}


You can start the dhcp server for the vlan
KaVLAN is also available through the API. Using the job and deploy API, you can, as with the command line tools, reserve nodes with a VLAN and deploy nodes into a VLAN.
PUT https://api.grid5000.fr/sid/grid5000/sites/:site_uid/vlans/:vlan_uid/dhcpd


You can stop the dhcp server for the vlan
See [[API tutorial#Vlans_API|Vlans API tutorial]] and [https://api.grid5000.fr/doc/stable/#tag/vlan Vlans API specification]
DELETE https://api.grid5000.fr/sid/grid5000/sites/:site_uid/vlans/:vlan_uid/dhcpd


== Use a global VLAN  ==
== Use a global VLAN  ==
Line 360: Line 490:
Get the oargrid Id and Job key from the output of oargridsub:
Get the oargrid Id and Job key from the output of oargridsub:
{{Term|location=frontend|cmd=<code class="command">export</code> OAR_JOB_KEY_FILE=`grep "SSH KEY" oargrid.out &#124; cut -f2 -d: &#124; tr -d " "`}}
{{Term|location=frontend|cmd=<code class="command">export</code> OAR_JOB_KEY_FILE=`grep "SSH KEY" oargrid.out &#124; cut -f2 -d: &#124; tr -d " "`}}
{{Term|location=frontend|cmd=<code class="command">export</code> OARGRID_JOB_ID=`grep "Grid reservation id" oargrid.out &#124; cut -f2 -d=`}}
{{Term|location=frontend|cmd=<code class="command">export</code> OARGRID_JOB_ID=`grep "Grid reservation id" oargrid.out &#124; cut -f2 -d= &#124; cut -d ' ' -f2`}}
Get the node list using oargridstat:
Get the node list using oargridstat:
{{Term|location=frontend|cmd=<code class="command">oargridstat</code> -w -l $OARGRID_JOB_ID  &#124; grep grid> ~/gridnodes}}
{{Term|location=frontend|cmd=<code class="command">oargridstat</code> -w -l $OARGRID_JOB_ID  &#124; grep grid> ~/gridnodes}}


Then use kadeploy3 to deploy your image on all sites and change the VLAN:
Then use kadeploy3 to deploy your image on all sites and change the VLAN:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f gridnodes -a http://public.sophia.grid5000.fr/~nniclausse/openmx.dsc -k --multi-server -o ~/nodes.deployed --vlan <code class="replace">13</code>}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f gridnodes -a http://public.sophia.grid5000.fr/~nniclausse/openmx.dsc -k --multi-server -o ~/nodes.deployed --vlan <code class="replace">18</code>}}


If you want to change the VLAN of a node directly, you have to run the kavlan command on the site where the node is located. For example, if you have reserved the global VLAN located at sophia and want to put some nodes of lille into this VLAN, you have to run kavlan on the lille site (or use the API with the lille site in the URL).
If you want to change the VLAN of a node directly, you have to run the kavlan command on the site where the node is located. For example, if you have reserved the global VLAN located at sophia and want to put some nodes of lille into this VLAN, you have to run ''kavlan -m nodename -i VLAN_GLOBAL_ID -s'' on the lille site (or use the API with the lille site in the URL).
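
Since kavlan has to be run on the site that hosts each node, it can help to derive that site from the node name. The following is only a sketch: <code>node_site</code> is a hypothetical helper (not part of kavlan), and it assumes node names follow the usual <code>cluster-N.site.grid5000.fr</code> pattern.

```shell
# node_site FQDN (hypothetical helper, not part of the kavlan tools)
# Prints the site part of a Grid'5000 node name, i.e. the second
# dot-separated field of the fully qualified name.
node_site() {
    printf '%s\n' "$1" | cut -d. -f2
}

# Example with an illustrative node name:
node_site chetemi-3.lille.grid5000.fr   # prints: lille
```

The printed site is the one whose frontend (or API URL) should be used for the <code>kavlan -m nodename -i VLAN_GLOBAL_ID -s</code> call.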


== How to use a local VLAN ==
== How to use a local VLAN ==
Line 375: Line 505:
If you want to use a local VLAN, you first have to connect to the gateway of the VLAN. Once you have a running reservation on a local VLAN, you have ssh access to the gateway:
If you want to use a local VLAN, you first have to connect to the gateway of the VLAN. Once you have a running reservation on a local VLAN, you have ssh access to the gateway:


ssh kavlan-<vlanid>
{{Term|location=frontend|cmd=<code class="command">ssh</code> kavlan-<vlanid>}}


Then you can reach your nodes inside the VLAN. Another option is to use the kaconsole command.
Then you can reach your nodes inside the VLAN. Another option is to use the kaconsole command.
Line 399: Line 529:
Then you can simply use ssh <cluster>-<nodeid>-kavlan-<vlanid> to access the node , for example:
Then you can simply use ssh <cluster>-<nodeid>-kavlan-<vlanid> to access the node , for example:
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class='replace'>NODE</code>-kavlan-<code class='replace'>VLANID</code>}}
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class='replace'>NODE</code>-kavlan-<code class='replace'>VLANID</code>}}
== A simple multi NICs example ==
We show here how to reserve and configure multiple Ethernet network interfaces.
First we reserve a deploy job, with 2 nodes and 2 vlans:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -I -t deploy -l {"eth_count > 1 and cluster = '<code class="replace">cluster_name</code>'"}/nodes=2,{"type='kavlan'"}/vlan=2,walltime=02:00:00}}
Then we deploy the wanted environment:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f $OAR_NODEFILE -k -e debian11-x64-nfs}}
See the cluster section to know which Ethernet interfaces can be used. For example, on paranoia (Rennes), eth1/enp3s0f1 and eth2/eno1 are cabled. Use <code class="replace">I</code>=1 and <code class="replace">J</code>=2 for paranoia.
Get node name with interfaces:
{{Term|location=frontend|cmd=<code class="command">uniq</code> $OAR_FILE_NODES &#124; sed -e 's/\([^\.]*\)\(.*\)/\1-eth<code class="replace">I</code>\2/' > nodes_second_int}}
{{Term|location=frontend|cmd=<code class="command">uniq</code> $OAR_FILE_NODES &#124; sed -e 's/\([^\.]*\)\(.*\)/\1-eth<code class="replace">J</code>\2/' > nodes_third_int}}
Show vlans number:
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V }}
Put interfaces on the two different vlan:
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -i <code class="replace">vlan1</code> -s -f nodes_second_int}}
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -i <code class="replace">vlan2</code> -s -f nodes_third_int}}
Get ip on second and third interface :
{{Term|location=frontend|cmd=<code class="command">uniq</code> $OAR_NODEFILE &#124; taktuk -d -1 -l root -f - broadcast exec [ 'dhclient <code class="replace">enp3s0f1</code>' ]}}
{{Term|location=frontend|cmd=<code class="command">uniq</code> $OAR_NODEFILE &#124; taktuk -d -1 -l root -f - broadcast exec [ 'dhclient <code class="replace">eno1</code>' ]}}
At this point, each node should have 3 IP addresses:
{{Term|location=node|cmd=<code class="command">ip</code> a}}
root@paranoia-8:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
        valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
        valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether f0:4d:a2:73:ce:3d brd ff:ff:ff:ff:ff:ff
    inet <code class="replace">10.24.70.8/18</code> brd 10.24.127.255 scope global eno1
        valid_lft forever preferred_lft forever
    inet6 fe80::f24d:a2ff:fe73:ce3d/64 scope link
        valid_lft forever preferred_lft forever
3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether f0:4d:a2:73:ce:3e brd ff:ff:ff:ff:ff:ff
4: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a0:36:9f:28:a9:18 brd ff:ff:ff:ff:ff:ff
    inet <code class="replace">172.16.100.8/20</code> brd 172.16.111.255 scope global enp3s0f0
        valid_lft forever preferred_lft forever
    inet6 fe80::a236:9fff:fe28:a918/64 scope link
        valid_lft forever preferred_lft forever
5: enp3s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a0:36:9f:28:a9:1a brd ff:ff:ff:ff:ff:ff
    inet <code class="replace">10.24.7.8/18</code> brd 10.24.63.255 scope global enp3s0f1
        valid_lft forever preferred_lft forever
    inet6 fe80::a236:9fff:fe28:a91a/64 scope link
        valid_lft forever preferred_lft forever
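
To check quickly that each interface received an address, the output of <code>ip a</code> can be reduced to one line per IPv4 address. This is a minimal sketch: <code>list_ipv4</code> is a hypothetical helper, and it assumes the output layout shown above (an "N: name:" header line followed by indented "inet" lines).

```shell
# list_ipv4 (hypothetical helper)
# Reads `ip a` style output on stdin and prints "interface address/prefix"
# for every IPv4 address found. IPv6 ("inet6") lines are ignored.
list_ipv4() {
    awk '
        /^[0-9]+: /  { iface = $2; sub(/:$/, "", iface) }  # "2: eno1: <...>" gives eno1
        $1 == "inet" { print iface, $2 }                   # "inet 10.24.70.8/18 ..."
    '
}
```

On a node, <code>ip a | list_ipv4</code> should then list the loopback address plus one line for each of the three configured interfaces.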

Latest revision as of 13:59, 21 October 2022

Note.png Note

This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team.

Overview

The goal of KaVLAN is to provide network isolation for Grid'5000 users. KaVLAN allows users to manage VLANs on their Grid'5000 nodes. The benefit is complete level 2 isolation. It can be used together with OAR and Kadeploy to run experiments on the platform.

The first step is to read the KaVLAN introduction to understand what kind of VLANs you can configure.

If you want a more concrete example of what you can do with VLANs on Grid'5000, you can go through the Network_reconfiguration_tutorial.

Reserve VLANs and deploy nodes inside

In the first part of the tutorial, we will use kadeploy and kavlan together on a single site, with a routed vlan (we could also use a local vlan).

Once connected on a frontend, in order to obtain nodes and a VLAN you must reserve a kavlan resource with oarsub. There are 3 kinds of resources: kavlan, kavlan-local, kavlan-global. Here, we will use 3 nodes and a routed VLAN, let's say in Sophia on cluster Suno:

Terminal.png frontend:
oarsub -t deploy -l {"type='kavlan'"}/vlan=1,{"cluster='suno'"}/nodes=3 -I

A shell is now opened on the frontend (like any regular deploy job). You can get the id of your VLAN using the kavlan command:

Terminal.png frontend:
kavlan -V

If you run this command outside the shell started by OAR for your reservation, you must add the oar JOBID.

Terminal.png frontend:
kavlan -V -j JOBID

You should get an integer in the <4-9> range for this routed VLAN (the range for local vlan is <1-3>, and there is one global VLAN per OAR server, i.e. one per site).
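
The ranges above can be turned into a small check on the value returned by kavlan -V. This is only a sketch: kavlan_kind is a hypothetical helper, and it assumes that every id of 10 or more is a global VLAN.

```shell
# kavlan_kind ID (hypothetical helper, not part of the kavlan tool)
# Prints the kind of VLAN for a numeric id, using the ranges given above:
# 1-3 local, 4-9 routed; ids of 10 and more are assumed to be global.
kavlan_kind() {
    case "$1" in
        [1-3]) echo local ;;
        [4-9]) echo routed ;;
        ''|*[!0-9]*|0*) echo unknown; return 1 ;;   # empty, non-numeric or 0
        *) echo global ;;
    esac
}
```

Inside the job shell, kavlan_kind "$(kavlan -V)" should print "routed" for the reservation made above.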

For our example, let's say we got suno-2, suno-30 and suno-31, and kavlan #4.

You can get all the options of the command using --help:

# kavlan --help
Usage: kavlan [options]
Specific options:
    -i, --vlan-id N                  set VLAN ID (integer or DEFAULT)
    -C, --ca-cert CA                 CA certificate
    -c, --client-cert CERT           client certificate
    -k, --client-key KEY             client key
    -l, --get-nodelist               Show nodenames in the given vlan
    -e, --enable-dhcp                Start DHCP server
    -d, --disable-dhcp               Stop DHCP server
    -V, --show-vlan-id               Show vlan id of job (needs -j JOBID)
    -g, --get-vlan                   Show vlan of nodes
    -s, --set-vlan                   Set vlan of nodes
    -j, --oar-jobid JOBID            OAR job id
    -m, --machine NODE               set nodename (several -m are OK)
    -f, --filename NODEFILE          read nodes from a file
    -u, --user USERNAME              username
    -v, --[no-]verbose               Run verbosely
    -q, --[no-]quiet                 Run quietly
        --[no-]debug                 Run with debug output
    -h, --help                       Show this message
        --version                    Show version

Once you have a kavlan reservation running, you can put your nodes in your VLAN (and back into the default VLAN) at anytime during the lifetime of your job; we will not use this for now.

Instead we will change the VLAN with kadeploy directly. The next step is to deploy the nodes with an environment image, for instance debian11-x64-big.


Enable the dhcp server of the VLAN

Before deploying, if you don't install your own DHCP server, you should start the default DHCP server of the VLAN. Do this with the kavlan command (add -j JOBID if needed) :

Terminal.png frontend:
kavlan -e

(You can disable the DHCP server with kavlan -d)

Deploy nodes and change VLAN in one step

Terminal.png frontend:
kadeploy3 -f $OAR_NODEFILE -k -e debian11-big --vlan `kavlan -V`

Once the deployment is done, you will be able to connect on your nodes. They are now inside the VLAN, therefore they are not reachable with their default IP:

Terminal.png frontend:
ping suno-30 -c1
PING suno-30.sophia.grid5000.fr (172.16.130.30) 56(84) bytes of data.
From fsophia.sophia.grid5000.fr (172.16.143.106) icmp_seq=1 Destination Host Unreachable

--- suno-30.sophia.grid5000.fr ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms

You can get the list of new hostnames of your nodes in the VLAN with kavlan -l. For the next part of this tutorial, let's create a nodefile and copy it on the first node:

Terminal.png frontend:
kavlan -l | tee nodefile
suno-2-kavlan-4.sophia.grid5000.fr
suno-30-kavlan-4.sophia.grid5000.fr
suno-31-kavlan-4.sophia.grid5000.fr
Terminal.png frontend:
scp nodefile root@`head -1 < nodefile`:/tmp

The password for user root on Grid'5000 environments is "grid5000".

You can see that you can ping these new hostnames:

Terminal.png frontend:
ping suno-30-kavlan-4 -c1
PING suno-30-eth0-kavlan-4.sophia.grid5000.fr (10.32.3.30) 56(84) bytes of data.
64 bytes from suno-30-eth0-kavlan-4.sophia.grid5000.fr (10.32.3.30): icmp_seq=1 ttl=63 time=0.151 ms

--- suno-30-eth0-kavlan-4.sophia.grid5000.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.151/0.151/0.151/0.000 ms

Setup a DHCP server on your nodes

Configure DHCP

If you need to run your own DHCP server (for example if you want to run a cluster distribution inside kavlan or test kadeploy), you can use a script to generate its configuration file.

Go back to the frontend and create the following file (gen_dhcpd_conf.rb), which will generate your DHCP configuration:

#!/usr/bin/ruby

# Author: Nicolas Niclausse
# Copyright 2010-2011: INRIA

# script specific to grid5000:
# generate dhcpd config files for kavlan

require 'rubygems'
require 'restfully' # gem install restfully --source http://gemcutter.org
require 'ip' # gem install ruby-ip
require 'optparse'
require 'ostruct'
require 'socket' # IPSocket, used for the DNS lookups below
require 'yaml'   # YAML.load_file

headers = "ddns-update-style none;
option space pxelinux;
option pxelinux.magic      code 208 = string;
option pxelinux.configfile code 209 = text;
option pxelinux.pathprefix code 210 = text;
option pxelinux.reboottime code 211 = unsigned integer 32;
option vendorinfo          code 43  = string;
"

conf = File.expand_path('~/.restfully/api.grid5000.fr.yaml')
options = if File.exist?(conf) then YAML.load_file(conf) else {} end
options[:base_uri] = 'https://api.grid5000.fr/stable/grid5000'

def parseopts(args)
  options = OpenStruct.new
  options.debug = false
  options.verbose = false
  options.quiet = false
  options.nodes = []
  opts = OptionParser.new do |opts|
    opts.banner = "Usage: gen_dhcpd_conf.rb [options]"
    opts.separator ""
    opts.separator "Specific options:"
    opts.on("-s","--site SITE",  "generate only DHCP conf for site SITE") do |site|
      options.site = site
    end
    opts.on("-i","--vlan-id N", Integer , "generate only DHCP conf for vlan N") do |vlan|
      options.vlan = vlan
    end
    opts.on("-q", "--[no-]quiet", "Run quietly") do |q|
      options.quiet = q
    end
    opts.on("-v", "--[no-]verbose", "Run verbosely") do |v|
      options.verbose = v
    end
    opts.on_tail("-h", "--help", "Show this message") do
      puts opts
      exit
    end
  end
  opts.parse!(args)
  options
end

$opts = parseopts(ARGV)

Restfully::Session.new(options) do |root, session|
  options = {:query => {:version => root['version']}}
  root.sites(options).each do |site|
    mysite=site['uid']
    next if not $opts.site.nil? and mysite != $opts.site
    # optionally, read mac addresses from an external yaml file
    ref = if File.exist?(mysite+".yaml") then
            YAML.load_file(mysite+".yaml")
          else
            puts mysite +": no yaml file for macs" unless $opts.quiet
            {}
          end
    if $opts.vlan.nil? then
      vlans = (1..9).to_a
      # try to guess global vlan assigned to current site
      (10..21).each do |gvlan|
        begin
          IPSocket::getaddress("gw-kavlan-"+gvlan.to_s+"."+mysite+".grid5000.fr")
          puts "global vlan %d found for site %s" % [gvlan, mysite] unless $opts.quiet
          vlans.push(gvlan)
        rescue
          next
        end
      end
    else
      vlans = [$opts.vlan]
    end
    vlans.each do |vlan|
      filename = "dhcpd-kavlan-"+vlan.to_s+"-"+mysite+".conf"
      open(filename, 'w') do |f|
        puts "generating "+filename unless $opts.quiet
        f.puts headers
        begin
          gateway = IPSocket::getaddress("gw-kavlan-"+vlan.to_s+"."+mysite+".grid5000.fr")
        rescue
          puts "WARN: Get address error: probably no kavlan DNS setup for site " + mysite + " , skip" if $opts.verbose;
          next
        end
        # /20 for local vlans (1..3) and /18 for routed vlan (4..9)
        if vlan < 4
          ip = IP.new(gateway+"/20")
          ns = gateway
          ntp = gateway
          tftp = gateway
        else
          ip = IP.new(gateway+"/18")
          ntp = IPSocket::getaddress("ntp."+mysite+".grid5000.fr")
          ns = IPSocket::getaddress("dns."+mysite+".grid5000.fr")
          tftp = IPSocket::getaddress("kadeploy-server."+mysite+".grid5000.fr")
        end
        netmask = ip.netmask.to_addr
        broadcast = ip.broadcast.to_addr
        network = ip.network.to_addr
        f.puts "subnet %s netmask %s {" %  [network , netmask]
        f.puts "    default-lease-time 86400;
    max-lease-time 604800;"
        #f.puts "    option domain-name \"%s.grid5000.fr\"; " % mysite
        f.puts "    option domain-name-servers %s;" % ns
        f.puts "    option ntp-servers %s; " % ntp
        f.puts "    option routers %s;" % gateway
        f.puts "    option subnet-mask %s; " % netmask
        f.puts "    option broadcast-address %s;" % broadcast
        f.puts "    filename  \"pxelinux.0\";"
        f.puts "    next-server %s;" % tftp

        sites_for_vlan = if vlan < 10
                           [ site ]
                         else
                           root.sites(options)
                         end
        sites_for_vlan.each do |currentsite|
          currentsite.clusters(options).each do |cluster|
            cluster.nodes(options).each do |node|
              sitename=currentsite['uid']
              device = node['network_adapters'].find{|s| s['network_address'] =~ /^\w+-\d+\.\w+\.grid5000\.fr/}
              next if device.nil?
              hostname = device['network_address']
              next if hostname.nil?
              hostname_vlan = hostname.gsub(/^(\w+-\d+)(\..*)$/){$1+"-kavlan-"+vlan.to_s+$2}
              shortname_vlan = hostname_vlan.gsub(/^(\w+-\d+-\w+-\d+)(\..*)$/){$1}
              shortname = hostname.gsub(/^(\w+-\d+)(\..*)$/){$1}
              realsite = hostname.split(".")[1]
              begin
                vlan_ip = IPSocket::getaddress(hostname_vlan)
              rescue
                puts "WARN: Get address error: probably no DNS setup for vlan " +vlan.to_s+" on  site " + sitename + " , skip" if $opts.verbose;
                next
              end
              if device['mac'].nil? then
                if ref[shortname].nil? then
                  puts "WARN: mac undefined for host %s, skip" % hostname unless $opts.quiet
                  next
                else
                  mac = ref[shortname]['mac_eth0']
                end
              else
                mac = device['mac']
              end
              f.puts "   host %s {" % hostname_vlan
              f.puts "     hardware ethernet %s; " % mac
              f.puts "     option host-name \"%s\";" %  shortname_vlan
              f.puts "     option domain-name \"%s\.grid5000.fr\";" % realsite
              f.puts "     fixed-address %s;" % vlan_ip
              if vlan > 9
                # for global vlan, we need the local tftp server
                currenttftp = IPSocket::getaddress("kadeploy-server."+sitename+".grid5000.fr")
                f.puts "     next-server %s;" % currenttftp
              end
              f.puts "   }"
            end
          end
        end
        f.puts "}"
      end
    end
  end
end

(this script uses the restfully and ruby-ip gems)

Then, generate the configuration (replace SITE and VLANID with your current site and VLAN id), and copy it to the node:

Terminal.png frontend:
chmod +x ./gen_dhcpd_conf.rb
Terminal.png frontend:
gem install ruby-ip restfully --no-ri --no-rdoc --user-install
Terminal.png frontend:
./gen_dhcpd_conf.rb --site SITE --vlan-id VLANID
Terminal.png frontend:
scp dhcpd-kavlan-VLANID-SITE.conf root@`head -1 < nodefile`:

As a regular user, you cannot install the "ruby-ip" gem in the default system-wide gem directory, so you need to point GEM_HOME to a directory of your own before installing it:

Terminal.png frontend:
export GEM_HOME=~/.gem/ruby/2.3.0/

You have to disable the default DHCP server of the VLAN:

On the frontend

Terminal.png frontend:
kavlan -d

Now you have to install a DHCP server on the node (we assume the node is not yet in the job VLAN, or that the VLAN is routed and has Internet access for apt):

Terminal.png node:
apt-get install isc-dhcp-server

An error may be reported after the installation. This is expected: you still need to tell the DHCP server on which interface to listen for DHCP requests (replace "eno1" with the name of the interface on which the server should listen):

Terminal.png node:
sed -i s/INTERFACESv4=\"\"/INTERFACESv4=\"eno1\"/g /etc/default/isc-dhcp-server
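
The sed command above only matches when INTERFACESv4 is still empty. A slightly more tolerant variant is sketched below; set_dhcp_interface is a hypothetical helper, and it assumes GNU sed (for -i) and the Debian layout of /etc/default/isc-dhcp-server.

```shell
# set_dhcp_interface FILE IFACE (hypothetical helper)
# Rewrites the INTERFACESv4 line of FILE whatever its current value,
# and appends the line when it is missing. Requires GNU sed for -i.
set_dhcp_interface() {
    file=$1
    iface=$2
    if grep -q '^INTERFACESv4=' "$file"; then
        sed -i "s/^INTERFACESv4=.*/INTERFACESv4=\"$iface\"/" "$file"
    else
        printf 'INTERFACESv4="%s"\n' "$iface" >> "$file"
    fi
}
```

On the node, this would be called as set_dhcp_interface /etc/default/isc-dhcp-server eno1.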

You can now copy the generated configuration file and start the DHCP server :

Terminal.png node:
cp /root/dhcpd-kavlan-VLANID-SITE.conf /etc/dhcp/dhcpd.conf
Terminal.png node:
/etc/init.d/isc-dhcp-server restart

Then, in another shell, connect as root on a second node (or use kaconsole):

Terminal.png frontend:
ssh root@node-xx-kavlan-yy

And restart the network configuration:

Terminal.png node-dhcp-client:
systemctl restart networking
Terminal.png node-dhcp-client:
systemctl status networking
...
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPREQUEST of 10.32.3.7 on eno1 to 255.255.255.255 port 67
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPDISCOVER on eno1 to 255.255.255.255 port 67 interval 10
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPREQUEST of 10.32.3.7 on eno1 to 255.255.255.255 port 67
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPOFFER of 10.32.3.7 from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPACK of 10.32.3.7 from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPOFFER of 10.32.3.7 from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPACK of 10.32.3.7 from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: bound to 10.32.3.7 -- renewal in 34620 seconds.
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: Sending network state change signal to nslcd...done.
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr systemd[1]: Started Raise network interfaces.

On the DHCP server, check the logs:

Terminal.png node-dhcp-server:
tail /var/log/daemon.log
...
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPDISCOVER from 00:26:b9:3f:43:a1 via eno1
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPOFFER on 10.32.3.7 to 00:26:b9:3f:43:a1 via eno1
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPREQUEST for 10.32.3.7 (10.32.3.6) from 00:26:b9:3f:43:a1 via eno1
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPACK on 10.32.3.7 to 00:26:b9:3f:43:a1 via eno1

In the last four lines, you can see that your own DHCP server has given an address to the other node.
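
To list only the addresses that were actually handed out, the DHCPACK lines can be extracted from the log. This is a sketch under the assumption that the log uses the "DHCPACK on IP to MAC via IFACE" layout shown above; dhcp_leases is a hypothetical helper.

```shell
# dhcp_leases (hypothetical helper)
# Reads dhcpd log lines on stdin and prints "IP MAC" for each DHCPACK,
# assuming the "DHCPACK on IP to MAC via IFACE" layout shown above.
dhcp_leases() {
    awk '{
        for (i = 1; i <= NF - 4; i++)
            if ($i == "DHCPACK" && $(i + 1) == "on" && $(i + 3) == "to")
                print $(i + 2), $(i + 4)
    }'
}
```

On the DHCP server node, dhcp_leases < /var/log/daemon.log prints one line per acknowledged lease.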

DHCP and PXE

For your information, if you need to do a PXE boot, you must change the tftp server in the generated dhcpd configuration file:

Terminal.png node:
IP=`hostname -i`
Terminal.png node:
perl -i -pe "s/next-server .*/next-server $IP;/" /etc/dhcp/dhcpd.conf

If there is no next-server configured, you must edit the file by hand and add a line like this:

next-server XX.XX.XX.XX;

where XX.XX.XX.XX is the IP of your node (echo $IP).
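
The two cases (a next-server line exists, or it has to be added by hand) can be told apart before editing. The sketch below is one possible way to do it; set_next_server is a hypothetical helper and it assumes GNU sed.

```shell
# set_next_server FILE IP (hypothetical helper)
# Replaces the value of every "next-server" line in FILE with IP.
# Returns 1 and leaves FILE untouched when no such line exists, so that
# the manual edit described above is not silently skipped.
set_next_server() {
    grep -q 'next-server ' "$1" || return 1
    sed -i "s/next-server .*/next-server $2;/" "$1"
}
```

On the node, set_next_server /etc/dhcp/dhcpd.conf "$IP" || echo "no next-server line, edit the file by hand" covers both situations.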

Change the VLAN of your nodes manually

Put your nodes into the reserved VLAN

If you really want to change the VLAN manually, you can, but it's much simpler to change the vlan with kadeploy.

In order to change the VLAN of the nodes manually, you must reconfigure the network after the vlan has changed; but once the VLAN has changed, you can't connect to the node! An easy way to do this is to use the 'at' command (apt-get install at if it's not installed in your nodes)

We will use Taktuk to start remote commands on several nodes at once. In this example, we will use all the nodes. Since taktuk does not handle duplicate names in the nodefile, we must first remove duplicates.

First, we will use taktuk to install at on all nodes, then the taktuk command will simply launch the network reconfiguration in one minute. Finally, we set the VLAN of all our nodes.

As we will change the network configuration of the nodes, we will use an isolated kavlan (a.k.a. kavlan-local) so as not to interfere with the rest of the Grid'5000 network.

Terminal.png frontend:
oarsub -t deploy -l {"type='kavlan-local'"}/vlan=1,walltime=2 -I
Terminal.png frontend:
kavlan -V | tee myvlan
Terminal.png frontend:
oarsub -t deploy -l nodes=2 -I
Terminal.png frontend:
kadeploy3 -e debian11-x64-base -k -f $OAR_FILE_NODES
Terminal.png frontend:
taktuk -s -l root -f $OAR_FILE_NODES broadcast exec [ "apt-get update; apt-get --yes install at" ]
Terminal.png frontend:
taktuk -s -l root -f $OAR_FILE_NODES broadcast exec [ "echo '/etc/init.d/networking restart'| at now + 1 minute " ]
Terminal.png frontend:
kavlan -i `cat myvlan` -s -f $OAR_FILE_NODES

All nodes are now configured in the reserved VLAN (VLAN 2 in this example). Within one minute, your nodes will renegotiate their IP addresses and become available inside the VLAN. You can then connect to each of them using kaconsole or ssh (as we use a kavlan-local, you must connect to the gateway of that kavlan first):

Terminal.png frontend:
ssh kavlan-`cat myvlan`
Terminal.png kavlan-VLANID:
ssh root@suno-30-kavlan-`cat myvlan`
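
Because the network restart is scheduled with at, the nodes only become reachable again after about a minute. Rather than guessing, you can poll until a node answers. This is a minimal sketch: wait_until is a hypothetical helper, and the retry count and delay in the usage comment are arbitrary.

```shell
# wait_until TRIES DELAY COMMAND... (hypothetical helper)
# Runs COMMAND until it succeeds, at most TRIES times, sleeping DELAY
# seconds between attempts. Returns 0 on success, 1 otherwise.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Usage, from the kavlan gateway (the node name is the one used above):
#   wait_until 12 10 ping -c 1 suno-30-kavlan-2 && ssh root@suno-30-kavlan-2
```

With a local VLAN the polling has to run on the kavlan gateway, since the nodes are not reachable from the frontend.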

You can use the ip neigh command to see the known hosts in your LAN; you should only see IP addresses that belong to the subnet of your VLAN:

Terminal.png node:
ip neigh
192.168.223.254 dev eno1 lladdr 02:00:00:00:01:02 REACHABLE

You should be able to ping another of your hosts inside your VLAN:

Terminal.png node:
ping -c 3 suno-42-kavlan-2
64 bytes from 192.168.211.42: icmp_req=1 ttl=64 time=0.141 ms
64 bytes from 192.168.211.42: icmp_req=2 ttl=64 time=0.166 ms
64 bytes from 192.168.211.42: icmp_req=3 ttl=64 time=0.165 ms

--- suno-42-kavlan-2.sophia.grid5000.fr ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.141/0.157/0.166/0.015 ms

Put your nodes back into the default VLAN

First, put the list of your node names inside the VLAN in a file:

Terminal.png frontend:
uniq $OAR_NODEFILE > mynodes
Terminal.png frontend:
sed "s/\([^.]*\)\(.*\)/\1-kavlan-`cat myvlan`\2/" mynodes > mynodes-vlan

Remember to schedule the network restart first. This time, ssh must be run from the kavlan gateway, not from the frontend:

Terminal.png frontend:
ssh kavlan-VLANID
Terminal.png kavlan-VLANID:
for NODE in $(cat mynodes-vlan); do ssh root@$NODE "echo '/etc/init.d/networking restart' | at now + 1 minute "; done;

Then you can put your nodes back in the default VLAN:

Terminal.png frontend:
kavlan -s -i DEFAULT -f $OAR_NODEFILE

You should be able to ping your nodes:

Terminal.png frontend:
for i in `uniq $OAR_NODEFILE`; do ping -c 1 $i; done


Another way to put nodes back into the default VLAN is to change the VLAN and then reboot the nodes with kareboot3:

Terminal.png frontend:
kavlan -s -i DEFAULT -f $OAR_NODEFILE
Terminal.png frontend:
kareboot3 -f $OAR_NODEFILE -r simple

KaVLAN VPN

G5K kavlanvpn.png

This feature allows users to build a Virtual Private Network (VPN) between a KaVLAN network and the outside world. Hence, it is possible to interconnect Grid'5000 nodes with any external network (from the user's laptop to the Internet), bypassing Grid'5000 network isolation.


Warning.png Warning

This is an advanced feature. It requires a good understanding of KaVLAN, VPNs and Linux networking.


Some information:

  • This service currently uses an SSH VPN at Layer 2 (Ethernet level). Since that VPN is built on top of TCP, you should not expect high network performance.
  • The VPN requires two endpoints (or gateways) to be interconnected. On the Grid'5000 side, VPN gateways are installed on the kavlan-{1,2,3}.<site>.grid5000.fr servers.
  • On the user's side (outside of Grid'5000), a GNU/Linux system with root privileges is required to act as the user's gateway.
  • On Grid'5000, the user must reserve a non-routed local kavlan network (the VPN only works with this kind of kavlan). Grid'5000 nodes must be switched into that kavlan to be accessible through the VPN.


The VPN is initiated from the user's gateway machine using an SSH connection to the appropriate kavlan-X server (which depends on the kavlan network previously reserved). To enable the VPN, the SSH "-w" option must be used to connect to the remote tap0 device on the kavlan-X server, with the VPN tunnel configured in Ethernet mode. See the ssh and ssh_config manpages for more information about those options.


Example, with KaVLAN network "1" at Lyon:

  • As root, create a virtual tap device that will be connected to your kavlan through the SSH VPN (replace $USERNAME with your user name).
 laptop: sudo ip tuntap add dev tap0 mode tap user $USERNAME
  • Assign an IP address to this interface.
 laptop: sudo ifconfig tap0 192.168.207.253/20
Warning.png Warning

The IP address you choose must be inside the kavlan network, which depends on the kavlan number you are using. See Grid5000:Network#KaVLAN_networks

  • Start the SSH VPN
 laptop: ssh -o Tunnel=ethernet -w 0:0 -N kavlan-1.lyon.g5k

If the command runs correctly, it should not output anything.

  • Options description:
    • -o Tunnel=ethernet: Use an ethernet (layer 2) VPN
    • -w 0:0: Use interface tap0 on client side (first 0) and tap0 on server side (second 0, mandatory)
    • -N  : Do not execute a remote command
    • kavlan-1.lyon.g5k: Connect to the kavlan-1 gateway at Lyon. The trailing .g5k assumes that your ssh_config is set up to reach Grid'5000 hosts through the .g5k suffix
  • Client's tap0 interface is now connected to the kavlan network. You should be able to ping other nodes inside this network.
 laptop: ping 192.168.192.83
 PING 192.168.192.83 (192.168.192.83) 56(84) bytes of data.
 64 bytes from 192.168.192.83: icmp_req=1 ttl=64 time=82.7 ms
 64 bytes from 192.168.192.83: icmp_req=2 ttl=64 time=39.9 ms
 ...
Warning.png Warning

DNS hostname resolution cannot be used here, as the DNS servers are inside the Grid'5000 network and this command is executed from your local workstation.
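
To check that the address given to tap0 and a node address belong to the same kavlan network, the prefix arithmetic can be done directly in the shell. This is only a sketch: ip_to_int and in_cidr are hypothetical helper names, and the addresses are the ones used in the example above.

```shell
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() {
    iti_old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into $1..$4
    IFS=$iti_old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP ADDR/PREFIX: succeed if IP lies in the network of ADDR/PREFIX.
in_cidr() {
    ic_addr=${2%/*}
    ic_prefix=${2#*/}
    # Build the netmask: PREFIX leading one bits, kept within 32 bits.
    ic_mask=$(( (4294967295 << (32 - ic_prefix)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & ic_mask )) -eq $(( $(ip_to_int "$ic_addr") & ic_mask )) ]
}

# 192.168.207.253/20 covers 192.168.192.0 to 192.168.207.255,
# so the node pinged above is in the same network as tap0.
in_cidr 192.168.192.83 192.168.207.253/20 && echo "same kavlan network"
```

The sketch relies on 64-bit shell arithmetic, which bash provides; it is meant as a quick sanity check before starting the tunnel, not as a replacement for the network table on the Grid5000:Network page.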

Other usage

Using the API

KaVLAN is also available through the API. Using the job and deploy APIs, you can, as with the command-line tools, reserve nodes with a VLAN and deploy nodes into a VLAN.

See Vlans API tutorial and Vlans API specification

Use a global VLAN

With a global VLAN, you can put nodes from several sites in the same VLAN.

First, reserve a global VLAN on one site (here sophia) and 2 nodes on each of lille, sophia and lyon:

Terminal.png frontend:
oargridsub -t deploy -w 2:00:00 sophia:rdef="{\\\\\\\"type='kavlan-global'\\\\\\\"}/vlan=1+/nodes=2",lille:rdef=/nodes=2,lyon:rdef=/nodes=2 > oargrid.out


Get the oargrid Id and Job key from the output of oargridsub:

Terminal.png frontend:
export OAR_JOB_KEY_FILE=`grep "SSH KEY" oargrid.out | cut -f2 -d: | tr -d " "`
Terminal.png frontend:
export OARGRID_JOB_ID=`grep "Grid reservation id" oargrid.out | cut -f2 -d= | cut -d ' ' -f2`

Get the node list using oargridstat:

Terminal.png frontend:
oargridstat -w -l $OARGRID_JOB_ID | grep grid> ~/gridnodes

Then use kadeploy3 to deploy your image on all sites and change the VLAN (18 is the VLAN id in this example; use the id of the VLAN you reserved):

Terminal.png frontend:
kadeploy3 -f gridnodes -a http://public.sophia.grid5000.fr/~nniclausse/openmx.dsc -k --multi-server -o ~/nodes.deployed --vlan 18

If you want to directly manipulate the VLAN of a node, you have to run the kavlan command on the site where the node is located. For example, if you have reserved a global VLAN located at sophia and want to put some lille nodes into it, run kavlan -m nodename -i VLAN_GLOBAL_ID -s on the lille site (or use the API with the lille site in the URL).
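
Since kavlan has to be run on the site that hosts each node, a multi-site node list can first be split per site. The sketch below assumes fully qualified names of the form <node>.<site>.grid5000.fr; split_by_site and the nodes.<site> file names are hypothetical.

```shell
# split_by_site NODEFILE
# Write one file per site in the current directory (nodes.<site>), taking
# the site from the second dot-separated label of each node name.
# Lines without a dot (short host names) are skipped.
split_by_site() {
    sort -u "$1" | awk -F. 'NF > 1 { print > ("nodes." $2) }'
}

# Then, on the lille site (VLAN_GLOBAL_ID is a placeholder):
#   kavlan -i VLAN_GLOBAL_ID -s -f nodes.lille
```

Each nodes.<site> file can then be passed to kavlan with -f on the matching site, in the same way the node files are used elsewhere in this tutorial.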

How to use a local VLAN

In this section, we describe the specifics of local VLANs.

To use a local VLAN, you first have to connect to the gateway of the VLAN. Once you have a running reservation on a local VLAN, you have ssh access to the gateway:

Terminal.png frontend:
ssh kavlan-<vlanid>

Then you can reach your nodes inside the VLAN. Another option is to use the kaconsole command.

(You can still use kadeploy to put your nodes in the VLAN in one step.)


Configure ssh to easily connect to nodes in a local VLAN

You can configure ssh to make the connection through the gateway transparent. To reach isolated nodes (local VLAN) directly with ssh, add the following to your .ssh/config file on the frontend:

Host *-*-kavlan-1 *-*-kavlan-1.*.grid5000.fr
    ProxyCommand ssh -a -x kavlan-1 nc -q 0 %h %p
Host *-*-kavlan-2 *-*-kavlan-2.*.grid5000.fr
    ProxyCommand ssh -a -x kavlan-2 nc -q 0 %h %p
Host *-*-kavlan-3 *-*-kavlan-3.*.grid5000.fr
    ProxyCommand ssh -a -x kavlan-3 nc -q 0 %h %p

Then you can simply use ssh <cluster>-<nodeid>-kavlan-<vlanid> to access the node, for example:

Terminal.png frontend:
ssh root@NODE-kavlan-VLANID
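
The three stanzas above differ only in the VLAN id, so they can be generated rather than typed. A minimal sketch, where kavlan_ssh_stanzas is a hypothetical helper name:

```shell
# kavlan_ssh_stanzas ID...
# Print one ssh_config stanza per local VLAN id, in the same form as the
# hand-written configuration (hypothetical helper, not a Grid'5000 tool).
kavlan_ssh_stanzas() {
    for kss_id in "$@"; do
        printf 'Host *-*-kavlan-%s *-*-kavlan-%s.*.grid5000.fr\n' "$kss_id" "$kss_id"
        # %%h and %%p print the literal %h and %p tokens expected by ssh.
        printf '    ProxyCommand ssh -a -x kavlan-%s nc -q 0 %%h %%p\n' "$kss_id"
    done
}

# Append the stanzas for the three local VLANs to the ssh configuration:
#   kavlan_ssh_stanzas 1 2 3 >> ~/.ssh/config
```

Review the printed stanzas before appending them, since running the commented command twice would duplicate the entries.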

A simple multi NICs example

We show here how to reserve and configure multiple Ethernet network interfaces.

First we reserve a deploy job, with 2 nodes and 2 vlans:

Terminal.png frontend:
oarsub -I -t deploy -l {"eth_count > 1 and cluster = 'cluster_name'"}/nodes=2,{"type='kavlan'"}/vlan=2,walltime=02:00:00

Then we deploy the wanted environment:

Terminal.png frontend:
kadeploy3 -f $OAR_NODEFILE -k -e debian11-x64-nfs


See the cluster section to find out which Ethernet interfaces can be used. For example, on paranoia (Rennes), eth1/enp3s0f1 and eth2/eno1 are cabled, so use I=1 and J=2 in the commands below.

Get the node names with their interfaces:

Terminal.png frontend:
uniq $OAR_FILE_NODES | sed -e 's/\([^\.]*\)\(.*\)/\1-ethI\2/' > nodes_second_int
Terminal.png frontend:
uniq $OAR_FILE_NODES | sed -e 's/\([^\.]*\)\(.*\)/\1-ethJ\2/' > nodes_third_int
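
The two sed commands differ only in the interface index, so the same substitution can be wrapped in a loop. This is a sketch under the assumption that interface host names follow the <node>-eth<index> pattern shown above; write_iface_nodefiles and the nodes_eth<index> file names are hypothetical and differ from the nodes_second_int and nodes_third_int files used in this tutorial.

```shell
# write_iface_nodefiles NODEFILE INDEX...
# For each interface index, write nodes_eth<INDEX> in the current directory
# with one "<node>-eth<INDEX><domain>" name per node of NODEFILE.
# Like the original commands, this relies on uniq, so duplicate node names
# must be adjacent in NODEFILE.
write_iface_nodefiles() {
    win_nodefile=$1
    shift
    for win_idx in "$@"; do
        uniq "$win_nodefile" |
            sed -e "s/\([^.]*\)\(.*\)/\1-eth${win_idx}\2/" > "nodes_eth${win_idx}"
    done
}

# For paranoia (I=1, J=2):
#   write_iface_nodefiles "$OAR_FILE_NODES" 1 2
```

With this variant, the files to pass to kavlan -f would be nodes_eth1 and nodes_eth2.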

Show the VLAN numbers:

Terminal.png frontend:
kavlan -V

Put the interfaces in the two different VLANs (replace vlan1 and vlan2 with the numbers shown by kavlan -V):

Terminal.png frontend:
kavlan -i vlan1 -s -f nodes_second_int
Terminal.png frontend:
kavlan -i vlan2 -s -f nodes_third_int

Get an IP address on the second and third interfaces:

Terminal.png frontend:
uniq $OAR_NODEFILE | taktuk -d -1 -l root -f - broadcast exec [ 'dhclient enp3s0f1' ]
Terminal.png frontend:
uniq $OAR_NODEFILE | taktuk -d -1 -l root -f - broadcast exec [ 'dhclient eno1' ]

At this point, each node should have 3 IP addresses:

Terminal.png node:
ip a
root@paranoia-8:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether f0:4d:a2:73:ce:3d brd ff:ff:ff:ff:ff:ff
    inet 10.24.70.8/18 brd 10.24.127.255 scope global eno1
       valid_lft forever preferred_lft forever
    inet6 fe80::f24d:a2ff:fe73:ce3d/64 scope link
       valid_lft forever preferred_lft forever
3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether f0:4d:a2:73:ce:3e brd ff:ff:ff:ff:ff:ff
4: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a0:36:9f:28:a9:18 brd ff:ff:ff:ff:ff:ff
    inet 172.16.100.8/20 brd 172.16.111.255 scope global enp3s0f0
       valid_lft forever preferred_lft forever
    inet6 fe80::a236:9fff:fe28:a918/64 scope link
       valid_lft forever preferred_lft forever
5: enp3s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a0:36:9f:28:a9:1a brd ff:ff:ff:ff:ff:ff
    inet 10.24.7.8/18 brd 10.24.63.255 scope global enp3s0f1
       valid_lft forever preferred_lft forever
    inet6 fe80::a236:9fff:fe28:a91a/64 scope link
       valid_lft forever preferred_lft forever
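
To check the result without reading the full listing, the IPv4 lines can be filtered out of the ip a output. A minimal sketch, with list_global_ipv4 as a hypothetical helper name, reading the text format shown above:

```shell
# list_global_ipv4: read `ip a` output on stdin and print
# "<interface> <address/prefix>" for each global-scope IPv4 address.
# Loopback and IPv6 link-local lines are ignored.
list_global_ipv4() {
    awk '$1 == "inet" && /scope global/ { print $NF, $2 }'
}

# On a node:
#   ip a | list_global_ipv4
```

Applied to the listing above, this prints three lines: eno1 10.24.70.8/18, enp3s0f0 172.16.100.8/20 and enp3s0f1 10.24.7.8/18.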