{{Portal|User}}
{{Portal|Tutorial}}
{{TutorialHeader}}
__TOC__
As of August 2022, Grid'5000 features 2 nodes, each equipped with one AMD/Xilinx FPGA. This document gives specific information on how those FPGAs can be used in Grid'5000.
[[File:AMD Xilinx Alveo U200.jpg|thumb|right|AMD Xilinx Alveo U200]]


== Hardware installation description ==
The Grenoble site of Grid'5000 hosts 2 servers ([[Grenoble:Hardware#servan|Servan cluster]]) equipped with an AMD/Xilinx FPGA plugged on PCIe. The FPGAs are [https://www.xilinx.com/products/boards-and-kits/alveo/u200.html AMD/Xilinx Alveo U200], referenced in the Xilinx catalog as Datacenter Accelerator cards.


Detailed specifications are provided [https://www.xilinx.com/products/boards-and-kits/alveo/u200.html#specifications here] (cards with passive thermal cooling).
Technically, the installation of those FPGAs in the [[Grenoble:Hardware#servan|Servan]] nodes has the following characteristics:
* JTAG
** JTAG programming is provided on the Xilinx Alveo U200 via a USB port on the card. In the Grid'5000 installation, this JTAG port is connected to a USB port of the hosting machine itself. Thus, JTAG programming of the FPGA hosted in servan-1 (resp. servan-2) can be done (e.g. with Vivado) from servan-1 (resp. servan-2) itself.
* Ethernet ports
** Both Ethernet ports of each FPGA are connected to the site network along with all servers of the site. [[File:Grenoble-DC-IMAG.png|thumb|right|Grenoble site network]]
** The FPGA Ethernet ports are not shown as NICs in the operating system of the hosting machine (unless the FPGA is programmed to do so).
** Ports are connected to 100Gbps ports on the Grenoble site router. The router ports are configured with Auto-Negotiation disabled and Speed forced to 100Gbps (in our tests, the links do not come up otherwise).
** Kavlan can reconfigure the FPGA Ethernet ports just like any NIC of a server of the site (including the servan servers' own NICs). FPGA ports are named ''servan-1-fpga0'', ''servan-1-fpga1'', ''servan-2-fpga0'', ''servan-2-fpga1'' in kavlan. IP addresses are provided via DHCP to the FPGA ports in kavlans where the DHCP service is available.
** Note: using the 100Gbps capability of the FPGA ports requires acquiring a free-of-charge Xilinx license.
* Wattmeter
** Each servan node's energy consumption is measured by a wattmeter. Measures are available in [[Monitoring Using Kwollect|Kwollect]] (a query example is given after this list).
** Energy consumption can also be retrieved using Xilinx tools (e.g. xbutil, xbtop) from the host operating system when the FPGA is running XRT (see below), or through the [https://www.xilinx.com/products/intellectual-property/cms-subsystem.html CMS IP] if using a Vivado flow.
* Licenses
** The FPGA software stack, IPs, etc. are subject to licenses (EULAs to be signed, etc.). See the [https://www.xilinx.com/products/design-tools/faq.html Xilinx FAQ]. Grid'5000 does not provide licenses. It is left to the end-user to obtain the required licenses (some are free of charge).
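Regarding the wattmeter, measures can for instance be fetched from the Kwollect API. The following request is only a minimal sketch: the metric name ''wattmetre_power_watt'' and the query parameters are given as an example and should be checked against the [[Monitoring Using Kwollect|Kwollect]] documentation:

{{Term|location=fgrenoble|cmd=<code class="command">curl</code> 'https://api.grid5000.fr/stable/sites/grenoble/metrics?nodes=servan-1&metrics=wattmetre_power_watt&start_time=2023-05-26T09:00'}}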
Finally, as for any machine available in Grid'5000, the user can choose the operating system of the hosts (e.g. CentOS or Ubuntu).


== Using the FPGA ==
=== Programming ===
The FPGA can be used in several ways:
* either using higher-level abstractions, e.g. using [https://www.xilinx.com/products/design-tools/vitis/vitis-platform.html Xilinx's Vitis].
* or using lower-level abstractions, e.g. using [https://www.xilinx.com/products/design-tools/vivado.html Xilinx's Vivado].


When used with the higher-level abstractions with Vitis, the FPGA card is managed by the [https://www.xilinx.com/products/design-tools/vitis/xrt.html XRT framework], and the card shows up as a datacenter accelerator card. However, it is sometimes necessary to program the card at a lower level, for instance to program it as a network card (NIC). In such a case, the card is fully reprogrammed, so that even its PCI id changes. Hence, depending on their needs, users have to decide at what level they want to program the FPGA.
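For example, when the card runs an XRT-managed shell, the ''xbutil'' tool can be used from the host operating system to check that the card is detected and to query its status (sub-command names vary between XRT versions; ''examine'' is the XRT 2.x form):

{{Term|location=servan-1|cmd=<code class="command">xbutil</code> examine}}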


Regarding the programming of the operations of the FPGA (i.e. deploying a program on the FPGA) with Vivado, several options are also available:
* Via PCI-e.
** PCI-e programming may not be available, as it requires that the design currently running on the FPGA already provides PCI-e support for programming.
* Via JTAG, by flashing the program (using a .mcs file) on the embedded non-volatile memory of the PCIe board, which lives beside the FPGA.
** Flashing the non-volatile memory requires a subsequent cold reboot of the hosting server to make the FPGA utilize the flashed program. It makes the programming persistent, which means flashing a factory golden image will be required to revert the FPGA back to its original operating mode.
* Via JTAG, by loading the program (a bitstream, .bit file) directly into the FPGA's volatile memory.
** By programming the volatile memory, the FPGA runs the program straight away. A warm reboot may be required for the program to be fully functional (e.g. if it modifies the PCI-e interface). A cold reboot reverts the FPGA to the program installed in the non-volatile memory of the board.


'''As a result, it is strongly recommended to prefer programming the FPGA via JTAG in the VOLATILE memory, so that the new programming is NOT persistent and the FPGA returns to its default operating mode after a cold reboot, typically at the end of the reservation/job.'''
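As an illustration, below is a minimal sketch of such a non-persistent JTAG programming using Vivado in batch mode. It assumes Vivado is in the PATH and that ''my_design.bit'' is your bitstream (both names are examples to adapt):

<pre>
# Write a small Vivado Hardware Manager script that loads a bitstream into
# the FPGA's volatile memory over JTAG (non-persistent: a cold reboot
# restores the program stored in the non-volatile memory).
cat > program_fpga.tcl <<'EOF'
open_hw_manager
connect_hw_server
open_hw_target
current_hw_device [lindex [get_hw_devices] 0]
set_property PROGRAM.FILE {my_design.bit} [current_hw_device]
program_hw_devices [current_hw_device]
close_hw_manager
EOF
vivado -mode batch -source program_fpga.tcl
</pre>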


=== FPGA software stack ===
==== Using supported system environments ====
The servan nodes, just like all Grid'5000 nodes, run Debian stable by default. The AMD/Xilinx FPGA software stack is not available in that operating system environment.


AMD/Xilinx supports a limited list of operating systems to operate the FPGA, see [https://support.xilinx.com/s/article/XRT-Xilinx-Runtime-version-2022-1-2-13-Tested-Operating-Systems here].


The AMD/Xilinx FPGA software stack can be installed on top of the technical team's supported Ubuntu (e.g. ''ubuntu2004-min'') or CentOS (e.g. ''centos8-min'') environments after deploying them on the node with [[Advanced_Kadeploy|Kadeploy]].
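For instance, a minimal sketch of this approach (the XRT package must first be downloaded from the Xilinx website; its exact file name, shown here for XRT 2022.1 on Ubuntu 20.04, is an example to adapt):

<pre>
# On the frontend: deploy a supported OS on the reserved node
kadeploy3 ubuntu2004-min
# Then, on the node: install the XRT package downloaded from the Xilinx website
sudo apt-get update
sudo apt-get install -y ./xrt_202210.2.13.466_20.04-amd64-xrt.deb
</pre>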


=== Developing code for the FPGA ===
See https://www.xilinx.com/products/boards-and-kits/alveo/u200.html#gettingStarted.
Getting help on how to code a program for the FPGA is out of the scope of this document.
 
{{Note|text=Please note that the FPGA software stack may be deployed on any node, even one not equipped with an FPGA board. This may be useful to build FPGA designs or test them with emulation on any node (or many nodes), while leaving the servan nodes available for actual tests on the FPGA hardware. An emulation example is sketched below.}}
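For instance, with Vitis installed, a kernel can be built and exercised in software emulation on any node, without any FPGA board. This is only a sketch: the kernel name ''vadd'' and the U200 platform name below are examples to adapt to your installation:

<pre>
# Compile and link a kernel for software emulation (no FPGA board needed)
v++ -t sw_emu --platform xilinx_u200_gen3x16_xdma_2_202110_1 -c -k vadd -o vadd.xo vadd.cpp
v++ -t sw_emu --platform xilinx_u200_gen3x16_xdma_2_202110_1 -l -o vadd.xclbin vadd.xo
# Generate the emulation configuration and make the host application use it
emconfigutil --platform xilinx_u200_gen3x16_xdma_2_202110_1
export XCL_EMULATION_MODE=sw_emu
</pre>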
 
==== User-contributed Xilinx software environment ====
A user-contributed Kadeploy environment named <code>ubuntu2004-fpga</code>, recorded in the Kadeploy registry in Pierre Neyron's userspace (<code>pneyron</code>), includes the Xilinx tools. Anyone may deploy it using the following command line in an [[Getting_Started#Deploying_your_nodes_to_get_root_access_and_create_your_own_experimental_environment|OAR job of type deploy]]:
 
{{Term|location=fgrenoble|cmd=<code class="command">kadeploy3</code> -u pneyron ubuntu2004-fpga}}
 
Because of the large disk space they require (~50GB), Xilinx's Vitis and Vivado tools are not included in the image: they are (pre-)installed separately on a shared NFS storage (mounted in /tools/ in the deployed <code>ubuntu2004-fpga</code> system). Those tools are subject to an end-user license agreement (EULA). Access to that shared NFS storage can be granted on request by contacting [mailto:pierre.neyron@imag.fr Pierre Neyron].
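Once access is granted, the tools can typically be made available in a shell by sourcing their settings script. The exact path and version below are given as an example; check the actual layout under /tools/ on the node:

{{Term|location=servan-1|cmd=<code class="command">source</code> /tools/Xilinx/Vivado/2022.1/settings64.sh}}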
 
{{Note|text=Please note that this ''ubuntu2004-fpga'' system environment is built using a [[Environment_creation#Creating_an_environment_from_a_recipe_using_kameleon|kameleon recipe]] available in https://gitlab.inria.fr/neyron/ubuntu2004-fpga.
You may modify the recipe and rebuild your own image, adapted to your needs. An image on top of ''centos8-min'' could also be built.}}
 
{{Note|text=Please note that the 2 servan nodes are shut down by default ([https://intranet.grid5000.fr/oar/Grenoble/drawgantt-svg/?filter=servan%20only&resource_base=host&scale=20 standby in the OAR terminology]), which means that after reserving with <code class=command>oarsub</code>, one has to wait for a node to boot before being able to use it.}}
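For example, to reserve one servan node in a job of type deploy and wait interactively for it to boot:

{{Term|location=fgrenoble|cmd=<code class="command">oarsub</code> -I -t deploy -p "cluster='servan'"}}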
 
== Some links ==
* Xilinx Developer program
** https://www.xilinx.com/developer.html
** https://www.xilinx.com/developer/articles.html
* Some usages with Vitis
** https://xilinx.github.io/XRT/master/html/platforms.html
** https://github.com/Xilinx/Vitis-Tutorials
** https://xilinx.github.io/Vitis_Libraries/hpc/2021.2/index.html
** https://www.xilinx.com/developer/articles/migrating-from-cuda-to-vitis.html
** https://xilinx.github.io/XRT/2022.1/html/p2p.html
* Some projects making use of the Ethernet ports of the FPGAs
** https://github.com/Xilinx/open-nic
** https://docs.corundum.io/en/latest/contents.html
* Tutorials
** https://github.com/litex-hub/fpga_101
