Energy consumption monitoring tutorial: Difference between revisions

From Grid5000
{{TutorialHeader}}


= Introduction =


{{Note|text=This wiki page is available as an interactive [https://www.grid5000.fr/w/Notebooks notebook] that allows you to directly run examples on a reserved node. To do so, clone [https://gitlab.inria.fr/grid5000/notebook-tutorial this repository] in your homedir and open <code>Kwollect_tutorial.ipynb</code> in your Jupyter lab instance. }}
Estimated duration: 90 minutes


This tutorial will show how to monitor energy on Grid'5000.
<span id="introduction"></span>
== Introduction ==


On the Lyon, Grenoble and Nancy sites, special devices (called "wattmeters") allow fine-grained measurements (50 measurements per second, with sub-watt resolution). In addition, the electrical power consumption of nodes may sometimes be retrieved from their Power Distribution Units (PDUs), the devices that supply them with electrical power. Many clusters also have monitoring capabilities provided by their "BMC" (a server-specific administration board), although these are less precise.
In this tutorial, you will learn how to monitor electrical energy consumption while experimenting on server-class machines under Grid’5000.


Grid'5000 uses the [[Monitoring Using Kwollect|Kwollect tool]] to provide a convenient and consistent way to monitor energy consumption, and other monitoring metrics, in experiments.
The tutorial will be organized into the following sections:


In the tutorial, you will learn how to retrieve the energy consumed by Grid'5000 nodes. The power consumption will be studied under various workload scenarios and combinations of CPU energy saving parameters (P-State, C-State, etc.).
* Monitoring devices available
* Find monitoring features available on a node
* Getting metrics values from Kwollect
* Monitoring of internal metrics (e.g. RAPL)
* Advanced case: enable high-frequency monitoring on Wattmetres and other on-demand metrics
* Advanced case: find energy consumption for individual power supply
* Practical study


This tutorial requires a basic knowledge of Grid'5000 usage (i.e. having completed [[Getting Started]] tutorial).
The first four sections explain the basics of energy monitoring under Grid’5000 and should not be skipped. The two “Advanced” sections are optional. The “Practical study” is an exercise to put into practice what you have learned.


= Retrieving energy consumption data =
<span id="monitoring-devices-available"></span>
== Monitoring devices available ==


== Using Kwollect ==
Grid’5000 provides access to various monitoring devices that measure the electrical power consumed by nodes, such as:


[https://gitlab.inria.fr/grid5000/kwollect Kwollect] is a monitoring tool focused on environmental metrics such as electrical consumption. On Grid'5000, it permanently collects metrics on every node, network equipment, PDU and wattmeter, and stores them in long-term storage. Collected metrics are exposed to users through the Grid'5000 API and a visualization dashboard based on Grafana.
<ul>
<li><p>“Wattmetres”, which are specialized devices located between a node power supply and its power source, able to perform up to 50 measurements per second with a high relative precision.</p>
<p>The current generation of Wattmetres installed on the infrastructure is made by [https://www.adecwatts.fr/Wattmetre-et-analyseur-de-reseau-bt/ ADECWatts company].</p>
<center>
<div class="figure">
[[File:Wattmv3-lyon.jpeg|400px|Wattmetres are located on the left side of the rack]]
</div>
<p>(Wattmetres are located on the left side of the rack)</p>
</center></li>
<li><p>PDUs (Power Distribution Units), which are the most common way to deliver electrical power to the server-class nodes used in Grid’5000, may also export energy monitoring metrics. However, they are less precise than Wattmetres.</p>
<center>
<div class="figure">
[[File:APC_10-outlet_rackmount_19-inch_PDU.jpg|400px|PDU on the left used to power servers]]
</div>
<p>(PDU on the left is used to power servers)</p>
</center></li>
<li><p>BMCs (Baseboard Management Controllers) are control units placed inside a server chassis, while remaining independent of the rest of the system. They also export energy monitoring metrics, but are even less precise than PDUs, and thus than Wattmetres (in particular, they are located downstream from the power supply and therefore cannot take its energy losses into account).</p></li></ul>
 
In addition, individual components inside a node may provide energy monitoring. This is typically the case for CPU and GPU, which provide energy consumed by various internal parts by exposing internal hardware counters accessible through a dedicated interface, such as RAPL for Intel &amp; AMD CPUs and NVML for NVIDIA GPUs. See these references for more information: [https://hubblo-org.github.io/scaphandre-documentation/explanations/rapl-domains.html 1] [https://github.com/bpetit/awesome-energy 2] [https://developer.nvidia.com/management-library-nvml 3].
 
⚠️ Warning ⚠️: Monitoring metrics are not always reliable: the monitoring device may fail and report wrong values; PDU and BMC often report inaccurate values (e.g., updated at a low frequency, heavily smoothed, using a moving average, etc.). It is strongly recommended to cross-check your measurements by using different monitoring devices (e.g., both Wattmetres and BMC) to ensure confidence in results.
 
<span id="find-monitoring-features-available-on-a-node."></span>
== Find monitoring features available on a node. ==
 
Monitoring, like everything else on Grid’5000, is documented in the Grid’5000 Reference API. Let’s see how to query this API to discover what monitoring devices are available for a particular node.
 
Monitoring capabilities are described in terms of ''metrics'' available on a cluster. To get the list of all metrics available for a cluster, the API can be queried at this address:
 
<pre>https://api.grid5000.fr/stable/sites/&lt;SITE&gt;/clusters/&lt;CLUSTER&gt;</pre>
For instance, if you are interested in metrics available on taurus-12 node at Lyon, you can query the following URL:
 
https://api.grid5000.fr/stable/sites/lyon/clusters/taurus
 
Metrics are described under the <code>metrics</code> entry of the JSON document returned by the command. To get a better view of the metrics list, you can use a command such as:
 
<pre>curl https://api.grid5000.fr/stable/sites/lyon/clusters/taurus | jq '.metrics' | less</pre>
As you can see, many metrics are available, related to energy monitoring or not. More information about general monitoring in Grid’5000, including the full list of available metrics, is available in [https://www.grid5000.fr/w/Monitoring_Using_Kwollect Monitoring Using Kwollect] documentation.
 
We will focus on two metrics for this tutorial: <code>wattmetre_power_watt</code> and <code>bmc_node_power_watt</code>. The Reference API description of these metrics looks like this:
 
<syntaxhighlight lang="json">{
"description": "Power consumption of node reported by Wattmetre, in watt",
"name": "wattmetre_power_watt",
"optional_period": 20,
"period": 1000,
"source": {
  "protocol": "wattmetre"
}
},
{
"description": "Power consumption of node reported by BMC, in watt",
"name": "bmc_node_power_watt",
"period": 5000,
"source": {
  "id": "1.3.6.1.4.1.674.10892.5.4.600.30.1.6.1.{{ 1.3.6.1.4.1.674.10892.5.4.600.30.1.8.1 == System Board Pwr Consumption }}",
  "protocol": "snmp"
}
}</syntaxhighlight>
* The <code>description</code> field explains the nature of the monitoring devices, as well as the physical unit of the measure.
* The <code>name</code> field is the metric’s identifier used throughout the monitoring system.
* The <code>period</code> field describes the interval, in milliseconds, between two consecutive measurements performed on the monitoring device. (Note that this does not necessarily correspond to the frequency at which the device itself updates its internal value. The device's own update interval can be longer, especially on BMC, as noted in the warning above.)
* The <code>optional_period</code> field, only available on the <code>wattmetre_power_watt</code> metric, indicates that this device can be configured to perform even more frequent measurements every 20 ms (i.e., at 50 Hz), on user’s demand (more on that later).
* The <code>source</code> field indicates the protocol used to query the monitoring device and should not be of much interest to you.
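As a rough illustration of how the <code>period</code> and <code>optional_period</code> fields translate into sampling rates, here is a minimal Python sketch. The helper name <code>sampling_rates_hz</code> is hypothetical (it is not part of any Grid’5000 tool), and the input reuses the two metric descriptions shown above with the <code>source</code> field left out.

```python
def sampling_rates_hz(metric):
    """Return (default_hz, on_demand_hz) for one metric description.

    A rate is None when the corresponding field is absent or equal to 0.
    """
    def to_hz(period_ms):
        # A period in milliseconds corresponds to 1000 / period measurements per second.
        return 1000.0 / period_ms if period_ms else None

    return to_hz(metric.get("period")), to_hz(metric.get("optional_period"))


# Metric descriptions copied from the Reference API excerpt above
# (the "source" field is omitted for brevity).
metrics = [
    {"name": "wattmetre_power_watt", "period": 1000, "optional_period": 20},
    {"name": "bmc_node_power_watt", "period": 5000},
]

for m in metrics:
    default_hz, on_demand_hz = sampling_rates_hz(m)
    print(m["name"], "default:", default_hz, "Hz, on demand:", on_demand_hz, "Hz")
```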
 
<span id="getting-metrics-values-from-kwollect"></span>
== Getting metrics values from Kwollect ==
 
Once you have identified the nodes and the metrics you are interested in, you can simply query Kwollect, the monitoring system used in Grid’5000, to retrieve metric values over time. For instance, to get the <code>wattmetre_power_watt</code> and <code>bmc_node_power_watt</code> metric values for <code>taurus-4</code> and <code>taurus-5</code> between 10:00 and 10:10 on May 1st, 2025, you can query the API at:
 
https://api.grid5000.fr/stable/sites/lyon/metrics?nodes=taurus-4,taurus-5&metrics=wattmetre_power_watt,bmc_node_power_watt&start_time=2025-05-01T10:00&end_time=2025-05-01T10:10
 
This will return a JSON document like:
 
<syntaxhighlight lang="json">[{"timestamp":"2025-05-01T10:00:00+02:00","device_id":"taurus-5","metric_id":"wattmetre_power_watt","value":6.052631578947369,"labels":{"_device_orig": ["wattmetre1-port4"]}},
{"timestamp":"2025-05-01T10:00:00+02:00","device_id":"taurus-4","metric_id":"wattmetre_power_watt","value":6.332432432432432,"labels":{"_device_orig": ["wattmetre1-port3"]}},
{"timestamp":"2025-05-01T10:00:00.654002+02:00","device_id":"taurus-5","metric_id":"bmc_node_power_watt","value":0,"labels":{}},
{"timestamp":"2025-05-01T10:00:00.654322+02:00","device_id":"taurus-4","metric_id":"bmc_node_power_watt","value":0,"labels":{}},
{"timestamp":"2025-05-01T10:00:01+02:00","device_id":"taurus-4","metric_id":"wattmetre_power_watt","value":6.070731707317074,"labels":{"_device_orig": ["wattmetre1-port3"]}},
{"timestamp":"2025-05-01T10:00:01+02:00","device_id":"taurus-5","metric_id":"wattmetre_power_watt","value":6.239024390243902,"labels":{"_device_orig": ["wattmetre1-port4"]}},
...</syntaxhighlight>
where each line corresponds to a single measurement.
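The query and the response format above can be handled from Python along the following lines. This is only a sketch: <code>build_metrics_url</code> and <code>mean_by_device_and_metric</code> are hypothetical helper names, the sample records are copied from the excerpt above, and no request is actually sent.

```python
from collections import defaultdict
from urllib.parse import urlencode

API_ROOT = "https://api.grid5000.fr/stable"


def build_metrics_url(site, nodes, metrics, start_time=None, end_time=None):
    """Build a Kwollect query URL using the parameters described above."""
    query = {"nodes": ",".join(nodes), "metrics": ",".join(metrics)}
    if start_time:
        query["start_time"] = start_time
    if end_time:
        query["end_time"] = end_time
    # Keep "," and ":" readable instead of percent-encoding them.
    return f"{API_ROOT}/sites/{site}/metrics?" + urlencode(query, safe=",:")


def mean_by_device_and_metric(records):
    """Average the "value" field for each (device_id, metric_id) pair."""
    sums = defaultdict(lambda: [0.0, 0])
    for record in records:
        acc = sums[(record["device_id"], record["metric_id"])]
        acc[0] += record["value"]
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


url = build_metrics_url(
    "lyon",
    ["taurus-4", "taurus-5"],
    ["wattmetre_power_watt", "bmc_node_power_watt"],
    start_time="2025-05-01T10:00",
    end_time="2025-05-01T10:10",
)

# A few records copied from the JSON excerpt above (labels omitted).
records = [
    {"device_id": "taurus-5", "metric_id": "wattmetre_power_watt", "value": 6.052631578947369},
    {"device_id": "taurus-4", "metric_id": "wattmetre_power_watt", "value": 6.332432432432432},
    {"device_id": "taurus-4", "metric_id": "wattmetre_power_watt", "value": 6.070731707317074},
    {"device_id": "taurus-5", "metric_id": "wattmetre_power_watt", "value": 6.239024390243902},
]
means = mean_by_device_and_metric(records)
print(url)
print(means)
```

In a real script, the JSON document returned for <code>url</code> would be decoded (for instance with <code>json.loads</code>) and passed to <code>mean_by_device_and_metric</code> in place of the sample records.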
 
It is also possible to get all metrics associated with a Grid’5000 reservation by providing the OAR job number:
 
https://api.grid5000.fr/stable/sites/lyon/metrics?job_id=1899135
 
This will return all metrics from all nodes belonging to the reservation, but you can filter by using the <code>nodes</code> and <code>metrics</code> parameters.
 
A graphical dashboard is also available to visualize metrics. It is available at:
 
https://api.grid5000.fr/stable/sites/lyon/metrics/dashboard
 
You can replace <code>lyon</code> with the site you need.
 
It can be noted that metrics stored in Kwollect are kept indefinitely.
 
<span id="monitoring-of-internal-metrics"></span>
== Monitoring of internal metrics ==
 
We call “internal metrics” the metrics available from inside the node operating system, i.e., that you can fetch yourself as a user, unlike metrics fetched from external devices, such as Wattmetres, provided by the infrastructure. This kind of metrics includes RAPL for CPU energy consumption and NVML for GPU consumption, but also any kind of metrics available from the system, such as CPU or IO usage.
 
As many tools are available to get internal metrics, we assume that you will want to use the one that best fits your needs. We will explain a generic way to push metrics to Kwollect, so that it can be adapted to whatever tool you use. We will also introduce [https://alumet.dev/ Alumet], a convenient tool to fetch internal metrics which has a “native” Kwollect export feature.
 
In any case, you will be able to access all your metrics, both internal and external from devices such as Wattmetres and BMC, through the same API using Kwollect.
 
<span id="pushing-metrics-to-kwollect"></span>
=== Pushing metrics to Kwollect ===
 
It is possible to push metrics to Kwollect, from inside a node, by performing a POST request to the following API endpoint:
 
<pre>https://api.grid5000.fr/stable/sites/SITE/metrics</pre>
The request must include the list of metrics to be inserted, formatted as a JSON like:
 
<syntaxhighlight lang="json">[{"metric_id": "METRIC_NAME1", "value": VALUE1}, {"metric_id": "METRIC_NAME2", "value": VALUE2}, …]</syntaxhighlight>
For each metric, a <code>timestamp</code> value can optionally be provided (otherwise, the current time will be used as the metric’s timestamp). The <code>device_id</code> field can also be given (if it corresponds to a node under reservation by the user making the request), otherwise, the node from which the request originates will be used. Finally, a <code>labels</code> field can be added to provide arbitrary metadata formatted as JSON.
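The payload described above could be assembled in Python as sketched below. The helper names <code>make_metric</code> and <code>build_push_request</code> are hypothetical, the <code>labels</code> content is an arbitrary example, and the timestamp is passed through unchanged because the accepted timestamp format is not detailed here. The request is only built, not sent.

```python
import json
import urllib.request


def make_metric(metric_id, value, timestamp=None, device_id=None, labels=None):
    """Build one metric entry; optional fields are only added when provided."""
    metric = {"metric_id": metric_id, "value": value}
    if timestamp is not None:
        metric["timestamp"] = timestamp
    if device_id is not None:
        metric["device_id"] = device_id
    if labels is not None:
        metric["labels"] = labels
    return metric


def build_push_request(site, metrics):
    """Prepare (without sending) the POST request carrying a list of metrics."""
    url = f"https://api.grid5000.fr/stable/sites/{site}/metrics"
    body = json.dumps(metrics).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_push_request(
    "lyon",
    [make_metric("my_cores_power_watt", 12.5, labels={"source": "rapl"})],
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# From a reserved node, urllib.request.urlopen(request) would send it.
```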
 
As an example, this little shell script shows how to use this feature from a reserved node. Each second, it will fetch the energy consumed by CPU cores from RAPL using “Linux Perf” tool and push the resulting values to Kwollect:


Kwollect usage on Grid'5000 has a [[Monitoring Using Kwollect|dedicated documentation]].
<syntaxhighlight lang="bash">while true; do
  echo "Fetching power consumption by CPU cores using RAPL"
  V=$(sudo-g5k perf stat -e power/energy-cores/ -x"," sleep 1 2>&1 | grep Joules | cut -d',' -f1)
  echo "Average power during last second: $V W, pushing to Kwollect"
  curl https://api.grid5000.fr/stable/sites/lyon/metrics -X POST -H 'content-type: application/json' -d '{"metric_id": "my_cores_power_watt", "value": '$V'}'
  sleep 1
done</syntaxhighlight>
The <code>my_cores_power_watt</code> metric values will be available as usual from Kwollect, e.g., by requesting at:


The visualization interface can display a &quot;live&quot; view of the energy being consumed by a node or by a group of nodes inside an OAR reservation. It is available at <code>https://api.grid5000.fr/stable/sites/SITE/metrics/dashboard</code> (for instance: [https://api.grid5000.fr/stable/sites/lyon/metrics/dashboard Lyon]). However, for experimental purposes, it may be more useful to access the raw values through the API.
<pre>https://api.grid5000.fr/stable/sites/lyon/metrics?job_id=MY_JOB_ID&metrics=my_cores_power_watt</pre>
<span id="using-alumet-adaptive-lightweight-unified-metrics"></span>
=== Using Alumet (Adaptive, Lightweight, Unified Metrics) ===


The Grid'5000 API is particularly suited to getting data for measurements performed in the past. For instance, to get the power consumed by nodes &quot;taurus-1&quot; and &quot;taurus-3&quot; as reported by wattmeters, at Lyon, between 10:35 and 10:40 on March 21, 2021, use the URL:
[https://alumet.dev/ Alumet] is a versatile monitoring tool that provides a generic measurement pipeline with three steps: poll measurement sources, transform the data, and write the result. It is designed to be able to ingest metrics from various sources without redundant work. Supported sources include RAPL domains, Nvidia’s NVML, and Jetson INA sensors.


https://api.grid5000.fr/stable/sites/lyon/metrics?nodes=taurus-1,taurus-3&metrics=wattmetre_power_watt&start_time=2021-03-21T10:35&end_time=2021-03-21T10:40
Alumet can be configured to monitor internal metrics of a Grid’5000 node and export them to Kwollect using the “push feature” described above.


(beware if using this URL on a command line, quote it to avoid '&' being interpreted as the job control operator to put the command in background)
We are going to present an example of this joint use of Alumet and Kwollect for energy monitoring: Alumet is used to monitor RAPL metrics and exports them to Kwollect. Then, you will be able, by querying Kwollect API, to compare measurements from RAPL to external monitoring devices provided by Wattmetres and BMC.


Note that the time range provided should be of the same order of magnitude as a typical job duration (e.g. no more than a few hours). Otherwise, requests must be serialized.
First, Alumet needs to be installed on the reserved Grid’5000 node. Kwollect support currently requires the latest Git version. To make things easier, a binary is available from inside Grid’5000 at: <code>http://public.lyon.grid5000.fr/~sdelamare/alumet-agent</code>. You can execute the following commands to get Alumet:


Other power consumption metrics may be available on clusters, such as: ''pdu_outlet_power_watt'' and ''bmc_node_power_watt''. See [[Monitoring_Using_Kwollect#Metrics_available_in_Grid.275000|Kwollect documentation]] for full list of metrics.
<syntaxhighlight lang="bash">wget http://public.lyon.grid5000.fr/~sdelamare/alumet-agent
chmod +x alumet-agent</syntaxhighlight>
Then, we will use an <code>alumet-config.toml</code> configuration file to set up the ''rapl'' input plugin and the ''kwollect-output'' plugin with the following content:


<pre class="toml">[plugins.rapl]
poll_interval = &quot;1s&quot;
flush_interval = &quot;5s&quot;
no_perf_events = false

[plugins.kwollect-output]
url = &quot;https://api.grid5000.fr/stable/sites/SITE/metrics&quot;
append_unit_to_metric_name = true
use_unit_display_name = false</pre>
Remember to replace <code>SITE</code> in the <code>url</code> entry with the site where your reserved node is located.

Note that by default, wattmetre values are collected every second by Kwollect (it stores the average of the 50 measurements performed over one second). If you need all 50 measurements per second, you must tell Kwollect to enable the wattmetre's high-frequency monitoring for your job at submission time:

 $ oarsub -I -t monitor='wattmetre_power_watt'


== Raw wattmeters data ==
To access RAPL metrics, a privileged configuration is needed, which can be set up using:


Lyon, Grenoble and Nancy sites provide dedicated devices, called "wattmetres", to monitor energy consumption (more information in [[Grenoble:Wattmetre]], [[Lyon:Wattmetre]], [[Nancy:Wattmetre]]). Nodes may be powered by 1 (e.g. clusters in Lyon, gros cluster in Nancy, troll cluster in Grenoble) or 2 supply units (e.g. yeti cluster in Grenoble). All power supply units are measured individually by the wattmeters, providing the electrical power consumed every 20 milliseconds (50hz), with a precision of 0.1 watts.
<pre>sudo-g5k
sudo sysctl -w kernel.perf_event_paranoid=0</pre>
Finally, run Alumet with:


As seen above, the wattmetres values are provided by Kwollect. In addition, "raw" data collected by wattmetres, including the 50 measurements made each second, is stored in CSV files and available from Grid5000 network to download at: <code>http://wattmetre.lyon.grid5000.fr/data</code>, <code>http://wattmetre.grenoble.grid5000.fr/data</code> and <code>http://wattmetre.nancy.grid5000.fr/data</code>. Downloading raw data files might be more appropriate than using Kwollect to get monitoring values over a large period of time.
<pre>./alumet-agent --config alumet-config.toml --plugins rapl,kwollect-output run</pre>
As specified in the configuration file, this will fetch RAPL metrics every second and push them to Kwollect.


For each wattmetre, a new file is recorded at the beginning of every hour (files from past hours are kept compressed). The file name format is "power.csv.<YYYY-MM-DD>T<HH>", where <YYYY-MM-DD> is the date of the recording and <HH> the hour when it began.
The name of the RAPL metrics used by Alumet is “rapl_consumed_energy_J” (RAPL indeed performs energy measurements and the units used are Joules). You can look at these metrics by querying:


https://api.grid5000.fr/stable/sites/lyon/metrics?nodes=taurus-11&metrics=rapl_consumed_energy_J&start_time=2025-07-03T12:15

Here is the meaning of columns in the CSV files:
* 1st and 2nd columns: Debugging information (these columns will be removed in the future)
* 3rd column: Timestamp when the measure was performed (as number of seconds and nano-seconds since 00:00:00 1970-01-01 UTC).
* 4th column: Must be "OK" if the measurement has been performed correctly; otherwise, the line should be discarded.
* From 5th column to the last: Electrical power consumed for each wattmetre's port. The 5th column shows value for port number 0, the 6th for port number 1, etc. (beware that for yeti cluster in Grenoble, several ports are used to supply a single node). Sometimes the value may be missing for a particular port. It means that wattmetre was not able to compute it correctly.
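The column layout above could be decoded with a small Python helper such as the one below. This is a sketch under explicit assumptions: the files are taken to be comma-separated, the timestamp column is taken to be a single decimal number of seconds, and the sample lines are made up for illustration (they are not real measurements). The name <code>parse_wattmetre_rows</code> is hypothetical.

```python
import csv
import io


def parse_wattmetre_rows(lines):
    """Yield (timestamp, powers) for each valid row.

    powers[i] is the power in watts on port i, or None when the value is missing.
    Assumes comma-separated columns and a timestamp written as a decimal number.
    """
    for row in csv.reader(lines):
        # Columns 1-2: debugging info, column 3: timestamp, column 4: status.
        if len(row) < 5 or row[3] != "OK":
            continue  # Discard rows whose measurement was not performed correctly.
        # Note: converting to float loses the nanosecond precision of the timestamp.
        timestamp = float(row[2])
        powers = [float(v) if v.strip() else None for v in row[4:]]
        yield timestamp, powers


# Made-up sample: one valid row (with a missing value on port 1) and one invalid row.
sample = (
    "dbg1,dbg2,1616322900.020000000,OK,95.2,,101.7\n"
    "dbg1,dbg2,1616322900.040000000,KO,1.0,2.0,3.0\n"
)
rows = list(parse_wattmetre_rows(io.StringIO(sample)))
for timestamp, powers in rows:
    print(timestamp, powers)
```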


The mapping between wattmetres' ports and Grid'5000 nodes is available in the Reference API. For instance, nodes connected to "wattmetre1" at Lyon are described at:
(replace <code>lyon</code>, <code>taurus-11</code> by what is appropriate for you. If <code>start_time</code> is omitted, the metrics from last 5 minutes will be returned).


https://api.grid5000.fr/stable/sites/lyon/pdus/wattmetre1.json
A single metric looks like this:


under "ports" section, and wattmetre and port number connected to "taurus-1" node is available at:
<syntaxhighlight lang="json">{
  "timestamp": "2025-07-03T12:19:59.846832+02:00",
  "device_id": "taurus-11",
  "metric_id": "rapl_consumed_energy_J",
  "value": 0.87835693359375,
  "labels": {
    "domain": "pp0",
    "consumer_id": "",
    "_insert_user": "sdelamare",
    "ressource_id": "0",
    "__insert_time": 1751538003.850289,
    "consumer_kind": "local_machine",
    "ressource_kind": "cpu_package"
  }
}</syntaxhighlight>
Pay attention to <code>labels</code> content. It provides information about the specific [https://hubblo-org.github.io/scaphandre-documentation/explanations/rapl-domains.html RAPL domain] associated to this particular measurement. In this case, the <code>&quot;domain&quot;: &quot;pp0&quot;</code> entry means that this measure is the energy consumed by CPU’s cores and <code>&quot;ressource_id&quot;: &quot;0&quot;</code> means that it only applies to the first CPU of the node.


https://api.grid5000.fr/stable/sites/lyon/clusters/taurus/nodes/taurus-1.json
Finally, to get metrics from both external monitoring devices and RAPL, you can perform a query such as:


under "pdu" section.
https://api.grid5000.fr/stable/sites/lyon/metrics?nodes=taurus-11&metrics=wattmetre_power_watt,bmc_node_power_watt,rapl_consumed_energy_J&start_time=2025-07-03T12:15


== Intel RAPL data ==
But take care when comparing measurements from RAPL with those from Wattmetres or BMC:
* RAPL measurements only concern a specific component of the system (CPU, DRAM, etc.), except for the “PSys” domain, which should encompass the whole system but is loosely specified and only available on some recent hardware.
* The measurements from Wattmetres and BMC are power measurements, representing an average power usage over a period of time (one second by default for Wattmetres). The RAPL measurement reported by Alumet represents the total energy consumed during the period of time between the previous measurement and the current one. Remember that one joule of energy corresponds to a power of one watt sustained for one second.
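To make RAPL energy values comparable with power values, each energy sample can be divided by the time elapsed since the previous sample. The Python sketch below illustrates this conversion; <code>energy_to_power</code> is a hypothetical helper, the sample values are made up, and the input is assumed to contain samples from a single RAPL domain and resource (as given by the <code>labels</code> field), sorted by time.

```python
from datetime import datetime


def energy_to_power(samples):
    """Convert per-interval energy samples to average power.

    samples: list of (iso_timestamp, joules), sorted by time, where joules is
    the energy consumed since the previous sample.
    Returns a list of (iso_timestamp, watts); the first sample is skipped
    because the duration of its interval is unknown.
    """
    powers = []
    for (prev_ts, _), (ts, joules) in zip(samples, samples[1:]):
        elapsed = (datetime.fromisoformat(ts) - datetime.fromisoformat(prev_ts)).total_seconds()
        if elapsed > 0:
            # Average power (W) = energy (J) / duration (s)
            powers.append((ts, joules / elapsed))
    return powers


# Made-up values, for illustration only.
samples = [
    ("2025-07-03T12:19:58+02:00", 1.0),
    ("2025-07-03T12:20:00+02:00", 3.0),
    ("2025-07-03T12:20:01+02:00", 2.5),
]
powers = energy_to_power(samples)
for ts, watts in powers:
    print(ts, watts, "W")
```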


Due to [https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.19.157 security reasons], to be able to read power data using the powercap interface, you need to authorize non-privileged user access in ''/sys/class/powercap/intel-rapl*/*/energy_uj'' on the node.
Note that Alumet provides other modules to get consumption from GPUs using NVML or from NVIDIA Jetson, and for other kinds of metrics (CPU usage…).


For example with sudo-g5k in the std environment:
 $ sudo-g5k chmod 444 /sys/class/powercap/intel-rapl/intel-rapl*/energy_uj

<span id="advanced-case-enable-high-frequency-monitoring-on-wattmetres-and-others-on-demand-metrics"></span>
== Advanced case: Enable high-frequency monitoring on Wattmetres and other on-demand metrics ==


= Power consumption under different workloads =
Some metrics are not monitored by default, or at a lower frequency. Let’s go back to the metrics description in the Reference API:


In the previous section, we have learned how to retrieve energy consumption information. In this part, we will illustrate these monitoring features in an example scenario: we will show how energy consumption evolves under different workloads, and the impact of various energy-related CPU parameters.
<pre>curl https://api.grid5000.fr/stable/sites/lyon/clusters/taurus | jq '.metrics' | less</pre>
<syntaxhighlight lang="json">{
"description": "Power consumption of node reported by Wattmetre, in watt",
"name": "wattmetre_power_watt",
"optional_period": 20,
"period": 1000,
"source": {
  "protocol": "wattmetre"
}
},
{
"description": "Power consumption of node reported by BMC, in watt",
"name": "bmc_node_power_watt",
"period": 5000,
"source": {
  "id": "1.3.6.1.4.1.674.10892.5.4.600.30.1.6.1.{{ 1.3.6.1.4.1.674.10892.5.4.600.30.1.8.1 == System Board Pwr Consumption }}",
  "protocol": "snmp"
}
},
{
  "description": "Voltage of PSU 1 reported by BMC, in volt",
  "labels": {
    "psu": "1"
  },
  "name": "bmc_psu_voltage_volt",
  "optional_period": 5000,
  "period": 0,
  "source": {
    "id": "1.3.6.1.4.1.674.10892.5.4.600.12.1.16.1.1",
    "protocol": "snmp"
  }
},
{
  "description": "Current of PSU 1 reported by BMC, in amp",
  "labels": {
    "psu": "1"
  },
  "name": "bmc_psu_current_amp",
  "optional_period": 5000,
  "period": 0,
  "scale_factor": 0.1,
  "source": {
    "id": "1.3.6.1.4.1.674.10892.5.4.600.30.1.6.1.{{ 1.3.6.1.4.1.674.10892.5.4.600.30.1.8.1 == PS1 Current 1 }}",
    "protocol": "snmp"
  }
},</syntaxhighlight>
The presence of an <code>optional_period</code> field indicates that the associated metric can be activated “on demand”. For the <code>wattmetre_power_watt</code> metric, the <code>period</code> field is <code>1000</code>, meaning that by default a measurement is collected from the Wattmetre every second. However, as the <code>optional_period</code> is <code>20</code>, measurements are performed every 20 milliseconds when the metric is activated “on demand”. Metrics having a <code>period</code> of <code>0</code>, such as <code>bmc_psu_current_amp</code>, are not measured at all by default. They need to be activated to be measured every <code>optional_period</code> milliseconds (i.e., every 5 seconds in the case of the <code>bmc_psu_current_amp</code> metric).
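The rules above can be applied programmatically to the <code>metrics</code> list of a cluster description. The following Python sketch uses hypothetical helper names (<code>on_demand_metrics</code>, <code>disabled_by_default</code>) and the four metric descriptions shown above, reduced to the fields that matter here.

```python
def on_demand_metrics(metrics):
    """Map each on-demand metric name to (default_period_ms, optional_period_ms)."""
    return {
        m["name"]: (m.get("period", 0), m["optional_period"])
        for m in metrics
        if "optional_period" in m
    }


def disabled_by_default(metrics):
    """Names of on-demand metrics that are not measured at all unless activated."""
    return sorted(
        name for name, (period, _) in on_demand_metrics(metrics).items() if period == 0
    )


# Reduced copies of the metric descriptions shown above.
metrics = [
    {"name": "wattmetre_power_watt", "period": 1000, "optional_period": 20},
    {"name": "bmc_node_power_watt", "period": 5000},
    {"name": "bmc_psu_voltage_volt", "period": 0, "optional_period": 5000},
    {"name": "bmc_psu_current_amp", "period": 0, "optional_period": 5000},
]

print(on_demand_metrics(metrics))
print(disabled_by_default(metrics))
```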


== Preliminary remarks ==
Enabling <code>on_demand</code> metrics must be done at reservation time, by providing <code>-t monitor=xxxx</code> option to <code>oarsub</code>. For instance, to enable <code>wattmetre_power_watt</code> high frequency monitoring:


* In the examples given in this part, we will use the Kwollect through the Grid'5000 API.
<pre>oarsub -r now -p taurus -t monitor='wattmetre_power_watt'</pre>
* In this scenario, you need to reserve one node and install some additional tools on it. As you will need root privileges, you can use ''sudo-g5k'' to get sudo rights, or use kadeploy to deploy your own environment. Then, you can install the required tools with the following command:
To enable monitoring of <code>bmc_psu_current_amp</code>:
apt update && apt install linux-cpupower sysbench
* The solutions are given in Python 3 and can easily be copy/pasted into the '''ipython3''' interpreter.
apt install ipython3


== Workload examples ==
<pre>oarsub -r now -p taurus -t monitor='bmc_psu_current_amp'</pre>
The <code>-t monitor</code> option accepts regular expressions matching metrics name. For example, you can enable all “on-demand” metrics using:


We will consider 3 different workloads:
<pre>oarsub -r now -p taurus -t monitor='.*'</pre>
# '''Idle:''' Nothing is done on the machine
If you look at metrics at
# '''CPU Intensive, mono-threaded:''' The machine runs a CPU intensive application on one of its cores. We will use the "sysbench" benchmarking tool to mimic this workload, invoked with: <syntaxhighlight lang="bash">sysbench --test=cpu --cpu-max-prime=50000 --num-threads=1 run</syntaxhighlight>
# '''CPU Intensive, multi-threaded:''' The machine runs a CPU intensive application on all of its cores. We will also use "sysbench", invoked with: <syntaxhighlight lang="bash">NUM_THREADS=$(getconf _NPROCESSORS_ONLN)
sysbench --test=cpu --cpu-max-prime=50000 --num-threads=$NUM_THREADS run</syntaxhighlight>
(<code>$NUM_THREADS</code> is the number of threads to run; we will use the number of cores available on the node we use)


== Impact of CPU parameters ==
<pre>https://api.grid5000.fr/stable/sites/lyon/metrics?job_id=MY_JOB_ID</pre>
you will see more metrics than before, especially from Wattmetres.


Several CPU parameters are available to lower the energy consumed under certain workloads. In particular:
<span id="advanced-case-find-energy-consumption-for-individual-power-supply"></span>
* C-States configuration is the ability for processors and cores to go to energy saver "sleep states" when not being used.
== Advanced case: find energy consumption for individual power supply ==
* P-States policy dynamically adjusts voltage and frequency of cores to fit workload
* Turboboost allows cores to run at higher frequency while they stay under temperature specification limits.


In this example scenario, we will investigate two different C-States configurations: ''partially enabled'' (the maximum authorized sleep state is C1, which is the default on Grid'5000) and ''fully enabled'' (all sleep states are allowed; the deepest sleep state on modern machines is usually C6). To change the maximum allowed sleep state, we will use the cpupower command. For instance, to allow all available sleep states, use:
Most Grid’5000 nodes are powered by several PSUs, and several monitoring devices (one per PSU) are needed to monitor the power used by the entire node. There are some situations where you need to get the metrics associated with each PSU separately and process them yourself. The two most common situations are:


cpupower idle-set -E
* Get power consumption from PDU: when Wattmetres are used, measurements on each Wattmetre are summed-up to provide the ''wattmetre_power_watt'' metric available on the node. But this automatic sum cannot be done for PDUs power values and metrics must be retrieved from each PDU delivering power to node PSUs.
* Get power consumption for nodes sharing the same blade: Some Grid’5000 nodes are physically organized in groups of 2 or 4 that share the same server frame or ''blade''. PSUs belong to the blade and are therefore shared by the nodes grouped in the same blade. It would make no sense to provide a power consumption metric associated with a single node from these shared PSUs.


To disable sleep states that require more than 20 microseconds to wake up from (i.e. disable C-States higher than C1):
In such cases, where monitoring of PSUs is available but no meaningful power consumption metric can be associated with an individual node, it may be still interesting to get metrics associated with each PSU separately and process them yourself.


cpupower idle-set -D 20
For instance, ''chuc'' cluster at Lille is composed of blades with two nodes each. Thus, <code>chuc-1</code> and <code>chuc-2</code> share the same PSUs, as well as <code>chuc-3</code> and <code>chuc-4</code>, etc.


We will also study the impact of turboboost by enabling (which is the default on Grid'5000) or disabling it. To disable turboboost, the following command must be used:
To retrieve the Wattmetres connected to these PSUs, it is possible to query the Reference API for the specific node you are interested in. For example, for <code>chuc-3</code>:


echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
https://api.grid5000.fr/stable/sites/lille/clusters/chuc/nodes/chuc-3


Under the <code>pdu</code> entry, you will find 6 “wattmetre” entries, meaning that <code>chuc-3</code> uses 6 PSUs monitored by Wattmetres. For each entry, the <code>uid</code> and <code>port</code> fields give the identifier of the Wattmetre device and the port that monitor the corresponding PSU.

<syntaxhighlight lang="json">"pdu": [
(...)
{
  "kind": "wattmetre-only",
  "port": 12,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 13,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 14,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 15,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 16,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 17,
  "uid": "wattmetrev3-1"
}
]</syntaxhighlight>
You can check that <code>chuc-4</code> has exactly the same identifiers at:

https://api.grid5000.fr/stable/sites/lille/clusters/chuc/nodes/chuc-4

which means that Wattmetres (and PSUs) are shared between these two nodes.

<!--
== Internal hardware counter ==

Recent systems include hardware counters that report the energy consumed by various components (cores, entire CPU, memory, GPU, etc).

 perf stat -e power/energy-ram/,power/energy-gpu/,power/energy-pkg/ -a -- sleep 10

(TODO : do something useful)
-->


== Scenario implementation ==
Finally, you can retrieve values for all Wattmetres attached to <code>chuc-3</code> and <code>chuc-4</code> PSUs by querying the Wattemetre identifier they are connected to. For instance, the first Wattmetre has a <code>port</code> equals to <code>12</code> and its <code>uid</code> is <code>wattmetrev3-1</code>. This means that the corresponding Wattmetre identifier is <code>wattmetrev3-1-port12</code>.


We propose to study following metrics:
It is thus possible to retrieve the power consumption of every PSUs of <code>chuc-3</code> and <code>chuc-4</code> blade using a query that looks like:
* Average electrical power required to run workload
* Time needed to run CPU workload
* The ops per watt value, i.e. the average number of operation per second and per Watt, a metric reflecting the "energy efficiency" of machines


The average electrical power required to run the workload is the amount of electrical energy spent during its execution divided by the execution time. Its value can be approximated as the average of the power values which have been monitored during execution.
https://api.grid5000.fr/stable/sites/lille/metrics?devices=wattmetrev3-1-port12,wattmetrev3-1-port13,wattmetrev3-1-port14,wattmetrev3-1-port15,wattmetrev3-1-port16,wattmetrev3-1-port17


Using your favorite programming language, write a function that queries the Grid'5000 API to return the average power used by a Grid'5000 node between two dates (as Unix timestamps).
The <code>devices</code> parameter has the same effect as <code>nodes</code> parameter seen before.


<span id="practical-study"></span>
== Practical study ==

We now invite you to do a practical exercise to apply what you’ve learned. It consists of studying the energy cost of a matrix multiplication performed with PyTorch.

Reserve a node on Grid’5000 and execute the following commands to set up your environment:

<pre>python -m venv monitoring_venv
source monitoring_venv/bin/activate
module load cuda
pip3 install torch requests matplotlib
export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt</pre>

Copy and paste this code snippet into a file named <code>monitoring_tutorial.py</code>.

<span class="mw-customtoggle-1" style="color:#0000ff">Click to expand!</span>
<div class="mw-collapsible mw-collapsed" id="mw-customcollapsible-1">

<syntaxhighlight lang="python">
import torch
import time
import socket
import requests
import matplotlib.pyplot as plt


def main():
    results = {}
    num_threads = [1, 2, 4, 8, 16, 32]

    for c in num_threads:
        # Run the workload with c threads and record when it ran
        start_time, end_time, duration, _ = perform_matrix_multiplication(num_threads=c)

        results[c] = {}
        results[c]["duration"] = duration

        # Energy computed from the Wattmetre power values
        values = get_metrics_from_kwollect(start_time=start_time, end_time=end_time, metric="wattmetre_power_watt")
        results[c]["energy_wattmetre"] = get_energy_from_metrics(values, duration)

        # Energy computed from the BMC power values
        values = get_metrics_from_kwollect(start_time=start_time, end_time=end_time, metric="bmc_node_power_watt")
        results[c]["energy_bmc"] = get_energy_from_metrics(values, duration)

    plot_results(results, "monitoring_tutorial.png")


def perform_matrix_multiplication(num_threads=None):
    if num_threads is not None:
        num_threads_init = torch.get_num_threads()
        torch.set_num_threads(num_threads)
    N = 2048
    A = torch.randn(N, N, device="cpu")
    B = torch.randn(N, N, device="cpu")
    count = 0
    start_time = time.time()
    # Repeat the multiplication for 10 seconds
    while time.time() - start_time < 10:
        C = A @ B
        count += 1
    end_time = time.time()
    # Average duration of a single multiplication
    duration = (end_time - start_time)/count
    print(f"Matrix multiplication duration: {duration} seconds ({count} multiplications performed)")
    if num_threads is not None:
        # Restore the initial number of threads
        torch.set_num_threads(num_threads_init)
    return start_time, end_time, duration, count


def plot_results(results, outfile):
    num_threads = sorted(results.keys())
    fig, ax1 = plt.subplots(figsize=(8, 8))
    ax1.set_title("Matrix Multiplication Duration & Energy")
    ax1.set_xlabel("Number of threads used")

    # Left axis: duration, as bars
    ax1.set_ylabel("Duration (seconds)", color="orange")
    ax1.bar(num_threads, [results[c]["duration"] for c in num_threads], color="orange")

    # Right axis: energy, as lines
    ax2 = ax1.twinx()
    ax2.set_ylabel("Energy (joules)")
    ax2.plot(num_threads, [results[c]["energy_wattmetre"] for c in num_threads], "+-", color="green", label="wattmetre")
    ax2.plot(num_threads, [results[c]["energy_bmc"] for c in num_threads], "+-", color="blue", label="BMC")
    ax2.legend()

    plt.savefig(outfile)


def get_metrics_from_kwollect(start_time, end_time, metric, site=None, node=None):
    # By default, use the node and site where the script is running
    if node is None:
        node = socket.getfqdn().split(".")[0]
    if site is None:
        site = socket.getfqdn().split(".")[1]

    kwollect_url = f"https://api.grid5000.fr/stable/sites/{site}/..." #FIXME
    print(f"Requesting Kwollect at {kwollect_url}")
    metrics = requests.get(kwollect_url).json()

    return metrics


def get_energy_from_metrics(power_metrics, duration):
    average_power = sum(-1)/len([-1]) #FIXME
    energy = average_power * 0 #FIXME
    return energy


if __name__ == "__main__":
    main()

</syntaxhighlight>
</div>
<br/>


The goal of the script is to measure the duration and the energy consumed when performing matrix multiplications with different numbers of threads. The script is composed as follows:

* The <code>main()</code> function implements the script logic: loop over the thread counts, perform the matrix multiplication, get metrics from Kwollect and finally plot the results to the “monitoring_tutorial.png” file.
* The <code>perform_matrix_multiplication(num_threads)</code> function implements the matrix multiplication.
* The <code>plot_results(results, outfile)</code> function implements the plotting of the results.
* The <code>get_metrics_from_kwollect(start_time, end_time, metric, site=None, node=None)</code> function fetches the values of <code>metric</code> between <code>start_time</code> and <code>end_time</code> (if the <code>node</code> and <code>site</code> parameters are not provided, they are derived from the machine where the script is executed).
* The <code>get_energy_from_metrics(power_metrics, duration)</code> function computes the energy consumed during <code>duration</code> from the <code>power_metrics</code> received from Kwollect.

The latter two functions are incomplete. You must replace the lines containing “FIXME” comments with the appropriate code to make the functions work as expected.

Once done, you can transfer the “monitoring_tutorial.png” file to your local machine to visualize it. You should be able to answer questions such as:

* How many cores should you use to get the fastest matrix multiplication?
* Is it more energy efficient to use fewer cores to consume less energy?
* …

=== Solution ===

Below are the completed functions that implement this exercise:

<span class="mw-customtoggle-2" style="color:#0000ff">Click to expand!</span>
<div class="mw-collapsible mw-collapsed" id="mw-customcollapsible-2">

<syntaxhighlight lang="python">

def get_metrics_from_kwollect(start_time, end_time, metric, site=None, node=None):
    if node is None:
        node = socket.getfqdn().split(".")[0]
    if site is None:
        site = socket.getfqdn().split(".")[1]

    kwollect_url = f"https://api.grid5000.fr/stable/sites/{site}/metrics?nodes={node}&start_time={start_time}&end_time={end_time}&metrics={metric}"
    print(f"Requesting Kwollect at {kwollect_url}")
    metrics = requests.get(kwollect_url).json()

    return metrics


def get_energy_from_metrics(power_metrics, duration):
    average_power = sum(x["value"] for x in power_metrics)/len(power_metrics)
    energy = average_power * duration
    return energy

</syntaxhighlight>
</div>
<br/>

A <code>monitoring_tutorial.png</code> file should be generated and look like this (on a <code>taurus</code> node):

<span class="mw-customtoggle-4" style="color:#0000ff">Click to expand!</span>
<div class="mw-collapsible mw-collapsed" id="mw-customcollapsible-4">

<center>
[[File:Monitoring_tutorial.png]]
</center>

</div>
<br/>


If you want to go further, you can enhance the script to implement the following features (in increasing order of difficulty):

* Reserve a node with a GPU and add a case where the matrix multiplication is performed on a GPU (you can use a special “gpu” value in the num_threads list).
* Using Alumet, add the energy consumption measured by RAPL (take care of the RAPL domain returned in metrics; for instance, you could use only “PSys”, if available, to get an approximation of the whole node consumption that can be compared to other values).
* Using Alumet, add GPU consumption using NVML.

<span id="conclusion"></span>
== Conclusion ==
 
The tutorial is now finished. You should have learned most of what you need to know to monitor electrical energy consumption in your Grid’5000 experiments.
 
If you need additional information about monitoring under Grid’5000 (not specific to power), see the documentation at [[Monitoring_Using_Kwollect]]. Feel free to share suggestions or report any problem at mailto:users@lists.grid5000.fr.

Latest revision as of 17:04, 27 August 2025

Note

This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team.


Estimated duration: 90 minutes

Introduction

In this tutorial, you will learn how to monitor electrical energy consumption while experimenting on server-class machines under Grid’5000.

The tutorial will be organized into the following sections:

  • Monitoring devices available
  • Find monitoring features available on a node
  • Getting metrics values from Kwollect
  • Monitoring of internal metrics (e.g. RAPL)
  • Advanced case: enable high-frequency monitoring on Wattmetres and other on-demand metrics
  • Advanced case: find energy consumption for individual power supply
  • Practical study

The first four sections explain the basics of energy monitoring under Grid’5000 and should not be skipped. The two “Advanced” sections are optional. The “Practical study” is an exercise to put into practice what you have learned.

Monitoring devices available

Grid’5000 provides access to various monitoring devices that measure the electrical power consumed by nodes, such as:

  • “Wattmetres”, which are specialized devices located between a node’s power supply and its power source, able to perform up to 50 measurements per second with a high relative precision.

    The current generation of Wattmetres installed on the infrastructure is made by ADECWatts company.

    (Wattmetres are located on the left side of the rack)

  • PDUs (Power Distribution Units), which are the most common way to deliver electrical power to the server-class nodes used in Grid’5000, may also export energy monitoring metrics. However, they are less precise than Wattmetres.

    (PDU on the left is used to power servers)

  • BMCs (Baseboard Management Controllers) are control units placed inside a server chassis that remain independent of the rest of the system. They also export energy monitoring metrics, but are even less precise than PDUs, and thus than Wattmetres (in particular, they are located downstream from the power supply and therefore cannot account for its energy losses).

In addition, individual components inside a node may provide energy monitoring. This is typically the case for CPU and GPU, which provide energy consumed by various internal parts by exposing internal hardware counters accessible through a dedicated interface, such as RAPL for Intel & AMD CPUs and NVML for NVIDIA GPUs. See these references for more information: 1 2 3.

⚠️ Warning ⚠️: Monitoring metrics are not always reliable: the monitoring device may fail and report wrong values; PDU and BMC often report inaccurate values (e.g., updated at a low frequency, heavily smoothed, using a moving average, etc.). It is strongly recommended to cross-check your measurements by using different monitoring devices (e.g., both Wattmetres and BMC) to ensure confidence in results.

Find monitoring features available on a node

Monitoring, like everything else on Grid’5000, is documented in the Grid’5000 Reference API. Let’s see how to query this API to discover what monitoring devices are available for a particular node.

Monitoring capabilities are described in terms of metrics available on a cluster. To get the list of all metrics available for a cluster, the API can be queried at this address:

https://api.grid5000.fr/stable/sites/<SITE>/clusters/<CLUSTER>

For instance, if you are interested in the metrics available on the taurus-12 node at Lyon, you can query the following URL:

https://api.grid5000.fr/stable/sites/lyon/clusters/taurus

Metrics are described under the metrics entry of the returned JSON document. To get a better view of the metrics list, you can use a command such as:

curl https://api.grid5000.fr/stable/sites/lyon/clusters/taurus | jq '.metrics' | less

As you can see, many metrics are available, whether related to energy monitoring or not. More information about general monitoring in Grid’5000, including the full list of available metrics, is available in the Monitoring Using Kwollect documentation.

We will focus on two metrics for this tutorial: wattmetre_power_watt and bmc_node_power_watt. The Reference API description of these metrics looks like this:

{
 "description": "Power consumption of node reported by Wattmetre, in watt",
 "name": "wattmetre_power_watt",
 "optional_period": 20,
 "period": 1000,
 "source": {
   "protocol": "wattmetre"
 }
},
{
 "description": "Power consumption of node reported by BMC, in watt",
 "name": "bmc_node_power_watt",
 "period": 5000,
 "source": {
   "id": "1.3.6.1.4.1.674.10892.5.4.600.30.1.6.1.{{ 1.3.6.1.4.1.674.10892.5.4.600.30.1.8.1 == System Board Pwr Consumption }}",
   "protocol": "snmp"
 }
}
  • The description field explains the nature of the monitoring devices, as well as the physical unit of the measure.
  • The name field is the metric’s identifier used throughout the monitoring system.
  • The period field describes the interval, in milliseconds, between two consecutive measurements performed on the monitoring device. (Note that this does not necessarily correspond to the interval at which the device itself updates its internal value. That interval can be larger, especially on BMC, as mentioned in the warning above.)
  • The optional_period field, only available on the wattmetre_power_watt metric, indicates that this device can be configured to perform more frequent measurements, every 20 ms (i.e., at 50 Hz), on the user’s demand (more on that later).
  • The source field indicates the protocol used to query the monitoring device and should not be of much interest to you.
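As a minimal sketch (not part of the original tutorial), the fields described above can be read programmatically. The helper name summarize_metrics and the SAMPLE_CLUSTER dictionary are illustrative assumptions; on a real node, the dictionary would be the JSON document returned by the cluster URL shown earlier.

```python
# Excerpt of the "metrics" entry of a cluster description, copied from the
# example above (the "source" fields are omitted for brevity).
SAMPLE_CLUSTER = {
    "metrics": [
        {
            "description": "Power consumption of node reported by Wattmetre, in watt",
            "name": "wattmetre_power_watt",
            "optional_period": 20,
            "period": 1000,
        },
        {
            "description": "Power consumption of node reported by BMC, in watt",
            "name": "bmc_node_power_watt",
            "period": 5000,
        },
    ]
}


def summarize_metrics(cluster_description, name_filter="power"):
    """Return (name, period_ms, optional_period_ms) for matching metrics.

    optional_period_ms is None when the metric has no "optional_period" field.
    """
    summary = []
    for metric in cluster_description.get("metrics", []):
        if name_filter in metric["name"]:
            summary.append(
                (metric["name"], metric["period"], metric.get("optional_period"))
            )
    return summary


for name, period, optional in summarize_metrics(SAMPLE_CLUSTER):
    print(f"{name}: every {period} ms by default, on-demand period: {optional}")
```

Keeping the function independent of the HTTP request makes it easy to try on a saved JSON document before running it against the API.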

Getting metrics values from Kwollect

Once you have identified the nodes and the metrics you are interested in, you can simply query Kwollect, the monitoring system used in Grid’5000, to retrieve metric values over time. For instance, to get the wattmetre_power_watt and bmc_node_power_watt values for taurus-4 and taurus-5 between 10:00 and 10:10 on 1 May 2025, you can query the API at:

https://api.grid5000.fr/stable/sites/lyon/metrics?nodes=taurus-4,taurus-5&metrics=wattmetre_power_watt,bmc_node_power_watt&start_time=2025-05-01T10:00&end_time=2025-05-01T10:10

This will return a JSON document like:

[{"timestamp":"2025-05-01T10:00:00+02:00","device_id":"taurus-5","metric_id":"wattmetre_power_watt","value":6.052631578947369,"labels":{"_device_orig": ["wattmetre1-port4"]}},
 {"timestamp":"2025-05-01T10:00:00+02:00","device_id":"taurus-4","metric_id":"wattmetre_power_watt","value":6.332432432432432,"labels":{"_device_orig": ["wattmetre1-port3"]}},
 {"timestamp":"2025-05-01T10:00:00.654002+02:00","device_id":"taurus-5","metric_id":"bmc_node_power_watt","value":0,"labels":{}},
 {"timestamp":"2025-05-01T10:00:00.654322+02:00","device_id":"taurus-4","metric_id":"bmc_node_power_watt","value":0,"labels":{}},
 {"timestamp":"2025-05-01T10:00:01+02:00","device_id":"taurus-4","metric_id":"wattmetre_power_watt","value":6.070731707317074,"labels":{"_device_orig": ["wattmetre1-port3"]}},
 {"timestamp":"2025-05-01T10:00:01+02:00","device_id":"taurus-5","metric_id":"wattmetre_power_watt","value":6.239024390243902,"labels":{"_device_orig": ["wattmetre1-port4"]}},
...

where each line corresponds to a single measurement.
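The query and the returned document can be handled with a few lines of Python. The sketch below is illustrative only: the function names are assumptions, and the values in SAMPLE_MEASUREMENTS are made up and simplified; they only follow the structure of the JSON document shown above.

```python
from collections import defaultdict
from urllib.parse import urlencode


def kwollect_url(site, nodes, metrics, start_time, end_time):
    """Build a Kwollect query URL from the parameters described above."""
    query = urlencode(
        {
            "nodes": ",".join(nodes),
            "metrics": ",".join(metrics),
            "start_time": start_time,
            "end_time": end_time,
        },
        safe=",:",  # keep commas and colons readable, as in the example URL
    )
    return f"https://api.grid5000.fr/stable/sites/{site}/metrics?{query}"


def average_by_device_and_metric(measurements):
    """Average the "value" field for each (device_id, metric_id) pair."""
    values = defaultdict(list)
    for m in measurements:
        values[(m["device_id"], m["metric_id"])].append(m["value"])
    return {key: sum(v) / len(v) for key, v in values.items()}


# Made-up measurements with the same fields as a Kwollect response.
SAMPLE_MEASUREMENTS = [
    {"timestamp": "2025-05-01T10:00:00+02:00", "device_id": "taurus-4",
     "metric_id": "wattmetre_power_watt", "value": 6.0, "labels": {}},
    {"timestamp": "2025-05-01T10:00:01+02:00", "device_id": "taurus-4",
     "metric_id": "wattmetre_power_watt", "value": 7.0, "labels": {}},
    {"timestamp": "2025-05-01T10:00:00+02:00", "device_id": "taurus-5",
     "metric_id": "wattmetre_power_watt", "value": 5.0, "labels": {}},
]

print(kwollect_url("lyon", ["taurus-4", "taurus-5"], ["wattmetre_power_watt"],
                   "2025-05-01T10:00", "2025-05-01T10:10"))
print(average_by_device_and_metric(SAMPLE_MEASUREMENTS))
```

On a Grid’5000 node, the list passed to average_by_device_and_metric would be the result of requests.get(url).json().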

It is also possible to get all metrics associated with a Grid’5000 reservation by providing the OAR job number:

https://api.grid5000.fr/stable/sites/lyon/metrics?job_id=1899135

This will return all metrics from all nodes belonging to the reservation, but you can filter by using the nodes and metrics parameters.

A graphical dashboard is also available to visualize metrics at:

https://api.grid5000.fr/stable/sites/lyon/metrics/dashboard

Replace lyon with the site you need.

It can be noted that metrics stored in Kwollect are kept indefinitely.

Monitoring of internal metrics

We call “internal metrics” the metrics available from inside the node operating system, i.e., those that you can fetch yourself as a user, unlike metrics fetched from external devices provided by the infrastructure, such as Wattmetres. This kind of metrics includes RAPL for CPU energy consumption and NVML for GPU consumption, but also any kind of metrics available from the system, such as CPU or IO usage.

As many tools are available to get internal metrics, we assume that you will want to use the one that best fits your needs. We will explain a generic way to push metrics to Kwollect that can be adapted to whatever tool is used. We will also introduce Alumet, a convenient tool to fetch internal metrics which has a “native” Kwollect export feature.

In any case, you will be able to access all your metrics, both internal and external from devices such as Wattmetres and BMC, through the same API using Kwollect.

Pushing metrics to Kwollect

It is possible to push metrics to Kwollect, from inside a node, by performing a POST request to the following API endpoint:

https://api.grid5000.fr/stable/sites/SITE/metrics

The request must include the list of metrics to be inserted, formatted as JSON, like:

[{"metric_id": "METRIC_NAME1", "value": VALUE1}, {"metric_id": "METRIC_NAME2", "value": VALUE2}]

For each metric, a timestamp value can optionally be provided (otherwise, the current time will be used as the metric’s timestamp). The device_id field can also be given (if it corresponds to a node under reservation by the user making the request), otherwise, the node from which the request originates will be used. Finally, a labels field can be added to provide arbitrary metadata formatted as JSON.
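As an illustration, the payload described above could be assembled in Python as follows. This is a sketch under assumptions: build_metric is a hypothetical helper, the "tool" label is arbitrary, and the accepted timestamp format is not specified here, so the helper passes it through unchanged.

```python
import json


def build_metric(metric_id, value, timestamp=None, device_id=None, labels=None):
    """Build one metric entry; optional fields are only included when given."""
    entry = {"metric_id": metric_id, "value": value}
    if timestamp is not None:
        entry["timestamp"] = timestamp
    if device_id is not None:
        entry["device_id"] = device_id
    if labels is not None:
        entry["labels"] = labels
    return entry


# JSON list ready to be sent as the body of the POST request.
payload = json.dumps(
    [
        build_metric("my_cores_power_watt", 12.5),
        build_metric("my_cores_power_watt", 13.0, labels={"tool": "perf"}),
    ]
)
print(payload)

# From a reserved node, the payload could then be sent with, for example:
#   requests.post("https://api.grid5000.fr/stable/sites/SITE/metrics",
#                 data=payload, headers={"content-type": "application/json"})
```

Leaving out device_id and timestamp relies on the defaults described above: the requesting node and the current time.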

As an example, this short shell script shows how to use this feature from a reserved node. It repeatedly fetches the energy consumed by CPU cores over one second from RAPL, using the “Linux Perf” tool, and pushes the resulting value to Kwollect:

while true; do
  echo "Fetching power consumption by CPU cores using RAPL"
  V=$(sudo-g5k perf stat -e power/energy-cores/ -x"," sleep 1 2>&1 | grep Joules | cut -d',' -f1)
  echo "Average power during last second: $V W, pushing to Kwollect"
  curl https://api.grid5000.fr/stable/sites/lyon/metrics -X POST -H 'content-type: application/json' -d '{"metric_id": "my_cores_power_watt", "value": '$V'}'
  sleep 1
done

The my_cores_power_watt metric values will be available as usual from Kwollect, e.g., by requesting at:

https://api.grid5000.fr/stable/sites/lyon/metrics?job_id=MY_JOB_ID&metrics=my_cores_power_watt

Using Alumet (Adaptive, Lightweight, Unified Metrics)

Alumet is a versatile monitoring tool that provides a generic measurement pipeline with three steps: poll measurement sources, transform the data, and write the result. It is designed to be able to ingest metrics from various sources without redundant work. Supported sources include RAPL domains, Nvidia’s NVML, and Jetson INA sensors.

Alumet can be configured to monitor internal metrics of a Grid’5000 node and export them to Kwollect using the “push feature” described above.

We are going to present an example of this joint use of Alumet and Kwollect for energy monitoring: Alumet is used to monitor RAPL metrics and export them to Kwollect. Then, by querying the Kwollect API, you will be able to compare RAPL measurements with those from external monitoring devices such as Wattmetres and BMC.

First, Alumet needs to be installed on the reserved Grid’5000 node. Kwollect support currently requires the latest Git version. To make things easier, a binary is available under Grid’5000 at: http://public.lyon.grid5000.fr/~sdelamare/alumet-agent. You can execute the following commands to get Alumet:

wget http://public.lyon.grid5000.fr/~sdelamare/alumet-agent
chmod +x alumet-agent

Then, we will use an alumet-config.toml configuration file to set up the rapl input plugin and the kwollect-output plugin, with the following content:

[plugins.rapl]
poll_interval = "1s"
flush_interval = "5s"
no_perf_events = false

[plugins.kwollect-output]
url = "https://api.grid5000.fr/stable/sites/SITE/metrics"
append_unit_to_metric_name = true
use_unit_display_name = false

Remember to replace SITE in the url entry with the site where your reserved node is located.

Accessing RAPL metrics requires a privileged configuration, which must be set up using:

sudo-g5k
sudo sysctl -w kernel.perf_event_paranoid=0

Finally, run Alumet with:

./alumet-agent --config alumet-config.toml --plugins rapl,kwollect-output run

As specified in the configuration file, this will fetch RAPL metrics every second and push them to Kwollect.

The name of the RAPL metrics used by Alumet is “rapl_consumed_energy_J” (RAPL indeed performs energy measurements and the units used are Joules). You can look at these metrics by querying:

https://api.grid5000.fr/stable/sites/lyon/metrics?nodes=taurus-11&metrics=rapl_consumed_energy_J&start_time=2025-07-03T12:15

(Replace lyon and taurus-11 with the values appropriate for you. If start_time is omitted, the metrics from the last 5 minutes are returned.)

A single metric looks like this:

{
  "timestamp": "2025-07-03T12:19:59.846832+02:00",
  "device_id": "taurus-11",
  "metric_id": "rapl_consumed_energy_J",
  "value": 0.87835693359375,
  "labels": {
    "domain": "pp0",
    "consumer_id": "",
    "_insert_user": "sdelamare",
    "ressource_id": "0",
    "__insert_time": 1751538003.850289,
    "consumer_kind": "local_machine",
    "ressource_kind": "cpu_package"
  }
}

Pay attention to the content of labels. It provides information about the specific RAPL domain associated with this particular measurement. In this case, the "domain": "pp0" entry means that this measurement is the energy consumed by the CPU’s cores, and "ressource_id": "0" means that it only applies to the first CPU of the node.

Finally, to get metrics from both external monitoring devices and RAPL, you can perform a query such as:

https://api.grid5000.fr/stable/sites/lyon/metrics?nodes=taurus-11&metrics=wattmetre_power_watt,bmc_node_power_watt,rapl_consumed_energy_J&start_time=2025-07-03T12:15

But take care when comparing measurements from RAPL with those from Wattmetres or BMC:

  • RAPL measurements only concern a specific component of the system (CPU, DRAM, etc.), except for the “PSys” domain, which should encompass the whole system but is loosely specified and only available on some recent hardware.
  • The measurements from Wattmetres and BMC are power measurements, representing an average power usage over a period of time (one second by default for Wattmetres). The RAPL measurement reported by Alumet represents the total energy consumed during the period of time between the previous measurement and the current one. Remember that one Joule of energy corresponds to a power usage of one Watt during one second.
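To compare RAPL values with Wattmetre or BMC values, the energy reported for each interval can be converted into an average power by dividing it by the interval length. The sketch below illustrates this under assumptions: the function name is hypothetical, the sample values are made up, and the input is expected to contain measurements of a single RAPL domain and resource (filter on the labels first).

```python
from datetime import datetime


def rapl_energy_to_power(measurements):
    """Convert per-interval RAPL energy (J) into average power (W).

    Each measurement holds the energy consumed since the previous one, so the
    first measurement is skipped: the length of its interval is unknown.
    Returns a list of (timestamp, watts) tuples.
    """
    ordered = sorted(
        measurements, key=lambda m: datetime.fromisoformat(m["timestamp"])
    )
    powers = []
    for previous, current in zip(ordered, ordered[1:]):
        t0 = datetime.fromisoformat(previous["timestamp"])
        t1 = datetime.fromisoformat(current["timestamp"])
        interval = (t1 - t0).total_seconds()
        if interval > 0:
            powers.append((current["timestamp"], current["value"] / interval))
    return powers


# Made-up values following the structure of the Kwollect response.
SAMPLE_RAPL = [
    {"timestamp": "2025-07-03T12:20:00+02:00", "value": 1.0},
    {"timestamp": "2025-07-03T12:20:01+02:00", "value": 2.0},
    {"timestamp": "2025-07-03T12:20:03+02:00", "value": 6.0},
]

for timestamp, watts in rapl_energy_to_power(SAMPLE_RAPL):
    print(f"{timestamp}: {watts} W")
```

With a poll interval of one second, as in the Alumet configuration above, the energy in Joules and the average power in Watts are numerically close, but dividing by the actual interval remains correct if some measurements are delayed or missing.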

Note that Alumet provides other modules to get the consumption of GPUs using NVML or of NVIDIA Jetson boards, as well as other kinds of metrics (CPU usage, etc.).

Advanced case: enable high-frequency monitoring on Wattmetres and other on-demand metrics

Some metrics are not monitored by default, or at a lower frequency. Let’s go back to the metrics description in the Reference API:

curl https://api.grid5000.fr/stable/sites/lyon/clusters/taurus | jq '.metrics' | less
{
 "description": "Power consumption of node reported by Wattmetre, in watt",
 "name": "wattmetre_power_watt",
 "optional_period": 20,
 "period": 1000,
 "source": {
   "protocol": "wattmetre"
 }
},
{
 "description": "Power consumption of node reported by BMC, in watt",
 "name": "bmc_node_power_watt",
 "period": 5000,
 "source": {
   "id": "1.3.6.1.4.1.674.10892.5.4.600.30.1.6.1.{{ 1.3.6.1.4.1.674.10892.5.4.600.30.1.8.1 == System Board Pwr Consumption }}",
   "protocol": "snmp"
 }
},
{
  "description": "Voltage of PSU 1 reported by BMC, in volt",
  "labels": {
    "psu": "1"
  },
  "name": "bmc_psu_voltage_volt",
  "optional_period": 5000,
  "period": 0,
  "source": {
    "id": "1.3.6.1.4.1.674.10892.5.4.600.12.1.16.1.1",
    "protocol": "snmp"
  }
},
{
  "description": "Current of PSU 1 reported by BMC, in amp",
  "labels": {
    "psu": "1"
  },
  "name": "bmc_psu_current_amp",
  "optional_period": 5000,
  "period": 0,
  "scale_factor": 0.1,
  "source": {
    "id": "1.3.6.1.4.1.674.10892.5.4.600.30.1.6.1.{{ 1.3.6.1.4.1.674.10892.5.4.600.30.1.8.1 == PS1 Current 1 }}",
    "protocol": "snmp"
  }
},

The presence of an optional_period field indicates that the associated metric can be activated “on demand”. For the wattmetre_power_watt metric, the period field is 1000, meaning that by default a Wattmetre measurement is collected every second. However, as its optional_period is 20, measurements are collected every 20 milliseconds when the metric is activated on demand. Metrics with a period of 0, such as bmc_psu_current_amp, are not measured at all by default: they must be activated to be measured every optional_period milliseconds (i.e., every 5 seconds for bmc_psu_current_amp).
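
The rules above can be sketched as a small helper that summarizes how a metric is sampled. This is only an illustration: describe_sampling is a hypothetical function name, and the metrics list is a hand-copied subset of the taurus description shown above, reduced to the fields used here.

```python
def describe_sampling(metric):
    """Summarize default and on-demand sampling for one metric description."""
    period = metric.get("period", 0)
    optional = metric.get("optional_period")
    # A period of 0 means that the metric is not collected by default
    default = f"every {period} ms by default" if period > 0 else "disabled by default"
    if optional is None:
        return f"{metric['name']}: {default}, no on-demand mode"
    return f"{metric['name']}: {default}, every {optional} ms when enabled on demand"


# Subset of the metric descriptions returned by the Reference API for taurus
metrics = [
    {"name": "wattmetre_power_watt", "period": 1000, "optional_period": 20},
    {"name": "bmc_node_power_watt", "period": 5000},
    {"name": "bmc_psu_current_amp", "period": 0, "optional_period": 5000},
]
for metric in metrics:
    print(describe_sampling(metric))
```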

Enabling on-demand metrics must be done at reservation time, by passing the -t monitor=xxxx option to oarsub. For instance, to enable wattmetre_power_watt high-frequency monitoring:

oarsub -r now -p taurus -t monitor='wattmetre_power_watt'

To enable monitoring of bmc_psu_current_amp:

oarsub -r now -p taurus -t monitor='bmc_psu_current_amp'

The -t monitor option accepts regular expressions matching metric names. For example, you can enable all “on-demand” metrics using:

oarsub -r now -p taurus -t monitor='.*'
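
Before reserving, it can be handy to preview which on-demand metrics a given pattern would select. The sketch below does this locally with Python's re module. It assumes that the pattern must match the whole metric name (re.fullmatch); the exact matching rules applied by Grid’5000 are not described here and may differ. The function name on_demand_matches is hypothetical, and the metrics list is a subset of the taurus description shown earlier.

```python
import re

# Subset of the taurus metric descriptions (only the fields used here)
metrics = [
    {"name": "wattmetre_power_watt", "period": 1000, "optional_period": 20},
    {"name": "bmc_node_power_watt", "period": 5000},
    {"name": "bmc_psu_voltage_volt", "period": 0, "optional_period": 5000},
    {"name": "bmc_psu_current_amp", "period": 0, "optional_period": 5000},
]


def on_demand_matches(pattern, metrics):
    """Return names of on-demand metrics whose full name matches `pattern`.

    Assumption: whole-name matching, which may not reflect the real behaviour.
    """
    regex = re.compile(pattern)
    return [
        m["name"]
        for m in metrics
        if "optional_period" in m and regex.fullmatch(m["name"])
    ]


print(on_demand_matches("bmc_psu_.*", metrics))
print(on_demand_matches(".*", metrics))
```

Note that bmc_node_power_watt is never listed because it has no optional_period field: it is always collected at its default period.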

If you look at metrics at

https://api.grid5000.fr/stable/sites/lyon/metrics?job_id=MY_JOB_ID

you will see more metrics than before, especially from Wattmetres.
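
To check the effect of the on-demand activation, you can count how many samples were returned for each metric. The sketch below works on a list of records such as the one obtained with requests.get(url).json(). It assumes that each record carries metric_id and device_id fields; the records shown are hypothetical values, not real measurements, and samples_per_metric is a hypothetical helper name.

```python
from collections import Counter


def samples_per_metric(records):
    """Count the samples returned for each (metric, device) pair."""
    return Counter((r["metric_id"], r["device_id"]) for r in records)


# Hypothetical excerpt of a Kwollect response.
# With a real job, it could be obtained with something like:
#   records = requests.get(
#       "https://api.grid5000.fr/stable/sites/lyon/metrics?job_id=MY_JOB_ID"
#   ).json()
records = [
    {"metric_id": "wattmetre_power_watt", "device_id": "taurus-11", "value": 131.2},
    {"metric_id": "wattmetre_power_watt", "device_id": "taurus-11", "value": 130.8},
    {"metric_id": "wattmetre_power_watt", "device_id": "taurus-11", "value": 132.1},
    {"metric_id": "bmc_node_power_watt", "device_id": "taurus-11", "value": 126.0},
]
for (metric, device), count in samples_per_metric(records).most_common():
    print(f"{metric} on {device}: {count} samples")
```

With high-frequency monitoring enabled, the Wattmetre count over a given time window should be about 50 times higher than with the default one-second period.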

Advanced case: find energy consumption for individual power supplies

Most Grid’5000 nodes are powered by several PSUs, and several monitoring devices (one per PSU) are needed to monitor the power used by the entire node. There are some situations where you need to get the metrics associated with each PSU separately and process them yourself. The two most common situations are:

  • Get power consumption from PDUs: when Wattmetres are used, measurements from each Wattmetre are summed to provide the wattmetre_power_watt metric available on the node. But this automatic sum cannot be done for PDU power values, and metrics must be retrieved from each PDU delivering power to the node’s PSUs.
  • Get power consumption for nodes sharing the same blade: some Grid’5000 nodes are physically organized in groups of 2 or 4 that share the same server frame or blade. PSUs belong to the blade and are therefore shared by the nodes grouped in the same blade. It would make no sense to derive a power consumption metric for a single node from these shared PSUs.

In such cases, where monitoring of PSUs is available but no meaningful power consumption metric can be associated with an individual node, the per-PSU metrics are the only power data available, and it is up to you to combine them.

For instance, the chuc cluster at Lille is composed of blades with two nodes each. Thus, chuc-1 and chuc-2 share the same PSUs, as do chuc-3 and chuc-4, etc.

To retrieve the Wattmetres connected to these PSUs, you can query the Reference API for the specific node you’re interested in. For example, for chuc-3:

https://api.grid5000.fr/stable/sites/lille/clusters/chuc/nodes/chuc-3

Under the pdu entry, you will find 6 “wattmetre” entries, meaning that chuc-3 uses 6 PSUs monitored by Wattmetres. For each entry, the uid and port fields identify the Wattmetre device and the port that monitor the corresponding PSU.

"pdu": [
(...)
{
  "kind": "wattmetre-only",
  "port": 12,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 13,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 14,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 15,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 16,
  "uid": "wattmetrev3-1"
},
{
  "kind": "wattmetre-only",
  "port": 17,
  "uid": "wattmetrev3-1"
}
]

You can check that chuc-4 has exactly the same identifiers at:

https://api.grid5000.fr/stable/sites/lille/clusters/chuc/nodes/chuc-4

which means that Wattmetres (and PSUs) are shared between these two nodes.

Finally, you can retrieve values for all Wattmetres attached to the PSUs of chuc-3 and chuc-4 by querying the Wattmetre identifiers they are connected to. For instance, the first Wattmetre entry has a port equal to 12 and its uid is wattmetrev3-1. This means that the corresponding Wattmetre identifier is wattmetrev3-1-port12.

It is thus possible to retrieve the power consumption of every PSU of the blade hosting chuc-3 and chuc-4 using a query that looks like:

https://api.grid5000.fr/stable/sites/lille/metrics?devices=wattmetrev3-1-port12,wattmetrev3-1-port13,wattmetrev3-1-port14,wattmetrev3-1-port15,wattmetrev3-1-port16,wattmetrev3-1-port17

The devices parameter has the same effect as the nodes parameter seen before, but selects monitoring devices by their identifier.
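
The construction of this query can be sketched in Python as follows. The helper names wattmetre_device_ids and metrics_url are hypothetical; the pdu entries are copied from the chuc-3 description shown above, and the identifier format uid-portN follows the rule given in the text.

```python
def wattmetre_device_ids(pdu_entries):
    """Build Wattmetre identifiers (uid-portN) from the "pdu" entries of a node."""
    return [
        f"{entry['uid']}-port{entry['port']}"
        for entry in pdu_entries
        if entry.get("kind") == "wattmetre-only"
    ]


def metrics_url(site, devices):
    """Build the Kwollect query URL for a list of device identifiers."""
    return f"https://api.grid5000.fr/stable/sites/{site}/metrics?devices={','.join(devices)}"


# "wattmetre-only" entries found under "pdu" for chuc-3
pdu_entries = [
    {"kind": "wattmetre-only", "port": port, "uid": "wattmetrev3-1"}
    for port in range(12, 18)
]
devices = wattmetre_device_ids(pdu_entries)
print(metrics_url("lille", devices))
```

Summing the values returned for these six devices at a given time gives the power drawn by the whole blade, that is, by chuc-3 and chuc-4 together.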

Practical study

We now invite you to do a practical exercise to apply what you’ve learned: a study of the energy cost of a matrix multiplication performed with PyTorch.

Reserve a node on Grid’5000 and execute the following commands to set up your environment:

python -m venv monitoring_venv 
source monitoring_venv/bin/activate
module load cuda
pip3 install torch requests matplotlib
export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt

Copy and paste this code snippet into a file named monitoring_tutorial.py:

import torch
import time
import socket
import requests
import matplotlib.pyplot as plt


def main():
    results = {}
    num_threads = [1, 2, 4, 8, 16, 32]

    for c in num_threads:
        start_time, end_time, duration, _ = perform_matrix_multiplication(num_threads=c)
     
        results[c] = {}
        results[c]["duration"] = duration

        values = get_metrics_from_kwollect(start_time=start_time, end_time=end_time, metric="wattmetre_power_watt")
        results[c]["energy_wattmetre"] = get_energy_from_metrics(values, duration)

        values = get_metrics_from_kwollect(start_time=start_time, end_time=end_time, metric="bmc_node_power_watt")
        results[c]["energy_bmc"] = get_energy_from_metrics(values, duration)

    plot_results(results, "monitoring_tutorial.png")


def perform_matrix_multiplication(num_threads=None):
    if num_threads is not None:
        num_threads_init = torch.get_num_threads()
        torch.set_num_threads(num_threads)
    N=2048
    A = torch.randn(N, N, device="cpu")
    B = torch.randn(N, N, device="cpu")
    count = 0
    start_time = time.time()
    while time.time() - start_time < 10:
        C = A @ B
        count += 1
    end_time = time.time()
    duration = (end_time - start_time)/count
    print(f"Matrix multiplication duration: {duration} seconds ({count} multiplications performed)")
    if num_threads is not None:
        torch.set_num_threads(num_threads_init)
    return start_time, end_time, duration, count


def plot_results(results, outfile):
    num_threads = sorted(results.keys())
    fig, ax1 = plt.subplots(figsize=(8, 8))
    ax1.set_title("Matrix Multiplication Duration & Energy")
    ax1.set_xlabel("Number of threads used")

    ax1.set_ylabel("Duration (seconds)", color="orange")
    ax1.bar(num_threads, [results[c]["duration"] for c in num_threads], color="orange")

    ax2 = ax1.twinx()
    ax2.set_ylabel("Energy (joules)")
    ax2.plot(num_threads, [results[c]["energy_wattmetre"] for c in num_threads], "+-", color="green", label="wattmetre")
    ax2.plot(num_threads, [results[c]["energy_bmc"] for c in num_threads], "+-", color="blue", label="BMC")
    ax2.legend()

    plt.savefig(outfile)


def get_metrics_from_kwollect(start_time, end_time, metric, site=None, node=None):
    if node is None:
        node = socket.getfqdn().split(".")[0]
    if site is None:
        site = socket.getfqdn().split(".")[1]

    kwollect_url = f"https://api.grid5000.fr/stable/sites/{site}/..." #FIXME
    print(f"Requesting Kwollect at {kwollect_url}")
    metrics = requests.get(kwollect_url).json()

    return metrics

def get_energy_from_metrics(power_metrics, duration):
    average_power = sum(-1)/len([-1]) #FIXME
    energy = average_power * 0 #FIXME
    return energy


if __name__ == "__main__":
    main()


The goal of the script is to measure duration and energy consumed when performing matrix multiplications while using a different number of threads. The script is composed as follows:

  • The main() function implements the script logic: it loops over the thread counts, performs the matrix multiplication, gets the metrics from Kwollect and finally plots the results in the “monitoring_tutorial.png” file.
  • The perform_matrix_multiplication(num_threads) function repeats the matrix multiplication for about 10 seconds and returns the average duration of one multiplication.
  • The plot_results(results, outfile) function plots the results.
  • The get_metrics_from_kwollect(start_time, end_time, metric, site=None, node=None) function fetches the values of metric between start_time and end_time (if the node and site parameters are not provided, they are derived from the machine where the script is executed).
  • The get_energy_from_metrics(power_metrics, duration) function computes the energy consumed during duration from the power_metrics received from Kwollect.

The latter two functions are incomplete. You must replace lines containing “FIXME” comments with the appropriate code to make the function work as expected.

Once done, you can transfer the “monitoring_tutorial.png” file to your local machine to visualize it. You should be able to answer questions such as:

  • How many threads should you use to get the fastest matrix multiplication?
  • Does using fewer threads consume less energy, i.e. is it more energy efficient?

Solution

Below are the completed functions that implement this exercise:

def get_metrics_from_kwollect(start_time, end_time, metric, site=None, node=None):
    if node is None:
        node = socket.getfqdn().split(".")[0]
    if site is None:
        site = socket.getfqdn().split(".")[1]

    kwollect_url = f"https://api.grid5000.fr/stable/sites/{site}/metrics?nodes={node}&start_time={start_time}&end_time={end_time}&metrics={metric}"
    print(f"Requesting Kwollect at {kwollect_url}")
    metrics = requests.get(kwollect_url).json()

    return metrics

def get_energy_from_metrics(power_metrics, duration):
    average_power = sum(x["value"] for x in power_metrics)/len(power_metrics)
    energy = average_power * duration
    return energy
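
The solution above multiplies the average power by the duration of one multiplication. As a complementary sketch, the total energy over the whole measurement window can also be estimated by integrating the power samples over time with the trapezoidal rule, which remains valid when samples are not evenly spaced. This assumes that each record carries an ISO 8601 timestamp field in addition to value; the function name integrate_energy and the sample records are hypothetical.

```python
from datetime import datetime


def integrate_energy(power_metrics):
    """Trapezoidal integration of power samples (watts) over time, in joules."""
    points = sorted(
        (datetime.fromisoformat(m["timestamp"]), m["value"]) for m in power_metrics
    )
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        # Mean power over the interval multiplied by the interval length
        energy += (p0 + p1) / 2 * (t1 - t0).total_seconds()
    return energy


# Hypothetical records (not real measurements)
power_metrics = [
    {"timestamp": "2025-07-03T12:15:00+02:00", "value": 100.0},
    {"timestamp": "2025-07-03T12:15:01+02:00", "value": 120.0},
    {"timestamp": "2025-07-03T12:15:02+02:00", "value": 110.0},
]
print(f"{integrate_energy(power_metrics)} J")
```

Dividing this total by the number of multiplications performed (the count value returned by perform_matrix_multiplication) gives an energy per multiplication comparable to the result of get_energy_from_metrics.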


A monitoring_tutorial.png file should be generated. On a taurus node, it looks like this:

[Figure: Monitoring tutorial.png]


If you want to go further, you can enhance the script with the following features (in increasing order of difficulty):

  • Reserve a node with a GPU and add a case where the matrix multiplication is performed on a GPU (you can use a special “gpu” value in num_threads list).
  • Using Alumet, add the energy consumption measured by RAPL. Pay attention to the RAPL domain returned in the metrics: for instance, you could use only “PSys”, if available, to get an approximation of the whole node consumption that can be compared to the other values.
  • Using Alumet, add GPU consumption using NVML.

Conclusion

The tutorial is now finished. You should have learned most of what you need to know to monitor electrical energy consumption in your Grid’5000 experiments.

If you need additional information about monitoring on Grid’5000 (not specific to power), see the documentation at Monitoring_Using_Kwollect. Feel free to share suggestions or report any problem at users@lists.grid5000.fr.