Grid5000:UsagePolicy: Difference between revisions

From Grid5000
(25 intermediate revisions by 6 users not shown)
Line 1: Line 1:
{{Maintainer|David Margery}}
{{Portal|User}}
{{Status|Approved}}


= You are a Grid'5000 citizen =
== General Principles ==
This charter, approved by the technical committee, is scheduled for approval by the steering committee at its next meeting.
Grid'5000 is a scientific instrument supporting experiment-driven research in all areas of computer science, including high performance computing, distributed computing, networking and big data. The use of Grid'5000 should lead to results or contribute to education in those research areas.  


== Principles ==
This charter defines rules to allow the shared use of this infrastructure by different communities of users, with different needs.
The Grid'5000 platform is intended to support research in all areas of computer science related to grid computing.
You should use Grid'5000 in the perspective of '''large scale experiments''' (at least 3 sites and 1000 CPUs).  


You should '''NOT''' use Grid'5000 '''as a production platform'''. Any usage of Grid'5000 should lead to progress in Computer Science. You may generate useful results for other communities, as long as the community of computer science researchers learns something from your experiments.
If your intended usage does not fit within the detailed rules presented below, you can request special permission from the executive committee (resp-sites_gis_g5k@inria.fr). Such exceptions are granted on a regular basis, as can be seen on [[Grid5000:SpecialUsage|the page listing them]].


It is a shared tool, used by many people with different and varying needs. The administrators pursue the following objectives, the main one being the first of this list:
== Resource usages rules ==
# Make the tool available to experiments involving a significant number of nodes (in the 1000's). To make this possible, reservation fragmentation must be avoided as much as possible. Never using more than a single cluster for experiments without any justification will lead to investigations by the steering committee to check whether your experiment fits the objective of Grid'5000
=== Glossary ===
# Keep the platform available for the development of experiments during the day. Therefore, reservations using all the nodes available on one site during work hours (in France) should be avoided.
To request resources on the platform, you can submit '''jobs''' (tasks) using two different modes: submissions and reservations.
# Allow for experiments to run experiments while administrators are available, because platform availability is still greatly dependant on the administrators solving day to day issues.  
* '''Submission''': you submit a job and let the scheduler decide when to run it.  
# Increase the platform's machines usage as much as possible, as long as this usage doesn't interfere with the first three objectives.
* '''Reservation''': you submit a job for a specific date and time in the future.


== Glossary ==
The different sites participating in Grid'5000 do not have clusters of the same size. Therefore, you should take into account job size when you plan an experiment.
You can use the platform with two different modes: submissions and reservations.
* '''Experiment''': An experiment is typically composed of one or more jobs running on Grid'5000's resources.
* '''Submission''': you submit an experiment when you let the scheduler decide when to run it.  
* '''Job size''': the size of a job is defined by the CPU time usable by the job: (nb nodes) * (nb cores) * (job walltime). A 2-hour job using 32 dual-processor, dual-core nodes has a size of 256 cpu.h.
* '''Reservation''': when you make a reservation, you gain usage of the platform at a specified time. You will then need to launch your experiment interactively.


The different sites participating in Grid'5000 do not have clusters of the same size. Therefore, you should take into account job size when you plan an experiment.
=== Rules ===
* '''Experiment''': An experiment is typically composed of one or more of jobs running on local clusters
# You should not be submitting jobs or making reservations if your '''experiment''' is not '''described''' in your '''[https://api.grid5000.fr/sid/reports/_admin/index.html User report]'''.
* '''Job size''': the size of a job is defined by the cpu time usable by the job: (nb nodes) * (nb procs) * (job walltime). A 2 hours job using 32 bi-processor nodes has a size of 128h.
# Large scale jobs must be executed during nights or week-ends (generally, using reservations to reserve resources in advance). Daytime is dedicated to smaller scale experiments, and preparatory work for large scale experiments. Specifically:
## Between 09:00 and 19:00 (local time of the cluster) during working days (Monday to Friday), you should not use more than the equivalent of 2 hours on all the cores of the cluster during a given day (e.g. on a cluster of 64 dual-processor, dual-core nodes, you should not use more than 2*2*2*64 = 512 cpu.h between 09:00 and 19:00 CEST).
## It is generally considered rude to have jobs that cross the 09:00 and 19:00 boundaries during week days (to extend an overnight reservation, for example).
## You should not have jobs that last more than 14 hours outside week-ends.
## You are not allowed to have more than 2 reservations in advance.
# You must '''mention Grid'5000 in all publications''' presenting results or contents obtained or derived from the usage of Grid'5000 and you must either specify Grid'5000 in the ''collaboration'' field if you upload a reference to a [http://hal.archives-ouvertes.fr/ HAL] frontend or update your '''[https://api.grid5000.fr/sid/reports/_admin/index.html User report]''' with a reference to your publication. The '''official acknowledgement''' to use in your publication is the following:


== Good usage rules (All users) ==
<blockquote>
# Please try to plan large scale experiments during night-time or week-ends.
<blockquote>
# Tuesday is the only day where you should feel authorized to use all the machines of a cluster during work hours. On the other days, please leave a few nodes for people developing or preparing an experiment. Between 09:00 and 18:00 (CEST) during the other days, you should not use more than the equivalent of 2 hours on all the processors of the cluster during a given day (e.g. on a 64 bi-processor cluster, you should not use more than 2*2*64 = 256h between 09:00 and 18:00 CEST).
Experiments presented in this paper were carried out using the Grid'5000
# If you plan to use machines on a Tuesday for anything else than a large scale experiment, please wait after 13:00 (CEST) on the preceding Monday to make your reservation.  
testbed, supported by a scientific interest group hosted by
# You should not have more than 2 reservations in advance, because it kills down resource usage. Please optimize and submit jobs instead.
Inria and including CNRS, RENATER and several Universities as well as
# You should manually release reserved resources (kill interactive session, delete job ID, etc.) when you completed your work before the planned end-time.
other organizations (see https://www.grid5000.fr).
# You should not be submitting jobs or making reservations if your '''experiment''' is not '''described in your [[Special:UserReports | User report]]'''.
</blockquote>
# You must '''mention Grid'5000 in all publications''' presenting results or contents obtained or derived from the usage of Grid'5000.
</blockquote>


== Good usage rules (Local users) ==
=== Specific rules for open access users ===
Some sites have opened user accounts on a local branch of LDAP. Users on these local branches only have access to the local cluster. In addition to the above rules, they should:
# Open Access users should not use OAR's "Advance reservations" more than 24 hours in advance.
# not have more than 1 reservation in advance;
# prefer submissions (let oar decide when to run the job) to reservations (specifying a time when the job should run) as much as possible;
# limit the size of their jobs to reasonable proportions such as those locally defined by the local site they have access to (e.g. in Rennes, this would be around 1280h).


== Monitoring ==
Platform usage is actively monitored by Grid'5000 staff. In case of non-conforming use, your account will be locked. Non-conforming use includes users whose user report is insufficiently filled in with regard to their resource usage.
Platform usage is actively monitored by the Grid'5000 staff. In case of usage not following the above rules, your account will be locked. The account of users whose user report is insufficiently filled with regard to their resource usage might also be locked.
The monitoring automatically detects usage not matching those rules, with some limitations:
* resources are managed at the site level, but the charter rules are expressed at the cluster level
* current versions of OAR have no concept of charter when scheduling submissions
* monitoring tools automatically remove DEAD nodes from their calculations of cluster size
It is possible that Grid'5000 usage is flagged as not following the charter as a result of events that are unrelated to users. This, as well as authorized exceptions, explains why the technical team will not act proactively. If incorrect usage is detrimental to you, please mail support-staff@lists.grid5000.fr to start investigations.


== Mailing Lists ==
As a Grid'5000 user you are automatically subscribed to the Grid'5000 users mailing list. The traffic is not very high, so please keep an eye on those emails as they may contain important information.
As a Grid'5000 user you are automatically subscribed to the Grid'5000 users' mailing lists. The traffic is not very high, so please keep an eye on those emails as they may contain important information (see [[Mailing lists]] for more information).
 
== Report an Abusive Usage ==
[https://intranet.grid5000.fr/report_abuse/ This form] may be used to report Grid'5000 usage that does not conform to the Grid'5000 charter and that is preventing you from accessing resources you need for your work. It will send an email to the offending user and to their managers, asking them to terminate jobs that do not fit within the charter. It will also warn Grid'5000 staff, who are instructed to kill these jobs if the user does not. Don't forget to check ongoing [[Grid5000:SpecialUsage|special permissions]].

Revision as of 13:15, 2 September 2015


General Principles

Grid'5000 is a scientific instrument supporting experiment-driven research in all areas of computer science, including high performance computing, distributed computing, networking and big data. The use of Grid'5000 should lead to results or contribute to education in those research areas.

This charter defines rules to allow the shared use of this infrastructure by different communities of users, with different needs.

If your intended usage does not fit within the detailed rules presented below, you can request special permission from the executive committee (resp-sites_gis_g5k@inria.fr). Such exceptions are granted on a regular basis, as can be seen on the page listing them.

Resource usage rules

Glossary

To request resources on the platform, you can submit jobs (tasks) using two different modes: submissions and reservations.

  • Submission: you submit a job and let the scheduler decide when to run it.
  • Reservation: you submit a job for a specific date and time in the future.

The different sites participating in Grid'5000 do not have clusters of the same size. Therefore, you should take into account job size when you plan an experiment.

  • Experiment: An experiment is typically composed of one or more jobs running on Grid'5000's resources.
  • Job size: the size of a job is defined by the CPU time usable by the job: (nb nodes) * (nb cores) * (job walltime). A 2-hour job using 32 dual-processor, dual-core nodes has a size of 256 cpu.h.
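
The job size definition above is plain arithmetic. As a minimal sketch in Python (the function name `job_size_cpu_hours` and its parameter names are illustrative and not part of any Grid'5000 tool), the glossary example can be reproduced as follows:

```python
def job_size_cpu_hours(nodes: int, cores_per_node: int, walltime_hours: float) -> float:
    """Job size as defined in the charter: (nb nodes) * (nb cores) * (job walltime)."""
    return nodes * cores_per_node * walltime_hours


# A dual-processor, dual-core node has 2 * 2 = 4 cores.
example = job_size_cpu_hours(nodes=32, cores_per_node=2 * 2, walltime_hours=2)
print(example)  # prints 256, i.e. 256 cpu.h as in the glossary example
```

Here "nb cores" is read as the number of cores per node, which is the reading that matches the 256 cpu.h figure given in the glossary.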

Rules

  1. You should not be submitting jobs or making reservations if your experiment is not described in your User report.
  2. Large scale jobs must be executed during nights or week-ends (generally, using reservations to reserve resources in advance). Daytime is dedicated to smaller scale experiments, and preparatory work for large scale experiments. Specifically:
    1. Between 09:00 and 19:00 (local time of the cluster) during working days (Monday to Friday), you should not use more than the equivalent of 2 hours on all the cores of the cluster during a given day (e.g. on a cluster of 64 dual-processor, dual-core nodes, you should not use more than 2*2*2*64 = 512 cpu.h between 09:00 and 19:00 CEST).
    2. It is generally considered rude to have jobs that cross the 09:00 and 19:00 boundaries during week days (to extend an overnight reservation, for example).
    3. You should not have jobs that last more than 14 hours outside week-ends.
    4. You are not allowed to have more than 2 reservations in advance.
  3. You must mention Grid'5000 in all publications presenting results or contents obtained or derived from the usage of Grid'5000 and you must either specify Grid'5000 in the collaboration field if you upload a reference to a HAL frontend or update your User report with a reference to your publication. The official acknowledgement to use in your publication is the following:

Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr).
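
The daytime quota in rule 2.1 can be checked with a few lines of Python. This is only a sketch under stated assumptions: the helper names (`daytime_quota`, `daytime_hours`, `daytime_usage`) and the job tuple layout are hypothetical, timestamps are naive datetimes taken to be in the cluster's local time, and public holidays are not handled. It does not reflect how the Grid'5000 monitoring tools are actually implemented.

```python
from datetime import date, datetime, time

DAY_START = time(9, 0)   # start of the daytime window (cluster local time)
DAY_END = time(19, 0)    # end of the daytime window
QUOTA_HOURS = 2          # "equivalent of 2 hours on all the cores of the cluster"


def daytime_quota(cluster_nodes: int, cores_per_node: int) -> int:
    """Daily daytime quota in cpu.h for a cluster of the given size."""
    return QUOTA_HOURS * cluster_nodes * cores_per_node


def daytime_hours(start: datetime, end: datetime, day: date) -> float:
    """Hours of the job [start, end) that fall inside 09:00-19:00 on `day`."""
    if day.weekday() >= 5:  # Saturday (5) and Sunday (6) are not working days
        return 0.0
    window_start = datetime.combine(day, DAY_START)
    window_end = datetime.combine(day, DAY_END)
    overlap = min(end, window_end) - max(start, window_start)
    return max(overlap.total_seconds(), 0.0) / 3600


def daytime_usage(jobs, day: date) -> float:
    """Total daytime cpu.h on `day`; jobs are (nodes, cores_per_node, start, end)."""
    return sum(n * c * daytime_hours(s, e, day) for n, c, s, e in jobs)


# The 64-node, dual-processor, dual-core cluster from rule 2.1.
quota = daytime_quota(cluster_nodes=64, cores_per_node=4)

# Hypothetical job: 32 nodes from 10:00 to 13:00 on Wednesday 2 September 2015.
jobs = [(32, 4, datetime(2015, 9, 2, 10, 0), datetime(2015, 9, 2, 13, 0))]
used = daytime_usage(jobs, date(2015, 9, 2))
print(quota, used, used <= quota)  # prints: 512 384.0 True
```

Only the part of a job that overlaps the 09:00 to 19:00 window counts against the quota, so an overnight reservation ending at 11:00 is charged for two daytime hours per core.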

Specific rules for open access users

  1. Open Access users should not use OAR's "Advance reservations" more than 24 hours in advance.
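
As a small illustration of the 24-hour limit, the check below compares the submission time with the requested start time. The function name `advance_reservation_allowed` is hypothetical, and treating a start time exactly 24 hours ahead as acceptable is an assumption drawn from the wording "more than 24 hours".

```python
from datetime import datetime, timedelta

MAX_ADVANCE = timedelta(hours=24)  # limit stated in the open access rule


def advance_reservation_allowed(submitted_at: datetime, starts_at: datetime) -> bool:
    """True if the reservation starts no more than 24 hours after submission."""
    return starts_at - submitted_at <= MAX_ADVANCE


now = datetime(2015, 9, 2, 13, 15)
print(advance_reservation_allowed(now, now + timedelta(hours=12)))  # prints True
print(advance_reservation_allowed(now, now + timedelta(hours=30)))  # prints False
```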

Monitoring

Platform usage is actively monitored by the Grid'5000 staff. In case of usage not following the above rules, your account will be locked. The account of users whose user report is insufficiently filled with regard to their resource usage might also be locked. The monitoring automatically detects usage not matching those rules, with some limitations:

  • resources are managed at the site level, but the charter rules are expressed at the cluster level
  • current versions of OAR have no concept of charter when scheduling submissions
  • monitoring tools automatically remove DEAD nodes from their calculations of cluster size

It is possible that Grid'5000 usage is flagged as not following the charter as a result of events that are unrelated to users. This, as well as authorized exceptions, explains why the technical team will not act proactively. If incorrect usage is detrimental to you, please mail support-staff@lists.grid5000.fr to start investigations.

Mailing Lists

As a Grid'5000 user you are automatically subscribed to the Grid'5000 users' mailing lists. The traffic is not very high, so please keep an eye on those emails as they may contain important information (see Mailing lists for more information).

Report an Abusive Usage

This form may be used to report Grid'5000 usage that does not conform to the Grid'5000 charter and that is preventing you from accessing resources you need for your work. It will send an email to the offending user and to their managers, asking them to terminate jobs that do not fit within the charter. It will also warn Grid'5000 staff, who are instructed to kill these jobs if the user does not. Don't forget to check ongoing special permissions.