API Authentication and Authorisation Motivations

From Grid5000


Introduction

We are building a set of APIs for Grid'5000, a distributed system. Use cases for these APIs include:

  1. Usage of Grid'5000 from the user's workstation (see GRUDU)
  2. Usage for dynamic updates of parts of www.grid5000.fr, hardware description in particular
  3. Web interface for performing experiments (needed for demos)
  4. Management of a set of experiments from the user's workstation, a frontend, or virtual machines on a possible private Grid'5000 cloud, including submissions and deployments.
  5. Usage by the script performing the experiment to collect data, activate probes or other functionality, or deploy nodes
  6. Grid-level scripts for performing experiments
  7. Interconnection with other platforms (through a server bridging differences)
  8. Update of account information by the user or admins (see User Management Service)

This means that the architecture for hosting the different APIs must manage

  1. Anonymous read-only access (hardware description for the web site, monitoring, metrology information) from servers
  2. Authenticated access (OAR submissions, kadeploy3 deployments, Kvlan activation...)
  3. Authorized access (update of the hardware description, or of user profiles for account management). This concerns a minority of scenarios at the time of writing.
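The three access levels above can be illustrated with a small decision function. This is only a sketch under assumed names: `AccessLevel`, `is_allowed`, the `roles` set and the `required_role` parameter are hypothetical and do not describe the actual Grid'5000 implementation.

```python
from enum import Enum
from typing import Optional, Set


class AccessLevel(Enum):
    ANONYMOUS = 1      # read-only, no identity required
    AUTHENTICATED = 2  # any known user
    AUTHORIZED = 3     # known user holding a specific role


def is_allowed(level: AccessLevel, method: str,
               user: Optional[str] = None,
               roles: Optional[Set[str]] = None,
               required_role: Optional[str] = None) -> bool:
    """Decide whether a request may proceed (hypothetical policy)."""
    if level is AccessLevel.ANONYMOUS:
        # Anonymous resources are read-only.
        return method in ("GET", "HEAD")
    if user is None:
        # Both remaining levels need an authenticated identity.
        return False
    if level is AccessLevel.AUTHENTICATED:
        return True
    # AUTHORIZED: the user must also hold the required role.
    return required_role is not None and required_role in (roles or set())
```

For example, an OAR submission would map to `AccessLevel.AUTHENTICATED`, whereas an update of the hardware description would map to `AccessLevel.AUTHORIZED` with some role chosen by the platform.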

The general architecture for service deployment can provide this either by

  • doing nothing: each service developer must invent their own scheme to authenticate and authorize users, according to the needs of the service. This is clearly less than ideal
  • providing a library implementing these services when needed. This requires some discipline from service developers, who must request the information before passing it to the library, and it ties them to a specific programming language (we cannot maintain the same library at the same level in two different languages)
  • using the HTTP and deployment architecture to provide authentication and authorization when needed. This is the path we have chosen.

Authentication and Authorization for HTTP

Because our service architecture relies on REST and therefore HTTP, the ideal way to implement authentication and authorization is to use technologies supported by most web servers and proxies. We could of course develop our own Apache module to suit our needs perfectly, but this would not be a very maintainable approach. This leaves us with a choice of one or more of the following technologies:

  • Ident: the server asks the client machine, which must be trusted, to identify the user. Authorization is not possible unless the services make another call to a specific service.
    • Main advantage: for users running their code on trusted clients, it is completely transparent for the code and for the user.
    • Main drawbacks:
      • Authentication costs an additional roundtrip from the server to the client, for each request.
      • In scenarios 1, 4, 5, it requires the user to set up SSH tunnels to a trusted client, possibly adding delay
      • It doesn't cover scenarios 2, 3, 7, 8
      • It implies a web of trust between servers if implementing a service relies on other services that need authentication. This implies listening on 2 ports: one for requests with trusted information, one for requests where authentication has not been done.
      • It doesn't handle authorization
  • Basic Auth: the HTTP header must carry login/passwd information
    • Main advantages
      • Users understand the login/passwd mechanism very well
      • Quite natural for interactions with the platform through a web browser
    • Main drawbacks
      • login/passwd information must be provided to all scripts, thus increasing the risk that this information is stored in an inappropriate manner (in the code repository if embedded in the code, as publicly visible information if given on the command line, or in a well-known file in the user's home directory or Grid'5000 home directory).
        • If the login/passwd is the main Grid'5000 login/passwd, we are at great risk
        • If it is specific to accessing the API, most users will use the same credentials anyway, and explaining the difference will become a support nightmare
      • Relies on a call to an external authentication service, which can be implemented on the same machine (LDAP slave for example)
      • Implies a web of trust between servers if the call to the external authentication mechanism is to be optimized out after the first validation. See the corresponding remark for Ident.
      • Simple authorization, based on a specific LDAP attribute, can be handled
  • Usage of certificates: a certificate, containing authentication and authorization information, is sent with each request.
    • Main advantages
      • Authentication and authorization are decoupled from the use of the API services: they are done beforehand, when the certificate is issued. Response times are therefore better than with the Basic Auth and Ident methods.
      • requests from users and from other servers can be handled in the same way
      • It is the only technology that supports scenario 8
    • Main drawbacks
      • Need to maintain a reliable certification infrastructure, including procedures for issuing and revoking certificates.
      • Certificates are seen as complicated by users, even if the complexity is not very different from that of SSH keys.
      • The cost of the first use of an application relying on the API can seem high.
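To make the Basic Auth drawback concrete: the HTTP `Authorization` header carries the login and password merely base64-encoded, so anything that stores or logs the header effectively stores the password. The following is a minimal sketch using only the Python standard library; the function names and the sample credentials are illustrative assumptions.

```python
import base64
from typing import Tuple


def basic_auth_header(login: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP Basic Auth."""
    raw = f"{login}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value: str) -> Tuple[str, str]:
    """Recover login and password from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    # base64 is an encoding, not encryption: decoding needs no secret.
    login, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return login, password


# Hypothetical credentials, for illustration only.
header = basic_auth_header("jdoe", "s3cret")
recovered = decode_basic_auth(header)
```

Since the credentials are recoverable from every request, Basic Auth is only reasonable over an encrypted transport, and a script using it has to hold the password itself somewhere.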

Design decisions and open choices

A first design decision is made here: the API hosting architecture will provide authentication. Authorization is also a desired feature, but it is not yet clear that enough APIs need Authorization to justify handling this at the hosting architecture level.

The best way of providing authentication at the API hosting architecture level is to rely on schemes supported by the web server layer, i.e. HTTP. This has the benefit of using schemes that are very likely supported by HTTP client software and libraries.

The current proposal under review for implementation can be found on a dedicated page.
