  • Arnica-0.04: gene prediction (bioinformatics) (Application) [achieved]
  • Protea-0.7: coding gene prediction (bioinformatics) (Application) [in progress]
    Description: We use large datasets (more than 100000 sequences) to explore the properties of coding and non-coding genes, and then build statistical models to recognize them.
  • Arnica-0.07: non-coding RNA prediction (bioinformatics) (Application) [in progress]
    Description: The functions of non-coding RNA genes are defined by their three-dimensional structure. Several tools, such as caRNAc, developed in the team, can predict these structures ab initio. We are trying to build an energy landscape of non-coding RNA structures in order to design statistical tests that detect non-coding RNAs. This step requires computing the energy of non-coding RNA structures and comparing it to the energy of structures obtained from random sequences with the same properties (composition, length, ...).
  • kadeploy 2.1.6: a multicluster kadeploy (Middleware) [achieved]
    Description: Tests and implementation of the next released version of kadeploy, which can handle multi-cluster sites. The provided URL links to the final released version of kadeploy 2.1.6. kadeploy 2.1.5 was released simultaneously, since it addresses more general problems without the multicluster aspects (which, in my opinion, are interesting only on Grid5000, since preinstallation and postinstallation scripts are fairly specific in multicluster usage). 2.5 is a milestone halfway between version 2 and the additional possibilities that should appear in a third version.
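
    The energy comparison described for Arnica-0.07 above (energy of a candidate structure versus energies obtained on random sequences of the same composition and length) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's actual pipeline: toy_energy is a hypothetical placeholder that counts nested base pairs (a Nussinov-style recursion) instead of a real thermodynamic model, and the names energy_z_score, n_shuffles and seed are invented for this example. A real run would plug in the energy reported by a structure predictor such as caRNAc.

```python
import random
import statistics

# Toy pairing rule (assumption): Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_energy(seq, min_loop=3):
    """Hypothetical placeholder energy: minus the maximum number of nested
    base pairs, with at least `min_loop` unpaired bases inside a hairpin."""
    n = len(seq)
    if n == 0:
        return 0.0
    # best[i][j] = maximum number of pairs in the subsequence seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: k pairs with j
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return -float(best[0][n - 1])


def energy_z_score(seq, energy=toy_energy, n_shuffles=200, seed=0):
    """Z-score of the energy of `seq` against shuffled sequences.

    Shuffling keeps length and mononucleotide composition unchanged.
    A strongly negative z-score means the sequence is more stable than
    expected for random sequences with the same properties."""
    rng = random.Random(seed)
    observed = energy(seq)
    background = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)
        background.append(energy("".join(letters)))
    mean = statistics.fmean(background)
    sd = statistics.pstdev(background)
    if sd == 0:
        return 0.0  # no variation in the background: the test is uninformative
    return (observed - mean) / sd
```

    Each shuffled sequence is scored independently of the others, so the background computation splits naturally into many small jobs, which is one reason this kind of test consumes a lot of CPU time on large datasets.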
      Success stories and benefits from Grid'5000

      • Overall benefits
      • Grid'5000 drives our work forward! We are trying to build strong statistical models to predict coding/non-coding genes. This task requires a lot of CPU time and hard disk space to compute and store the results.

