Grid'5000 user report for Stephane Ayache
User information
Stephane Ayache (users, user, sophia, irim, ml-users user)
- Concept Indexing in Images and Videos (Application) [in progress]
Description: In this experiment we study methods for concept indexing in images and videos. The goal is to train image or video shot classifiers using supervised learning and to evaluate them on annotated data. The computations involve extracting descriptors, training classifiers and fusing their outputs. The target scale is millions of images or video shots and hundreds or thousands of concepts. Several "runs" will have to be conducted to identify the best combination of descriptors, classifiers and fusion methods and to tune the associated global parameters.
This work is related to the IRIM project supported by GDR ISIS and to the APIMS MSTIC project supported by UJF. In 2011, the following IRIM members participated in the semantic indexing task of the TRECVID 2011 evaluation campaign:
- LIG/MRIM: Georges Quénot, Bahjat Safadi, Yubing Tong, Franck Thollard,
- ETIS/LIP6: Philippe Gosselin/Matthieu Cord,
- IRIT: Hervé Bredin, Lionel Koenig,
- LaBRI: Boris Mansencal, Hugo Boujut, Jenny Benois-Pineau,
- LIF: Stephane Ayache,
- LISTIC: Patrick Lambert,
- GIPSA: Denis Pellerin, Lional Granjon.
The TRECVID 2011 semantic indexing (SIN) task aims at annotating 346 visual or multi-modal concepts in about 146K video shots. A development set of about 266K video shots partially annotated with the 346 concepts is given for system training and tuning.
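As an illustration of how a ranked annotation of shots for one concept can be scored against annotated data, here is a minimal sketch of average precision, a standard ranking measure. Whether it matches the exact measure used in the campaign is an assumption; the shot identifiers and the function names are hypothetical.

```python
def average_precision(ranked_shots, relevant):
    """Average precision of one ranked list against a set of relevant shots."""
    hits = 0
    precision_sum = 0.0
    for rank, shot in enumerate(ranked_shots, start=1):
        if shot in relevant:
            hits += 1
            # Precision measured at the rank of each relevant shot.
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, ground_truth):
    """Mean of the per-concept average precisions (concept -> ranked list)."""
    values = [average_precision(rankings[c], ground_truth[c]) for c in rankings]
    return sum(values) / len(values) if values else 0.0


# Toy example with made-up shot identifiers for a single concept.
ranked = ["s3", "s1", "s4", "s2"]
relevant = {"s3", "s2"}
ap = average_precision(ranked, relevant)
```

A system that returns relevant shots earlier in the list gets a higher score, which is what makes such a measure usable for comparing runs.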
A common data organization and exchange format has been set up. These partners provided audio and image features, 44 in total. These features were evaluated individually using three different types of classifiers: "classical" SVMs, multi-learner SVMs and KNNs. Global classification systems were built by fusing the output of all the classifier types applied to all the descriptor types. This was done using a hierarchical fusion approach.
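The two-level structure of such a fusion can be sketched as follows. This is a minimal illustration only: the report does not state which fusion rule was used, so the min-max normalization and arithmetic mean below are assumptions, and the descriptor names, classifier names and scores are made up.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a list of scores to [0, 1]; a constant list maps to 0.5."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def average_fusion(score_lists):
    """Fuse several per-shot score lists by normalizing, then averaging."""
    normalized = [min_max_normalize(s) for s in score_lists]
    n = len(normalized)
    return [sum(column) / n for column in zip(*normalized)]


def hierarchical_fusion(outputs):
    """outputs maps (descriptor, classifier) -> list of per-shot scores.

    Level 1: fuse the classifiers applied to the same descriptor.
    Level 2: fuse the resulting per-descriptor scores.
    """
    by_descriptor = defaultdict(list)
    for (descriptor, _classifier), scores in outputs.items():
        by_descriptor[descriptor].append(scores)
    per_descriptor = [average_fusion(lists) for lists in by_descriptor.values()]
    return average_fusion(per_descriptor)


# Hypothetical scores for one concept over four shots.
outputs = {
    ("color_hist", "svm"): [0.9, 0.2, 0.4, 0.1],
    ("color_hist", "knn"): [0.8, 0.1, 0.6, 0.3],
    ("audio_mfcc", "svm"): [0.3, 0.7, 0.9, 0.2],
}
fused = hierarchical_fusion(outputs)
# Shot indices sorted from most to least likely to contain the concept.
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
```

Grouping by descriptor first keeps a descriptor that has many classifier variants from dominating the final score, which is one plausible motivation for a hierarchical scheme over a flat average.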
The experiments showed that fusing the outputs of several descriptors significantly improves performance over systems using a single descriptor. We also demonstrated that fusing the results obtained with several variants of the same descriptor (changing the number of bins in a histogram, for instance) improves performance. Similarly, fusing the outputs of different classifiers applied to the same descriptor improves performance over that of the best single classifier. Finally, global systems including all these types of fusion were developed for the IRIM submissions to the TRECVID evaluation. The IRIM group obtained third place out of a total of 19 participating groups worldwide.
GRID'5000 was essential as a support for this collaboration. It was used by the IRIM members to exchange data, to compute and share the descriptors, to compute and share the individual classification results and finally to perform the fusion experiments. Hundreds of nodes were used over several weeks in order to test the various fusion combinations and identify the best ones. This would not have been possible, or only in a very limited way, without the GRID'5000 resources.
9 August 2010: submission of runs of the IRIM project to the TRECVID competition.
28 August 2010: submission of runs to the PASCAL VOC challenge (large scale task).
- Semantic Multimedia Indexing (Application) [in progress]
Description: Experiments on semantic indexing of multimedia documents, with application to multimedia data mining and/or multimedia information retrieval. The main difficulty in reaching reasonable performance for concept detection in multimedia documents comes from the "semantic gap" between the raw multimedia content and the elements that make sense to human beings. Using the GRID'5000 architecture, we will try to deploy computationally intensive learning/classification algorithms. We also want to better optimize the fusion of numerous document descriptions.