Concept Indexing in Images and Videos (Application)
Conducted by: Stephane Ayache, Georges Quénot, Bahjat Safadi, Franck Thollard
Description: In this experiment we study methods for concept indexing in images and videos. The goal is to train image or video-shot classifiers using supervised learning and to evaluate them on annotated data. Computation involves extracting descriptors, training classifiers and performing fusion. The target scale is millions of images or video shots and hundreds to thousands of concepts. Several "runs" have to be conducted to identify the best combination of descriptors, classifiers and fusion methods, and to tune the associated global parameters.
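As a minimal sketch of the supervised per-concept classification described above, the following toy example scores unannotated shots for one concept with a k-NN classifier (one of the classifier types used in the experiment). The descriptor vectors, labels and the value of k here are hypothetical, purely for illustration:

```python
import numpy as np

def knn_concept_scores(train_desc, train_labels, test_desc, k=3):
    """Score test shots for one concept with a k-NN classifier.

    train_desc: (n_train, d) descriptor vectors of annotated shots
    train_labels: (n_train,) 1 if the concept is present, else 0
    test_desc: (n_test, d) descriptors of shots to score
    Returns a (n_test,) array of scores in [0, 1]: the fraction of
    positive shots among the k nearest training shots.
    """
    # Pairwise squared Euclidean distances between test and train shots.
    d2 = ((test_desc[:, None, :] - train_desc[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k nearest training shots for each test shot.
    nearest = np.argsort(d2, axis=1)[:, :k]
    # Score = mean label of the k nearest neighbours.
    return train_labels[nearest].mean(axis=1)

# Toy data: 2-bin colour histograms as descriptors (hypothetical).
train = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
labels = np.array([1, 1, 0, 0])
test = np.array([[0.85, 0.15], [0.15, 0.85]])
scores = knn_concept_scores(train, labels, test, k=3)
```

In the real experiment, descriptors have hundreds to thousands of dimensions and there are hundreds of thousands of shots, so the distance computation is the step that benefits from grid parallelism.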
This work is related to the IRIM project supported by GDR ISIS and by the APIMS MSTIC project supported by UJF. In 2011, the following IRIM members participated in the semantic indexing task of the TRECVID 2011 evaluation campaign:
- LIG/MRIM: Georges Quénot, Bahjat Safadi, Yubing Tong, Franck Thollard,
- ETIS/LIP6: Philippe Gosselin, Matthieu Cord,
- IRIT: Hervé Bredin, Lionel Koenig,
- LaBRI: Boris Mansencal, Hugo Boujut, Jenny Benois-Pineau,
- LIF: Stephane Ayache,
- LISTIC: Patrick Lambert,
- GIPSA: Denis Pellerin, Lionel Granjon.
The TRECVID 2011 semantic indexing (SIN) task aims at annotating 346 visual or multi-modal concepts in about 146K video shots. A development set of about 266K video shots, partially annotated with the 346 concepts, is provided for system training and tuning.
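TRECVID systems are scored per concept with average precision over the ranked list of shots (the official campaign uses an inferred variant computed from the partial annotations). As a hedged sketch, the standard (non-inferred) average precision can be computed as follows; the scores and relevant-shot sets below are made-up toy data:

```python
def average_precision(scores, relevant):
    """Standard (non-inferred) average precision for one concept.

    scores: per-shot classification scores (one per shot)
    relevant: set of indices of shots truly containing the concept,
    taken from the ground-truth annotations.
    """
    # Rank shots by decreasing score.
    ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, ap = 0, 0.0
    for rank, shot in enumerate(ranking, start=1):
        if shot in relevant:
            hits += 1
            ap += hits / rank  # precision at each relevant shot
    return ap / len(relevant) if relevant else 0.0

# Perfect ranking: both relevant shots (0 and 2) ranked first -> AP = 1.0
ap = average_precision([0.9, 0.2, 0.7, 0.4], {0, 2})
```

The per-concept AP values are then averaged over the evaluated concepts to give the mean (inferred) average precision used to rank the submitted runs.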
A common data organization and exchange format was set up. The partners provided 44 audio and image descriptors in total. Each descriptor was evaluated individually using three types of classifiers: "classical" SVMs, multi-learner SVMs and k-NNs. Global classification systems were then built by fusing the outputs of all classifier types applied to all descriptor types, using a hierarchical fusion approach.
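The hierarchical (two-level) late fusion can be sketched as follows: scores are first averaged over classifier types within each descriptor, then combined across descriptors. This is a minimal illustration assuming simple (weighted) averaging; the actual IRIM fusion hierarchy and weights are not specified here, and the descriptor names and score values are hypothetical:

```python
import numpy as np

def hierarchical_fusion(scores, weights=None):
    """Two-level late fusion of per-shot concept scores.

    scores: dict descriptor name -> dict classifier name -> (n_shots,)
    score array. Level 1: average over classifiers per descriptor.
    Level 2: (weighted) average over descriptors.
    """
    per_descriptor = {
        desc: np.mean(list(clf_scores.values()), axis=0)
        for desc, clf_scores in scores.items()
    }
    descs = sorted(per_descriptor)
    w = np.ones(len(descs)) if weights is None else np.array([weights[d] for d in descs])
    stacked = np.stack([per_descriptor[d] for d in descs])
    return (w[:, None] * stacked).sum(axis=0) / w.sum()

# Hypothetical scores for 2 shots, 2 descriptors, 2 classifier types.
scores = {
    "color_hist": {"svm": np.array([0.8, 0.1]), "knn": np.array([0.6, 0.3])},
    "sift_bow":   {"svm": np.array([0.9, 0.2]), "knn": np.array([0.7, 0.2])},
}
fused = hierarchical_fusion(scores)
```

With 44 descriptors, three classifier types and 346 concepts, exploring the possible fusion combinations and weightings is what made the many grid runs mentioned below necessary.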
The experiments showed that fusing the outputs of several descriptors significantly improves performance over any single-descriptor system. We also demonstrated that fusing the results obtained with several variants of the same descriptor (for instance, changing the number of bins in a histogram) improves performance, and that fusing the outputs of different classifiers applied to the same descriptor improves on the best individual classifier. Finally, global systems combining all these types of fusion were built for the IRIM submissions to the TRECVID evaluation. The IRIM group ranked third out of 19 participating groups worldwide.
Grid'5000 was essential as a support for this collaboration. It was used by the IRIM members to exchange data, to compute and share the descriptors, to compute and share the individual classification results, and finally to run the fusion experiments. Hundreds of nodes were used over several weeks to test the various fusion combinations and identify the best ones. Without the Grid'5000 resources this would have been impossible, or possible only in a very limited way.
- Nodes involved: 1000
- Sites involved: >3
- Minimum walltime: 1h
- Batch mode: yes
- Use kadeploy: no
- CPU bound: no
- Memory bound: yes
- Storage bound: yes
- Network bound: no
- Interlink bound: no
Tools used: Custom software.
9 August 2010: submission of the IRIM project's runs to the TRECVID competition.
28 August 2010: submission of runs to the PASCAL VOC challenge (large-scale task).
Shared by: Stephane Ayache, Georges Quénot, Bahjat Safadi, Franck Thollard
Last update: 2012-04-30 14:06:37