Cognitive-Level Annotation using Latent Statistical Structure (EU ICT FP6)


This project is a collaboration of six leading European research teams in visual recognition, text understanding and machine learning. It aims to develop advanced machine learning algorithms that automatically discover the people, objects and other scene elements present in images, video and associated text, and that use them to structure and interpret scenes. Discovery occurs at three levels of abstraction: new individuals (specific people, objects, scenes and actions), new object classes and attributes, and new hierarchical and other relations between entities. The HMDB-LIIR research focuses on machine learning algorithms for text analysis that require minimal supervision, and on algorithms for the multimodal processing of text and images, including probabilistic topic models.


The partners in the CLASS project are: K.U.Leuven (ESAT-Visics, Prof. Luc Van Gool and Prof. Tinne Tuytelaars), LEAR, France (Dr. Bill Triggs), INRIA Grenoble, France (Dr. Cordelia Schmid), University of Oxford, UK (Prof. Andrew Zisserman), University of Helsinki, Finland (Prof. Wray Buntine and Prof. Petri Myllymäki), and Max Planck Institute for Biological Cybernetics, Germany (Prof. Bernhard Schölkopf).


We studied several techniques for word sense disambiguation and for detecting visual entities and their visual attributes in text (e.g., association techniques from data mining and semantic similarity metrics applied to WordNet). We also designed, implemented and tested several probabilistic models for aligning names and faces in news texts and their accompanying images. We investigated discriminative and generative models for recognizing semantic frames and roles in English sentences, with special attention to semi-supervised models; this research gave rise to the "Latent Words Language Model". We built a proof-of-concept demonstrator that interrogates images picturing persons, and a demonstrator that automatically provides commentary on broadcast soaps. We also designed, implemented and evaluated a tool for multimodal segmentation of video news, and developed a multimodal news summarizer.

CLASS has been identified by the European Commission as an "excellent project".

Period: 2006-01-01 to 2009-06-30
Financed by: EU Sixth Framework Programme ICT, EU FP6-027978
Supervised by: Marie-Francine Moens
Staff: Wim De Smet, Koen Deschacht, Phi The Pham, Gert-Jan Poulisse
Contact: Koen Deschacht

More information can be found on the project website


  1. DESCHACHT, Koen and MOENS, Marie-Francine Efficient Hierarchical Entity Classification Using Conditional Random Fields, Proceedings of the 2nd Workshop on Ontology Learning and Population, Sydney, 22 July. 2006
  2. DESCHACHT, Koen and MOENS, Marie-Francine Efficient Hierarchical Entity Classification Using Conditional Random Fields, 18th Belgian-Dutch Conference on Artificial Intelligence, Abstract of the OLP2 paper, Namur, Belgium, October 5-6. 2006
  3. DESCHACHT, Koen, MOENS, Marie-Francine and ROBEYNS, Wouter. Cross-Media Entity Recognition in Nearly Parallel Visual and Textual Documents, in Proceedings of the 8th RIAO conference on Large-Scale Semantic Access to Content (Text, Image, Video and Sound), Pittsburgh, USA, May 30-June 1. 2007
  4. DESCHACHT, Koen and MOENS, Marie-Francine. Text Analysis for Automatic Image Annotation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 23-30. 2007
  5. MOENS, Marie-Francine. Words and Pictures: The Power of Information Extraction (paper of keynote lecture). In Proceedings of the 29th International Conference on Information Technology Interfaces. IEEE Computer Society. 2007
  6. VOSSEN, Piek and HOFMANN, Katja and DE RIJKE, Maarten and TJONG KIM SANG, Erik and DESCHACHT, Koen The Cornetto Database: Architecture and User-Scenarios, Proceedings of the 7th Dutch-Belgian Information Retrieval Workshop (DIR 2007), Leuven, Belgium, March 28-29. 2007
  7. DESCHACHT, Koen and MOENS, Marie-Francine Text Analysis for Automatic Image Annotation, In Proceedings of the 19th Belgian-Dutch Conference on Artificial Intelligence (Dastani, M. and de Jong, E., eds.), pp. 260-267, Utrecht, The Netherlands. 2007
  8. DESCHACHT, Koen & MOENS, Marie-Francine Finding the Best Picture: Cross-Media Retrieval of Content. In C. Macdonald, I. Ounis, V. Plachouras & I. Ruthven (Eds.) Proceedings of the 30th European Conference on Information Retrieval. Lecture Notes in Computer Science 4956 (pp. 539-546). Springer. 2008
  9. BOIY, Erik, DESCHACHT, Koen & MOENS, Marie-Francine Learning Visual Entities and their Visual Attributes from Text Corpora. In Proceedings of the 5th International Workshop on Text-based Information Retrieval. IEEE Computer Society Press. 2008
  10. PHAM, Phi The, MOENS, Marie-Francine & TUYTELAARS, Tinne Linking names and faces: Seeing the problem in different ways. In Proceedings of ECCV Workshop on Faces in Real-Life Images: Detection, Alignment and Recognition. 2008
  11. POULISSE, Gert-Jan and MOENS, Marie-Francine Multimodal News Story Segmentation. In Proceedings of the First International Conference on Intelligent Human Computer Interaction (IHCI 2009) (pp. 95-101). New York: Springer. 2009
  12. DE SMET, Wim & MOENS, Marie-Francine An Aspect Based Document Representation for Event Detection. In Proceedings of the 19th Meeting of Computational Linguistics in The Netherlands. 2009
  13. POULISSE, Gert-Jan, MOENS, Marie-Francine & DEKENS, Tomas News Story Segmentation In Multiple Modalities. In Proceedings of the 7th International Workshop on Content-Based Multimedia Indexing (CBMI 2009). IEEE Computer Society. 2009
  14. DESCHACHT, Koen & MOENS Marie-Francine The Latent Words Language Model. In Proceedings of the 18th Annual Belgian-Dutch Conference on Machine Learning (Benelearn 09). 2009
  15. DESCHACHT, Koen & MOENS Marie-Francine Using the Latent Words Language Model for Semi-Supervised Semantic Role Labeling. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP 2009). ACL. 2009
  16. MOENS, M.-F., FERRARI, V., TUYTELAARS, T. & VAN GOOL, L. (Eds.) Proceedings of IJCAI-09 Workshop on Cross-media Information Access and Mining (CIAM 2009). AAAI. 2009
  17. DE SMET, W. & MOENS, M.-F. An Aspect Based Document Representation for Event Clustering. In Proceedings of CLIN 19. University of Groningen. 2009
  18. DE SMET, Wim & MOENS, Marie-Francine Cross-Language Linking of News Stories on the Web Using Interlingual Topic Models. In Proceedings of the CIKM Workshop on Social Web Search and Mining (SWSM 2009). New York: ACM. 2009
  19. PHAM, Phi The, MOENS, Marie-Francine & TUYTELAARS, Tinne Cross-Media Alignment of Names and Faces, IEEE Transactions on Multimedia, 12 (1), 13-27. 2010
  20. POULISSE, Gert-Jan, MOENS, Marie-Francine, DEKENS, Tomas & DESCHACHT, Koen News Story Segmentation in Multiple Modalities, Multimedia Tools and Applications (in press). 2010
  21. PHAM, Phi The, MOENS, Marie-Francine & TUYTELAARS, Tinne Cross-Media Alignment of Names and Faces. In Proceedings of DIR 2010 10th Dutch-Belgian Information Retrieval Workshop (pp. 88-89). Radboud Universiteit Nijmegen. 2010
  22. PHAM, Phi The, MOENS, Marie-Francine & TUYTELAARS, Tinne Naming Persons in News Video. In Proceedings of the International Workshop on Visual Content Identification and Search (VCIDS 2010) - IEEE International Conference on Multimedia & Expo (ICME 2010). 2010
  23. ENGELS, Chris, DESCHACHT, Koen, BECKER, Jan Hendrik, TUYTELAARS, Tinne, VAN GOOL, Luc & MOENS, Marie-Francine Automatic Annotation of Unique Locations from Video and Text. In Proceedings BMVC 2010. 2010
  24. DE SMET, Wim & MOENS, Marie-Francine Representations for Multi-Document Event Clustering. Data Mining and Knowledge Discovery, 26(3): 533-558. 2013
  25. HAESEN, Mieke, MESKENS, Jan, PHAM, Phi T., POULISSE, Gert-Jan, BECKER, Jan H., CONINX, Karin, LUYTEN, Kris, TUYTELAARS, Tinne & MOENS, Marie-Francine Finding a Needle in a Haystack: An Interactive Video Archive Explorer for Professional Video Searchers. Multimedia Tools and Applications, 63 (2), 331-356. 2013
