June 1, 2010

Topic regression multi-modal Latent Dirichlet Allocation for image annotation

Abstract

We present topic-regression multi-modal Latent Dirichlet Allocation (tr-mmLDA), a novel statistical topic model for the task of image and video annotation. At the heart of our new annotation model lies a novel latent variable regression approach to capture correlations between image or video features and annotation texts. Instead of sharing a set of latent topics between the two data modalities, as in the formulation of correspondence LDA in [2], our approach introduces a regression module to correlate the two sets of topics, which captures more general forms of association and allows the number of topics in the two data modalities to be different. We demonstrate the power of tr-mmLDA on two standard annotation datasets: a 5000-image subset of COREL and a 2687-image LabelMe dataset. The proposed association model shows improved performance over correspondence LDA as measured by caption perplexity.
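The abstract reports results in terms of caption perplexity but does not define it. As a minimal sketch, assuming the usual definition (the exponential of the negative average log-probability the model assigns to held-out caption words given the image), the metric could be computed as follows. The function name `caption_perplexity` and the input layout are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import math


def caption_perplexity(word_probs_per_image):
    """Perplexity of held-out captions (illustrative, assumed definition).

    word_probs_per_image: list of lists. Each inner list holds the model's
    predictive probability p(word | image) for every caption word of one
    test image. How those probabilities are produced is model-specific and
    outside the scope of this sketch.
    """
    total_log_prob = 0.0
    total_words = 0
    for probs in word_probs_per_image:
        for p in probs:
            total_log_prob += math.log(p)
            total_words += 1
    if total_words == 0:
        raise ValueError("no caption words to evaluate")
    # Lower perplexity means the model predicts the caption words better.
    return math.exp(-total_log_prob / total_words)
```

Under this definition, a model that spreads its probability uniformly over a vocabulary of V words has perplexity V, so lower values indicate better annotation predictions.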

Authors

Duangmanee Putthividhya

University of California, San Diego

Hagai Attias

Gatsby Charitable Foundation

Srikantan S. Nagarajan

University of California, San Francisco
