In recent years, the problem of associating a sentence with an image has attracted considerable attention. This work improves performance on two tasks: image annotation and image search by sentence. We represent a sentence as a Fisher Vector obtained by pooling the word2vec embeddings of its words. The Fisher Vector is typically defined as the gradient of the log-likelihood of the descriptors with respect to the parameters of a Gaussian Mixture Model (GMM). We present two other mixture models and derive their Expectation-Maximization and Fisher Vector expressions. The first is a Laplacian Mixture Model (LMM), which is based on the Laplacian distribution. The second is a Hybrid Gaussian-Laplacian Mixture Model (HGLMM), which is based on a weighted geometric mean of the Gaussian and Laplacian distributions. Finally, by using the new Fisher Vectors derived from HGLMMs to represent sentences, we achieve state-of-the-art results on both tasks on four benchmarks: Pascal1K, Flickr8K, Flickr30K, and COCO.
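To make the pooling step concrete, the following is a minimal sketch of the standard GMM-based Fisher Vector only; it does not implement the LMM or HGLMM variants. It assumes a diagonal-covariance GMM whose parameters are already given, uses the commonly used closed-form normalization of the gradients with respect to the means and standard deviations, and substitutes random vectors for real word2vec embeddings. The function name `fisher_vector` and all toy values are hypothetical and are not taken from the paper.

```python
import numpy as np


def fisher_vector(x, weights, means, sigmas):
    """Pool T descriptors into one Fisher Vector under a diagonal GMM.

    x: (T, D) descriptors, e.g. one word embedding per row (assumed input).
    weights: (K,) mixture weights; means, sigmas: (K, D) per-component
    means and standard deviations. Returns a vector of length 2 * K * D.
    """
    T, D = x.shape
    # Standardized offsets of every descriptor to every component: (T, K, D)
    z = (x[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # log(w_k * N(x_t | mu_k, diag(sigma_k^2))) for each t, k: (T, K)
    log_p = (np.log(weights)[None, :]
             - 0.5 * np.sum(z ** 2, axis=2)
             - np.sum(np.log(sigmas), axis=1)[None, :]
             - 0.5 * D * np.log(2.0 * np.pi))
    # Posterior responsibilities gamma_t(k), computed as a stable softmax
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Normalized gradients with respect to the means and standard deviations
    g_mu = np.einsum('tk,tkd->kd', gamma, z)
    g_mu /= T * np.sqrt(weights)[:, None]
    g_sigma = np.einsum('tk,tkd->kd', gamma, z ** 2 - 1.0)
    g_sigma /= T * np.sqrt(2.0 * weights)[:, None]
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])


# Toy usage: a 7-word "sentence" with 4-dimensional stand-in embeddings
rng = np.random.default_rng(0)
words = rng.normal(size=(7, 4))
weights = np.array([0.5, 0.5])
means = rng.normal(size=(2, 4))
sigmas = np.ones((2, 4))
fv = fisher_vector(words, weights, means, sigmas)
print(fv.shape)  # (16,)
```

The output length depends only on the number of mixture components and the embedding dimension, so sentences of different lengths map to vectors of the same size, which is what makes the representation usable for matching sentences against images.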
Benjamin Klein
Novartis Foundation
Guy Lev
University of Colorado Anschutz Medical Campus
Gil Sadeh
Amazon (United States)
Tel Aviv University
Klein et al. studied this question.
synapsesocial.com/papers/6a1bbff700ee29383e9cd154 — DOI: https://doi.org/10.1109/cvpr.2015.7299073