September 16, 2015 · Open Access

Guiding Long-Short Term Memory for Image Caption Generation

Abstract

In this work we focus on the problem of image caption generation. We propose an extension of the long short-term memory (LSTM) model, which we call gLSTM for short. In particular, we add semantic information extracted from the image as an extra input to each unit of the LSTM block, with the aim of guiding the model towards solutions that are more tightly coupled to the image content. Additionally, we explore different length normalization strategies for beam search in order to avoid favoring short sentences. On various benchmark datasets such as Flickr8K, Flickr30K and MS COCO, we obtain results that are on par with or even outperform the current state of the art.
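The two ideas in the abstract can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation. The function names (`glstm_step`, `best_caption`), the parameter layout, and the toy values are hypothetical. The sketch assumes the guidance vector `g` enters each gate and the candidate cell input through its own weight matrix on top of a standard LSTM update, and it uses division by caption length as one simple normalization; the abstract does not specify the exact formulation or which normalization strategies were compared.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a list-of-lists matrix by a vector."""
    return [sum(w * vi for w, vi in zip(row, v)) for row in W]


def glstm_step(params, x, m_prev, c_prev, g):
    """One guided-LSTM step (assumed form, not taken from the paper).

    params maps each of "i", "f", "o", "c" to a tuple (W_x, W_m, W_g).
    x is the current word input, m_prev and c_prev are the previous
    output and cell state, and g is the image-derived guidance vector
    that is fed to every gate and to the candidate cell input.
    """
    pre = {}
    for name in ("i", "f", "o", "c"):
        W_x, W_m, W_g = params[name]
        pre[name] = [
            a + b + c
            for a, b, c in zip(matvec(W_x, x), matvec(W_m, m_prev), matvec(W_g, g))
        ]
    i = [sigmoid(v) for v in pre["i"]]          # input gate
    f = [sigmoid(v) for v in pre["f"]]          # forget gate
    o = [sigmoid(v) for v in pre["o"]]          # output gate
    cand = [math.tanh(v) for v in pre["c"]]     # candidate cell input
    c = [fv * cp + iv * cv for fv, cp, iv, cv in zip(f, c_prev, i, cand)]
    m = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return m, c


def best_caption(candidates, normalize=True):
    """Pick the top beam-search candidate.

    candidates is a list of (tokens, total_log_prob) pairs. With
    normalize=True the summed log-probability is divided by the caption
    length, so longer captions are not penalized merely for having
    more terms in the sum.
    """
    def score(candidate):
        tokens, log_prob = candidate
        return log_prob / len(tokens) if normalize else log_prob

    return max(candidates, key=score)[0]
```

Because every added token contributes a negative log-probability, ranking by the raw sum tends to prefer shorter captions. Dividing by length compares candidates on a per-token basis instead, which is the motivation the abstract gives for length normalization.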

Authors

Xu Jia

Shanghai University of Engineering Science

Efstratios Gavves

Amsterdam University of Applied Sciences

Basura Fernando

Agency for Science, Technology and Research
