Image captioning is becoming increasingly important in artificial intelligence. Most existing methods based on the CNN-RNN framework suffer from object missing and misprediction because they rely solely on a global, image-level representation. To address these problems, we propose a global-local attention (GLA) method that integrates local, object-level representations with the global, image-level representation through an attention mechanism. The proposed method can thus focus on predicting salient objects more precisely and with high recall while concurrently preserving image-level context information. As a result, GLA generates more relevant sentences and achieves state-of-the-art performance on the well-known Microsoft COCO caption dataset under several popular metrics.
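The idea of attending jointly over one image-level feature and several object-level features can be illustrated with a minimal soft-attention sketch. This is not the paper's implementation: the dot-product scoring, the function name `global_local_attention`, the `query` vector (standing in for a decoder state already projected to the feature dimension), and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def global_local_attention(global_feat, local_feats, query):
    """Blend one global feature with several local features.

    Assumed scoring: plain dot product between each feature and the query.
    Returns the attention-weighted context vector and the weights.
    """
    # Candidate set: the image-level feature first, then object-level features.
    feats = [global_feat] + list(local_feats)
    scores = [dot(f, query) for f in feats]
    weights = softmax(scores)
    dim = len(global_feat)
    # Context vector: weighted sum of all candidate features.
    context = [sum(w * f[i] for w, f in zip(weights, feats)) for i in range(dim)]
    return context, weights


# Toy example (hypothetical values): one global vector, two object vectors.
global_feat = [0.2, 0.1, 0.0]
local_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
query = [2.0, 0.0, 0.0]

context, weights = global_local_attention(global_feat, local_feats, query)
```

In this toy setup the query aligns with the first object feature, so that feature receives the largest weight, while the global feature still contributes to the context vector. A caption decoder would recompute the weights at each word, which is how such a mechanism can shift focus between salient objects and overall scene context.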
Ling‐Hui Li
Chinese Academy of Medical Sciences & Peking Union Medical College
Sheng Tang
Chinese Academy of Sciences
Lixi Deng
Hunan Normal University
Chinese Academy of Sciences
The University of Texas at San Antonio
Li et al. (Sun,) studied this question.
synapsesocial.com/papers/6a0ef57c53f874f2b2230005 — DOI: https://doi.org/10.1609/aaai.v31i1.11236