Given the overwhelming amount of multimedia on the Web, methods for searching and understanding it through sentences are necessary. Representing content not only with labels but also with sentences that capture the relations among those labels lets users search with a story and understand multimedia more deeply. However, few existing works generate such sentences, because obtaining the relations among objects and producing grammatical text is difficult. We focus on the captions of images that are similar to an input image, which are expected to explain the input image to some degree. We therefore propose a novel approach that generates a sentential caption for the input image by summarizing those captions. Our experiment on a dataset consisting of images and text demonstrates that the proposed method can generate sentential captions.
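The idea of captioning an image from the captions of its visual neighbours can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the image features, the similarity measure, or the summarization procedure. The sketch assumes images are already represented as plain feature vectors, uses cosine similarity to find neighbours, and swaps in a very simple extractive step that returns the neighbour caption whose words are most shared across the retrieved captions. The names `cosine`, `nearest_captions`, and `summarize` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_captions(query_vec, dataset, k=3):
    """Return the captions of the k images most similar to the query.

    dataset is a list of (feature_vector, caption) pairs (an assumed layout).
    """
    ranked = sorted(dataset, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [caption for _, caption in ranked[:k]]


def summarize(captions):
    """Pick one caption as the summary (a simple extractive stand-in).

    Each caption is scored by the average number of retrieved captions
    that contain each of its words; the highest-scoring caption wins.
    """
    tokenized = [set(c.lower().split()) for c in captions]
    doc_freq = Counter(word for tokens in tokenized for word in tokens)

    def score(tokens):
        return sum(doc_freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    best = max(range(len(captions)), key=lambda i: score(tokenized[i]))
    return captions[best]


if __name__ == "__main__":
    # Toy data: two-dimensional "features" invented purely for illustration.
    toy_dataset = [
        ([1.0, 0.0], "a dog runs on the grass"),
        ([0.9, 0.1], "a dog on the grass"),
        ([0.0, 1.0], "a plane flies in the sky"),
    ]
    neighbours = nearest_captions([1.0, 0.05], toy_dataset, k=2)
    print(summarize(neighbours))
```

A generative summarizer that recombines phrases from several captions would be closer to "summarizing" in the abstract's sense; the extractive choice here only keeps the sketch short and self-contained.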
Ushiku et al. studied this question.