April 1, 2021 | Open Access

The Role of Syntactic Planning in Compositional Image Captioning


Abstract

Image captioning has focused on generalizing to images drawn from the same distribution as the training set, and not to the more challenging problem of generalizing to different distributions of images. Recently, Nikolaus et al. (2019) introduced a dataset to assess compositional generalization in image captioning, where models are evaluated on their ability to describe images with unseen adjective–noun and noun–verb compositions. In this work, we investigate different methods to improve compositional generalization by planning the syntactic structure of a caption. Our experiments show that jointly modeling tokens and syntactic tags enhances generalization in both RNN- and Transformer-based models, while also improving performance on standard metrics.
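As a rough illustration of what "jointly modeling tokens and syntactic tags" can look like at the data level, the sketch below interleaves each caption token with a part-of-speech tag so that a single decoder would predict both. This is an assumption made for illustration, not necessarily the formulation used in the paper; the function names `interleave` and `split_interleaved`, the `<TAG>` markup, and the hand-written tags are all hypothetical.

```python
from typing import List, Tuple


def interleave(tokens: List[str], tags: List[str]) -> List[str]:
    """Build one target sequence in which each tag precedes its token."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    out: List[str] = []
    for tag, tok in zip(tags, tokens):
        out.append(f"<{tag}>")  # hypothetical tag markup
        out.append(tok)
    return out


def split_interleaved(seq: List[str]) -> Tuple[List[str], List[str]]:
    """Recover (tokens, tags) from an interleaved sequence."""
    tags = [s[1:-1] for s in seq[0::2]]  # strip the angle brackets
    tokens = list(seq[1::2])
    return tokens, tags


caption = ["a", "white", "dog", "runs"]
pos_tags = ["DET", "ADJ", "NOUN", "VERB"]  # hand-written for illustration
target = interleave(caption, pos_tags)
```

With this layout, a decoder trained on `target` commits to a syntactic category before emitting each word, which is one simple way to make the caption's structure explicit during generation.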


Cite This Study

Bugliarello et al. (2021) studied this question.

synapsesocial.com/papers/6a187c2d1ca866914fc9b39a https://doi.org/10.18653/v1/2021.eacl-main.48

