July 3, 2023 · Open Access

Fast RF-UIC: A fast unsupervised image captioning model

Abstract

For visually impaired individuals, image captioning is a crucial task: a deep learning model recognizes the content of an image and generates a descriptive sentence, allowing them to understand the image through words. However, existing image captioning models require a large amount of manual annotation. The emergence of unsupervised methods provides a new approach to image captioning. Our proposed model, Fast RF-UIC, achieves unsupervised captioning through a purpose-designed Pre-trainer. Compared with existing pre-trained models, the Pre-trainer has a faster and shorter training cycle. The R2-Inception-V4 model is designed as the encoder; it fuses the Res2Net structure with Inception-V4 to obtain more image features. Bi-FGRU is designed as the decoder, in which the FReLU activation function is used to improve the character representation ability from two-dimensional space. Furthermore, we expanded the corpus used in existing unsupervised image captioning and included additional captions for common objects, effectively enhancing the model’s generalization ability. In experiments, Fast RF-UIC achieved higher scores than existing unsupervised image captioning methods on several text evaluation metrics, including BLEU, ROUGE, and CIDEr.
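
The abstract's reference to FReLU working "from two-dimensional space" can be illustrated with a small sketch. FReLU (funnel activation) is generally defined as max(x, T(x)), where T(x) is a spatial condition computed from a small 2D window around each position, rather than the fixed zero threshold of ReLU. The code below is a minimal pure-Python illustration on a single channel, not the authors' implementation: the function names, the 3x3 window size, and the omission of the batch normalization usually applied to T(x) are all assumptions made for brevity, and how Bi-FGRU integrates FReLU into the GRU is not described in the abstract.

```python
from typing import List

Matrix = List[List[float]]


def depthwise_conv3x3(x: Matrix, kernel: Matrix, bias: float = 0.0) -> Matrix:
    """Apply one 3x3 kernel to one channel with zero padding (same-size output).

    Hypothetical helper: stands in for the per-channel spatial condition T(x).
    """
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = bias
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    # Positions outside the map contribute zero (zero padding).
                    if 0 <= r < h and 0 <= c < w:
                        acc += kernel[di + 1][dj + 1] * x[r][c]
            out[i][j] = acc
    return out


def frelu(x: Matrix, kernel: Matrix, bias: float = 0.0) -> Matrix:
    """Funnel activation: elementwise max(x, T(x)).

    Assumption: batch normalization on T(x) is omitted in this sketch.
    """
    t = depthwise_conv3x3(x, kernel, bias)
    return [[max(a, b) for a, b in zip(row_x, row_t)]
            for row_x, row_t in zip(x, t)]
```

With an all-zero kernel and zero bias, T(x) is zero everywhere, so this sketch reduces to ordinary ReLU; a learned kernel instead lets the threshold at each position depend on its 2D neighborhood.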

Authors

Rui Yang

Tianjin University

Xiayu Cui

Qinzhi Qin

Journals

Displays

Institutions

Guilin University of Electronic Technology
