What question did this study set out to answer?

The aim is to create an automatic image captioning system for remote sensing images to support environmental monitoring.

March 12, 2026 · Open Access

A Hybrid deep learning framework for capturing environmental change through image captioning

Key points

Developed a hybrid framework combining visual features from VGG-16 and semantic representations from Word2Vec.
Utilized an attention-enhanced Long Short-Term Memory (LSTM) network for decoding and generating captions.
Applied the framework to the UC Merced Land Use (UCM) and RSICD datasets, which cover diverse environmental categories.
Produced high-accuracy captions describing land use and ecological conditions.
Demonstrated superior performance over existing methods on the BLEU, METEOR, ROUGE, and CIDEr metrics.
Captured environmental attributes critical for climate change monitoring, disaster management, and sustainable land-use planning.
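
The attention-enhanced decoding named in the key points can be illustrated with a single attention step. The sketch below is not the authors' implementation: it assumes dot-product scoring (the summary does not say which attention variant the paper uses), and the names `attend`, `regions`, and `state` and the toy numbers are hypothetical. In the full framework the region features would come from VGG-16 and the state from the LSTM decoder.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region_features, hidden_state):
    """Return (weights, context) for one decoding step.

    region_features: list of feature vectors, one per image region
    hidden_state: current decoder state, same length as each feature vector
    Assumption: dot-product scoring; the paper may use another variant.
    """
    # Score each region by its dot product with the decoder state.
    scores = [sum(f * h for f, h in zip(feat, hidden_state))
              for feat in region_features]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the region features.
    dim = len(hidden_state)
    context = [sum(w * feat[i] for w, feat in zip(weights, region_features))
               for i in range(dim)]
    return weights, context

# Toy example: three image regions with 2-d features (made-up numbers).
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = attend(regions, state)
print(weights, context)
```

At each step the decoder would feed `context` into the LSTM together with the embedding of the previous word, so regions that match the current state contribute more to the next predicted word.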

Abstract

Remote sensing plays a central role in monitoring Earth’s surface for environmental changes such as deforestation, urban expansion, water scarcity, and climate-induced disasters. However, the rapid increase in satellite image acquisition makes manual interpretation impractical. This study proposes a hybrid deep learning framework that automatically generates descriptive captions for remote sensing images, enabling environmental scientists to interpret large-scale Earth observation data efficiently. The framework integrates visual features extracted with a fine-tuned VGG-16 network and semantic representations learned through Word2Vec embeddings, which are fused and decoded via an attention-enhanced Long Short-Term Memory (LSTM) network. Applied to the UC Merced Land Use (UCM) and RSICD datasets, which cover diverse environmental categories, the model produces captions that describe land use and ecological conditions with high accuracy. Evaluation using BLEU, METEOR, ROUGE, and CIDEr metrics demonstrates superior performance compared to existing approaches. More importantly, the generated captions capture meaningful environmental attributes (such as vegetation loss, settlement growth, or the presence of water bodies) that are critical for applications in climate change monitoring, disaster management, and sustainable land-use planning. This approach provides a pathway for large-scale, automated environmental assessments, supporting decision-making in Earth system science and policy.
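
To make the evaluation metrics concrete, here is a simplified sentence-level BLEU in plain Python. It is a sketch under stated assumptions, not the scoring code used in the study: it handles a single reference caption, applies no smoothing, and defaults to bigrams (`max_n=2`) rather than the usual 4-grams so that short captions do not score zero. The two captions are invented for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total

def bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of precisions times brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    # Penalize candidates that are shorter than the reference.
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)

# Invented captions, for illustration only.
reference = "many green trees surround a small lake".split()
candidate = "green trees surround a lake".split()
score = bleu(candidate, reference)
print(round(score, 4))
```

A caption identical to its reference scores 1.0, while one sharing no words with it scores 0.0. METEOR, ROUGE, and CIDEr follow different formulas and are not covered by this sketch.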
