August 16, 2025 · Open Access

A Semi-Supervised Wildfire Image Segmentation Network with Multi-Scale Structural Fusion and Pixel-Level Contrastive Consistency

Key points

The proposed model improves segmentation performance under limited annotation conditions, enhancing wildfire monitoring.
With only half of the labeled data, the model improves mIoU over the baseline by 5.0% on the Flame dataset and 6.4% on the D-Fire dataset.
The model incorporates a Lagrange Interpolation Module (LIM) that fuses multi-scale feature maps to better preserve fine detail such as flame boundaries.
The approach emphasizes pixel-level consistency learning, reducing dependency on large labeled datasets for effective training.
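The mIoU figures quoted above refer to mean intersection over union, the standard segmentation metric. As a reminder of what that number measures, here is a minimal sketch in plain Python. The function name `mean_iou`, the flat label lists, and the two-class toy data are illustrative assumptions; the paper's own evaluation code may accumulate counts differently (for example over a whole dataset rather than per image).

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes for flat label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels that both prediction and ground truth assign to class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels that either prediction or ground truth assigns to class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (hypothetical labels): 0 = background, 1 = fire.
target = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case each class has an intersection of 2 pixels and a union of 4, so both per-class IoU values and their mean are 0.5.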

Abstract

The increasing frequency and intensity of wildfires pose serious threats to ecosystems, property, and human safety worldwide. Accurate semantic segmentation of wildfire images is essential for real-time fire monitoring, spread prediction, and disaster response. However, existing deep learning methods rely heavily on large volumes of pixel-level annotated data, which are difficult and costly to obtain in real-world wildfire scenarios because of complex environments and urgent time constraints. To address this challenge, we propose a semi-supervised wildfire image segmentation framework that improves segmentation performance under limited annotation by integrating multi-scale structural information fusion with pixel-level contrastive consistency learning. Specifically, a Lagrange Interpolation Module (LIM) constructs structured interpolation representations between multi-scale feature maps during the decoding stage. This enables effective fusion of spatial details and semantic information and improves the model’s ability to capture flame boundaries and complex textures. In addition, a Pixel Contrast Consistency (PCC) mechanism establishes pixel-level semantic constraints between CutMix and Flip augmented views, guiding the model to learn feature representations that are consistent within a class and discriminative between classes, thereby reducing the reliance on large labeled datasets. Extensive experiments on two public wildfire image datasets, Flame and D-Fire, demonstrate that our method consistently outperforms other approaches under various annotation ratios. For example, with only half of the labeled data, our model achieves 5.0% and 6.4% mIoU improvements over the baseline on the Flame and D-Fire datasets, respectively. This work provides technical support for efficient wildfire perception and response in practical applications.
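The abstract does not give the internals of the LIM or the PCC mechanism, so the following is only a rough sketch of the two underlying ideas, not the authors' implementation. It assumes that (a) feature maps from three decoder scales are resized to a common resolution and treated as samples at nodes 0, 1, 2, with the fused map obtained by evaluating the Lagrange interpolating polynomial at a chosen position, and (b) pixel-level consistency is expressed as an InfoNCE-style loss in which the same pixel location in two aligned views is the positive pair. All function names, the nearest-neighbour resizing, the node positions, and the temperature are hypothetical. CutMix and class-aware positive and negative sampling are omitted.

```python
import math


def lagrange_weights(nodes, x):
    """Lagrange basis values l_i(x) for distinct nodes; they always sum to 1."""
    weights = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (x - xj) / (xi - xj)
        weights.append(w)
    return weights


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    return [[v for v in row for _ in range(factor)]
            for row in fmap for _ in range(factor)]


def fuse_scales(fmaps, nodes, x):
    """Evaluate the polynomial interpolating same-sized maps at position x."""
    weights = lagrange_weights(nodes, x)
    height, width = len(fmaps[0]), len(fmaps[0][0])
    return [[sum(w * f[r][c] for w, f in zip(weights, fmaps))
             for c in range(width)] for r in range(height)]


def hflip(grid):
    """Horizontal flip; applying it twice restores the original layout."""
    return [row[::-1] for row in grid]


def flatten(grid):
    """Flatten a 2D grid of per-pixel embeddings into a list of pixels."""
    return [cell for row in grid for cell in row]


def _unit(v):
    norm = math.sqrt(sum(a * a for a in v)) or 1.0
    return [a / norm for a in v]


def pixel_contrast_loss(emb_a, emb_b, temperature=0.1):
    """InfoNCE over pixels: pixel i in view A should match pixel i in view B.

    Assumption: positives are same-location pixels; all others are negatives.
    """
    a_n = [_unit(v) for v in emb_a]
    b_n = [_unit(v) for v in emb_b]
    total = 0.0
    for i, a in enumerate(a_n):
        logits = [sum(p * q for p, q in zip(a, b)) / temperature for b in b_n]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(a_n)


# Toy multi-scale fusion: 4x4, 2x2 and 1x1 maps brought to 4x4.
fine = [[float(4 * r + c) for c in range(4)] for r in range(4)]
mid = upsample_nearest([[10.0, 20.0], [30.0, 40.0]], 2)
coarse = upsample_nearest([[100.0]], 4)
weights = lagrange_weights([0.0, 1.0, 2.0], 0.5)
fused = fuse_scales([fine, mid, coarse], [0.0, 1.0, 2.0], 0.5)

# Toy flip consistency: embeddings from a flipped view must be flipped back
# before pixels are compared, otherwise the positives are misaligned.
grid_a = [[[1.0, 0.0], [0.0, 1.0]]]
grid_b = hflip(grid_a)  # stands in for an encoder applied to a flipped image
loss_aligned = pixel_contrast_loss(flatten(grid_a), flatten(hflip(grid_b)))
loss_unaligned = pixel_contrast_loss(flatten(grid_a), flatten(grid_b))
```

Two properties of this sketch are worth noting. Lagrange weights can be negative between nodes (here 0.375, 0.75, and -0.125), so the fused map is not a convex blend of the inputs. And the contrastive loss is small only when the flipped view has been mapped back to the original pixel grid, which is why the alignment step matters.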
