The increasing frequency and intensity of wildfires pose serious threats to ecosystems, property, and human safety worldwide. Accurate semantic segmentation of wildfire images is essential for real-time fire monitoring, spread prediction, and disaster response. However, existing deep learning methods rely heavily on large volumes of pixel-level annotated data, which are difficult and costly to obtain in real-world wildfire scenarios because of complex environments and urgent time constraints. To address this challenge, we propose a semi-supervised wildfire image segmentation framework that improves segmentation performance when annotations are limited by integrating multi-scale structural information fusion with pixel-level contrastive consistency learning. Specifically, a Lagrange Interpolation Module (LIM) constructs structured interpolation representations between multi-scale feature maps during the decoding stage, which enables effective fusion of spatial details and semantic information and improves the model’s ability to capture flame boundaries and complex textures. In addition, a Pixel Contrast Consistency (PCC) mechanism establishes pixel-level semantic constraints between CutMix and Flip augmented views, guiding the model to learn feature representations that are consistent within a class and discriminative between classes, thereby reducing the reliance on large labeled datasets. Extensive experiments on two public wildfire image datasets, Flame and D-Fire, demonstrate that our method consistently outperforms other approaches under various annotation ratios. For example, with only half of the labeled data, our model improves mIoU by 5.0% and 6.4% on the Flame and D-Fire datasets, respectively, compared to the baseline. This work provides technical support for efficient wildfire perception and response in practical applications.
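To make the idea of pixel-level contrastive consistency between augmented views more concrete, the following is a minimal NumPy sketch of a generic label-guided pixel InfoNCE loss applied to a view and its horizontally flipped counterpart. It is not the PCC formulation of this work, whose details are not given here. The function name `pixel_contrast_loss`, the temperature value, the toy prototypes, and the use of a plain flip (without CutMix) are all assumptions made for illustration.

```python
import numpy as np

def pixel_contrast_loss(feat_a, feat_b, labels, temperature=0.1):
    """Generic pixel-level InfoNCE loss (illustrative assumption, not the paper's PCC).

    feat_a, feat_b: (H, W, C) pixel embeddings of two spatially aligned views.
    labels: (H, W) integer class ids (ground truth or pseudo-labels).
    """
    c = feat_a.shape[-1]
    a = feat_a.reshape(-1, c)
    b = feat_b.reshape(-1, c)
    y = labels.reshape(-1)
    # L2-normalize so that dot products are cosine similarities.
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    logits = a @ b.T / temperature                     # (N, N) cross-view similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives: pixels in the other view that share the anchor's class.
    pos = (y[:, None] == y[None, :]).astype(np.float64)
    loss = -(pos * log_prob).sum(axis=1) / pos.sum(axis=1)
    return float(loss.mean())

# Toy example: a 4x4 map whose left half is class 0 and right half is class 1.
rng = np.random.default_rng(0)
labels = np.zeros((4, 4), dtype=int)
labels[:, 2:] = 1
protos = np.array([[1.0, 0.0], [0.0, 1.0]])            # hypothetical class prototypes
feat = protos[labels] + 0.05 * rng.normal(size=(4, 4, 2))

# Stand-in for features of the flipped input; flip back to realign the pixels.
feat_flipped_view = feat[:, ::-1, :]
feat_aligned = feat_flipped_view[:, ::-1, :]

loss_consistent = pixel_contrast_loss(feat, feat_aligned, labels)
# Deliberately inconsistent second view: class prototypes swapped.
loss_swapped = pixel_contrast_loss(feat, protos[1 - labels], labels)
print(loss_consistent, loss_swapped)
```

In this sketch the loss is small when same-class pixels in the two views have similar embeddings and large when the second view contradicts the labels, which is the behavior a consistency constraint of this kind is meant to encourage. In a semi-supervised setting, `labels` for unlabeled images would typically come from pseudo-labels, which is again an assumption rather than a detail stated above.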
Sun et al. studied this question.