What question did this study set out to answer?

This research aims to enhance road extraction accuracy from remote sensing images using a weakly supervised approach.

May 31, 2026 | Open Access

GeoRoad-UPerNet: Geo-1-Based Weakly Supervised Multispectral Road Extraction via Role-Aware Context Fusion and Semantic Regularization

Key Points

This research aims to enhance road extraction accuracy from remote sensing images using a weakly supervised approach.
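Developed GeoRoad-UPerNet, which uses 16-band Geo-1 multispectral imagery as the primary source and Sentinel-2 Level-2A imagery as auxiliary context.
Used OpenStreetMap road data as proxy supervision instead of dense manual ground truth.
Implemented three modules for multisource integration: the Geo Spectral Semantic Stem (GSSS), the Geo-Auxiliary Gated Fusion module (GAGF), and the Road Semantic Multi-Task Head (RSMH).
Achieved an IoU of 0.7204 and an F1-score of 0.8375 on the fixed source-domain benchmark, evaluated against OSM-derived proxy masks.
IoU, F1-score, and Precision improved by 6.29%, 3.65%, and 12.58%, respectively, over the UPerNet-MiT-B3 early-fusion baseline.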
Reduced boundary noise and background false positives in road extraction.
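
The reported scores can be cross-checked against one another. The sketch below assumes that all four metrics were computed from the same pooled pixel counts for a single binary road class (the source does not state the aggregation scheme). Under that assumption, F1 is the harmonic mean of Precision and Recall, and IoU follows from F1 through the standard Dice-Jaccard identity IoU = F1 / (2 - F1). The Precision and Recall values are taken from the abstract; the function names are illustrative only.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def iou_from_f1(f1: float) -> float:
    """Jaccard index (IoU) implied by a Dice/F1 score on the same binary mask."""
    return f1 / (2.0 - f1)


# Values reported in the abstract (measured against OSM-derived proxy masks).
precision, recall = 0.8092, 0.8678

f1 = f1_from_precision_recall(precision, recall)
iou = iou_from_f1(f1)

print(f"F1  = {f1:.4f}")
print(f"IoU = {iou:.4f}")
```

Rounded to four decimals, the computed values match the reported F1-score of 0.8375 and IoU of 0.7204, so the four figures are mutually consistent. If the metrics had instead been averaged per image, the identity would hold only approximately.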

Abstract

Extracting roads accurately from remote sensing images is important for map updates, traffic analysis, and infrastructure monitoring. Medium-resolution multispectral images provide useful surface and background information, but on their own they lack the spatial detail needed to retain narrow roads, intersection structures, and fine road topologies. To address this problem, this paper proposes GeoRoad-UPerNet, a Geo-1-centered weakly supervised multispectral framework for road extraction. In this framework, Geo-1 serves as the primary 16-band multispectral source, Sentinel-2 Level-2A imagery serves as auxiliary contextual support, and OpenStreetMap (OSM) road information is converted into proxy supervision rather than dense manual ground truth. GeoRoad-UPerNet contains three modules: a Geo Spectral Semantic Stem (GSSS), a Geo-Auxiliary Gated Fusion module (GAGF), and a Road Semantic Multi-Task Head (RSMH). GSSS strengthens road-sensitive multispectral responses in the Geo-1 branch. GAGF injects Sentinel-2 context through a Geo-centered gate instead of symmetric channel concatenation. RSMH imposes restrained hierarchy- and material-aware semantic regularization on the shared decoder representation during training. On the fixed source-domain benchmark, the complete model achieves an IoU of 0.7204, an F1-score of 0.8375, a Precision of 0.8092, and a Recall of 0.8678 against OSM-derived proxy masks. Relative to the UPerNet-MiT-B3 early-fusion baseline, IoU, F1-score, and Precision increase by 6.29%, 3.65%, and 12.58%, respectively. These results indicate that role-aware multisource organization improves road extraction under proxy supervision and reduces boundary noise and background false positives.
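
The abstract contrasts a Geo-centered gate with symmetric channel concatenation but does not give the GAGF equations. The following NumPy sketch is therefore not the paper's module. It is a generic gated-fusion illustration under stated assumptions: both branches have already been projected to the same shape `(C, H, W)`, the gate is a per-pixel linear map over channels followed by a sigmoid, and the gate is computed from the primary (Geo-1) features only. The names `gated_fusion`, `concat_fusion`, `w_gate`, and `b_gate` are hypothetical.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(geo_feat, aux_feat, w_gate, b_gate):
    """Asymmetric fusion: the primary branch decides how much auxiliary context enters.

    geo_feat, aux_feat: arrays of shape (C, H, W)
    w_gate: (C, C) weights of a 1x1-convolution-like channel mixing (assumed form)
    b_gate: (C,) bias
    """
    # Gate values in (0, 1), computed from the primary features only.
    logits = np.einsum("oc,chw->ohw", w_gate, geo_feat) + b_gate[:, None, None]
    gate = sigmoid(logits)
    # The primary features pass through unchanged; auxiliary context is added, scaled by the gate.
    return geo_feat + gate * aux_feat


def concat_fusion(geo_feat, aux_feat):
    """Symmetric alternative: stack both sources along the channel axis."""
    return np.concatenate([geo_feat, aux_feat], axis=0)


rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
geo = rng.standard_normal((C, H, W))   # stand-in for Geo-1 branch features
aux = rng.standard_normal((C, H, W))   # stand-in for Sentinel-2 branch features
w = 0.1 * rng.standard_normal((C, C))
b = np.zeros(C)

fused = gated_fusion(geo, aux, w, b)
stacked = concat_fusion(geo, aux)
print(fused.shape, stacked.shape)
```

In this sketch the fused tensor keeps the primary branch's channel count, and driving the gate toward zero recovers the Geo-1 features unchanged, whereas concatenation doubles the channels and treats both sources identically. That asymmetry is one plausible reading of "role-aware" fusion, not a confirmed description of GAGF.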
