With advances in Earth observation capabilities, the demand for large-scale mapping from remote sensing images has increased significantly. However, selecting an optimal image set for an area of interest (AOI) from a large collection of remote sensing images remains challenging. On the one hand, the selected images should have minimal redundancy and low cloud cover, which improves production efficiency and the effective coverage of mapping products. On the other hand, adjacent selected images should transition naturally so that the resulting mapping products appear visually cohesive. Most existing remote sensing image selection algorithms focus only on the former and pay little attention to visual consistency. Images from the same swath inherently offer advantages in both redundancy reduction and visual consistency. However, because a swath covers a larger area than a single scene, cloud cover can vary more widely within it, and its cloud distribution can be highly complex, which makes the relationships among swaths, images, and cloud cover difficult to manage. To address these issues, this paper proposes a novel image selection model, SwathSel. Candidate images are grouped through a composite grouping strategy based on swaths, cloud cover, and topological connectivity, thereby expanding the fundamental unit for image selection from individual scenes to connected image subsets. A dynamic adjustment mechanism is introduced to enhance grouping flexibility. Additionally, local and global swath consistency constraints are designed to strengthen visual consistency among images, and a subset evaluation module comprehensively assesses swath consistency, coverage, cloud cover, and metadata information. The final image set is then obtained through a greedy strategy combined with a rapid refinement technique.
Experiments were conducted on four datasets, and four quantitative metrics were designed to evaluate the visual consistency of the results. Compared with baseline models, SwathSel achieves lower redundancy and cloud cover while delivering superior visual consistency.
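The grouping-then-greedy idea described above can be illustrated with a minimal Python sketch. It is not the SwathSel implementation: the `Image` fields, the `max_cloud` threshold, the use of consecutive along-swath positions as a stand-in for topological connectivity, and the score (new AOI cells covered, weighted by one minus the mean cloud fraction) are all assumptions made for illustration. The dynamic adjustment mechanism, the swath consistency constraints, and the refinement step are omitted.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Image:
    # All fields are hypothetical; the paper's metadata model is not specified here.
    name: str
    swath: str        # swath identifier
    order: int        # position of the scene along its swath
    cloud: float      # cloud fraction in [0, 1]
    cells: frozenset  # AOI grid cells covered by the scene


def group_by_swath(images, max_cloud=0.3):
    """Group scenes into runs that share a swath, pass the cloud threshold,
    and sit at consecutive positions along the swath."""
    usable = sorted(
        (im for im in images if im.cloud <= max_cloud),
        key=lambda im: (im.swath, im.order),
    )
    groups = []
    for _, scenes in groupby(usable, key=lambda im: im.swath):
        run = []
        for im in scenes:
            # A gap in position breaks connectivity and starts a new group.
            if run and im.order != run[-1].order + 1:
                groups.append(run)
                run = []
            run.append(im)
        groups.append(run)
    return groups


def covered_cells(group):
    """Union of the AOI cells covered by all scenes in a group."""
    return set().union(*(im.cells for im in group))


def greedy_select(groups, aoi_cells):
    """Repeatedly pick the best-scoring group until nothing adds coverage."""
    uncovered = set(aoi_cells)
    remaining = list(groups)
    chosen = []

    def score(group):
        # Assumed score: newly covered cells, discounted by mean cloud fraction.
        gain = len(uncovered & covered_cells(group))
        mean_cloud = sum(im.cloud for im in group) / len(group)
        return gain * (1.0 - mean_cloud)

    while uncovered and remaining:
        best = max(remaining, key=score)
        if score(best) <= 0:
            break
        chosen.append(best)
        remaining.remove(best)
        uncovered -= covered_cells(best)
    return chosen, uncovered
```

Selecting whole groups rather than single scenes is what makes the subset the unit of selection in this sketch: a long, mostly clear run from one swath tends to outscore isolated scenes that cover the same cells.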
Zhang et al. (Fri,) studied this question.