The increasing availability of 3D building models in digital-city applications has made scalable and accurate 3D building model retrieval essential. However, existing methods often struggle to capture the global structure of building models and to produce stable retrieval results under viewpoint variations. To address these challenges, we propose a topological-perception-guided feature fusion framework with two complementary fusion schemes for different computational budgets. The topological perception features capture global structure and provide relatively viewpoint-stable information; they are fused with traditional features in the low-compute-budget scheme and with deep features in the high-compute-budget scheme. In addition, the topological perception features guide view selection and view grouping to improve retrieval stability. Experiments show that the traditional-feature fusion scheme improves retrieval accuracy by 8.0–25.7 percentage points, while the deep fusion scheme outperforms Multi-view Convolutional Neural Networks (MVCNNs) and Group-View Convolutional Neural Networks (GVCNNs) by 1.2 and 4.0 percentage points, respectively. These results suggest that incorporating topological perception as guidance for feature fusion strengthens global structural representation and supports viewpoint-robust retrieval for architecturally complex building models.
Zhang et al. (Mon,) studied this question.