Campus streetscapes are a key part of a university's everyday public realm, yet the same scene may be perceived positively in one dimension and negatively in another. To diagnose such multi-dimensional perceptual differences and translate them into actionable design evidence, this study develops an interpretable vision analytics framework for adaptive campus design. Using 72,733 Baidu Street View images collected from 41 campuses in mainland China, the study integrates ResNet-50-based perception prediction, spatial element extraction, XGBoost–SHAP-based mechanism interpretation, Kruskal–Wallis H testing, and GIS-based scene mapping. Supported by supplementary in situ validation, the analysis identifies six types of multi-dimensional perceptual differences. Sky, buildings, vegetation, hardscape, and terrain are the five most important spatial elements overall; of these, sky, buildings, and vegetation repeatedly emerge as the dominant core elements distinguishing the perceptual types. These elements do not act independently or linearly; they jointly shape the different types of multi-dimensional perceptual differences through nonlinear threshold effects and interactions. The perceptual difference types also cluster in recognizable campus scenes, including main roads, plazas, lawns, forest belts, and lakeside spaces. Based on these findings, scene-specific piecemeal optimization strategies are derived to support the coordinated enhancement of perceived safety, liveliness, and beauty. Overall, the study shows that campus perception is shaped by holistic spatial configurations rather than by the simple accumulation of isolated elements, and it provides a quantitative basis for iterative, feedback-oriented adaptive campus design.
Lin et al. (Mon,) studied this question.
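As an illustration of the Kruskal–Wallis H testing step named in the abstract, the following is a minimal sketch of how the tie-corrected H statistic could be computed. It assumes, for illustration only, that the test compares one spatial element's image share across perceptual difference types; the abstract does not spell out the exact variables tested. The function name `kruskal_wallis_h` and the sample values are hypothetical and are not taken from the study.

```python
from itertools import chain


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    `groups` is a list of samples (lists of numbers), one per group.
    Assumes at least two distinct values overall; otherwise the tie
    correction is zero and H is undefined.
    """
    pooled = sorted(chain.from_iterable(groups))
    n_total = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies,
    # and accumulate the tie term sum(t^3 - t) along the way.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j

    # H = 12 / (N (N + 1)) * sum(R_k^2 / n_k) - 3 (N + 1)
    rank_sum_term = sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)

    # Divide by the standard tie-correction factor.
    correction = 1 - tie_term / (n_total**3 - n_total)
    return h / correction


# Hypothetical sky proportions for three perceptual types (made-up values).
type_a = [0.10, 0.15, 0.20]
type_b = [0.25, 0.30, 0.35]
type_c = [0.40, 0.45, 0.50]
h_stat = kruskal_wallis_h([type_a, type_b, type_c])
print(f"H = {h_stat:.3f}")
```

Under the usual approximation, H is compared against a chi-squared distribution with (number of groups minus 1) degrees of freedom; a large H suggests that at least one perceptual type differs in the distribution of the element's share. In practice a library routine such as `scipy.stats.kruskal` would normally be used instead of a hand-written version.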