Understanding the complex relationship between the built environment and urban vitality is essential for evidence-based urban renewal. However, most existing studies rely on linear regression models that fail to capture the non-linear threshold effects inherent in urban systems and depend on costly proprietary datasets that limit reproducibility. This study proposes a scalable, open-data-driven framework to decode the non-linear mechanisms governing population-based urban vitality in Zhengzhou, a rapidly regenerating metropolis in Central China. Using Areas of Interest (AOIs) as functional spatial units to mitigate the Modifiable Areal Unit Problem (MAUP), we construct a multidimensional built environment indicator system (5D+S: Density, Diversity, Design, Distance to Transit, Destination Accessibility, and Surroundings) from multi-source open data, including 100 m WorldPop population grids, OpenStreetMap building vectors, Points of Interest (POIs), and transit station data. An explainable machine learning approach combining XGBoost with SHapley Additive exPlanations (SHAP) is employed to identify the relative importance of built environment factors and quantify their non-linear threshold effects on population-based urban vitality (operationally defined as residential population density derived from WorldPop 100 m grids). Across 3920 AOIs, XGBoost (R² = 0.846, RMSE = 0.104) substantially outperforms Ordinary Least Squares regression (R² = 0.634), confirming pervasive non-linear relationships, with stable 5-fold cross-validated R² = 0.713 ± 0.115. SHAP analysis reveals four dominant drivers: Distance to Commercial Core (DistCBD), Bus Station Density within 500 m (BusDen500), Green Coverage Ratio (GreenRatio), and Building Density (BD).
Critical thresholds are identified: vitality contributions decay sharply beyond approximately 4.3 km from the CBD; at least 4 bus stations within 500 m are required for meaningful transit benefit; building density delivers positive returns within a 2–30% range; and excessive green coverage above 8.5% within 500 m is associated with declining population-based vitality, a finding that reflects spatial competition between ecological land use and residential density rather than a negative effect of greenery per se. These findings provide quantitative design guidelines for precision urban renewal, moving beyond “the more, the better” planning assumptions to identify optimal intervention ranges.
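To make the attribution idea behind SHAP concrete, the following is a minimal sketch that computes exact Shapley values by brute force for a toy two-feature model. It is not the study's pipeline, which pairs XGBoost with SHAP. The function names, the baseline AOI, and the effect sizes (0.5, 0.3, 0.2, 0.1) are hypothetical choices made for illustration; only the 4.3 km and 4-station cut-offs are taken from the text above.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Brute-force enumeration over feature coalitions; feasible only for a
    handful of features. SHAP libraries use faster algorithms for tree models.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):
            # Shapley weight for a coalition of size k that excludes `name`.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Coalition members take the AOI's values; the rest stay at baseline.
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_f = dict(without)
                with_f[name] = x[name]
                total += weight * (predict(with_f) - predict(without))
        phi[name] = total
    return phi


def toy_vitality(f):
    """Hypothetical threshold model; effect sizes are invented for illustration."""
    near_cbd = f["dist_cbd_km"] <= 4.3      # cut-off quoted in the abstract
    good_bus = f["bus_stops_500m"] >= 4     # cut-off quoted in the abstract
    score = 0.5 if near_cbd else 0.1
    score += 0.3 if good_bus else 0.0
    if near_cbd and good_bus:
        score += 0.2                        # assumed interaction bonus
    return score


# Hypothetical AOI and reference point.
aoi = {"dist_cbd_km": 2.0, "bus_stops_500m": 6}
baseline = {"dist_cbd_km": 8.0, "bus_stops_500m": 1}

phi = shapley_values(toy_vitality, aoi, baseline)
for feature, value in phi.items():
    print(f"{feature}: {value:+.3f}")
```

The attributions sum to the difference between the model output for the AOI and for the baseline, which is the additivity property that lets SHAP values be read as per-feature contributions. In this toy case the interaction bonus is split equally between the two features.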
Lu et al. (Mon,) studied this question.