Accurate spatial–temporal crime prediction is a critical component of proactive public safety governance, yet it remains challenging due to complex dependency structures and severe data sparsity in real-world crime datasets. Most existing methods either focus on local spatial–temporal correlations or attempt to model global dependencies at fine-grained region levels, which limits their robustness under highly sparse and imbalanced crime distributions. In this paper, we propose GL-MoPA, a global–local modulated prototype attention framework for city-scale crime prediction. GL-MoPA integrates three key components. First, a local dependency modeling module is designed to capture fine-grained spatial and short-term temporal patterns. Second, a prototype-aware global attention mechanism aggregates region-level representations into semantically meaningful prototypes to efficiently model long-range dependencies. Third, a two-stage occurrence-aware prediction strategy decouples crime occurrence estimation from intensity regression to explicitly address data sparsity. We evaluate GL-MoPA on a real-world crime dataset from New York City covering four major crime categories. The experimental results show that GL-MoPA achieves state-of-the-art performance, consistently outperforming both classical statistical models and recent deep learning baselines. In particular, a robustness analysis shows substantial error reductions in sparse regions, while ablation studies reveal the complementary roles of individual model components. These results indicate that GL-MoPA provides an effective and robust solution for spatial–temporal crime forecasting under sparse-data scenarios.
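As a rough illustration of the second and third components, the following NumPy sketch shows one way that prototype-based global attention and a two-stage occurrence-aware prediction could be wired together. It is not the GL-MoPA implementation: the abstract does not specify the architecture, so the function names (`prototype_attention`, `two_stage_predict`), the soft-assignment scheme, the prototype matrix `P`, and the 0.5 occurrence threshold are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def prototype_attention(H, P):
    """Hypothetical prototype attention (an assumption, not the paper's layer).

    H: (N, d) region embeddings; P: (K, d) prototype vectors, with K << N.
    Regions exchange information through K prototypes, so the cost is
    O(N * K) rather than the O(N^2) of full region-to-region attention.
    """
    d = H.shape[1]
    # Soft assignment of each region to the prototypes.
    A = softmax(H @ P.T / np.sqrt(d), axis=1)          # (N, K)
    # Summarize regions into prototypes as an assignment-weighted mean.
    Z = (A.T @ H) / (A.sum(axis=0)[:, None] + 1e-8)    # (K, d)
    # Each region reads back a mixture of the prototype summaries.
    return A @ Z                                       # (N, d)


def two_stage_predict(occ_logit, intensity, threshold=0.5):
    """Hypothetical two-stage head: occurrence gate, then intensity.

    Stage 1 estimates whether any crime occurs; stage 2 regresses the count.
    The intensity is kept only where the occurrence probability reaches the
    threshold, and negative regression outputs are clipped to zero.
    """
    p = 1.0 / (1.0 + np.exp(-occ_logit))
    return np.where(p >= threshold, np.maximum(intensity, 0.0), 0.0)


rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))   # 6 regions, 4-dimensional embeddings
P = rng.normal(size=(2, 4))   # 2 prototypes
G = prototype_attention(H, P)

pred = two_stage_predict(np.array([-3.0, 2.0, 0.0]),
                         np.array([5.0, 1.5, -1.0]))
```

Gating the regression output with a separate occurrence estimate keeps the many zero-count regions from dominating the intensity loss, which is the motivation the abstract gives for decoupling the two tasks.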
Zhao et al. (Sat,) studied this question.