The low-altitude airspace used by bird flocks is increasingly shared with unmanned aerial vehicles (UAVs), creating safety risks that call for accurate trajectory forecasting. However, existing vision-based methods typically treat trajectory prediction and UAV detection as separate tasks, assume light-tailed Gaussian noise, and rely on heavy backbones. For bird trajectory forecasting, these choices hinder uncertainty calibration and embedded deployment in ground-based monocular surveillance. In this work, we propose a unified framework for low-altitude monitoring. Its core, Mini-BirdFormer, combines a lightweight Transformer encoder with a Student-t mixture density head to model heavy-tailed flight dynamics and produce calibrated uncertainty. Experiments on a real-world dataset show that the model achieves strong long-horizon performance with only 1.05 million parameters, attaining a minADE of 0.785 m and reducing the negative log-likelihood (lower is better) from 1.25 for a Gaussian Long Short-Term Memory (LSTM) baseline to −2.01. Crucially, it runs at 616 FPS, enabling low-latency inference on resource-constrained platforms. Additionally, a system-level extension supports zero-shot UAV detection via open-vocabulary learning, attaining 92% recall with no false alarms. These results demonstrate that combining heavy-tailed probabilistic modeling with a compact backbone provides a practical, deployable approach to monitoring shared airspace.
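To illustrate why a Student-t mixture density head can lower the negative log-likelihood relative to a Gaussian one, the following minimal sketch scores a single 2D position under both kinds of mixture. It is not the Mini-BirdFormer implementation: the per-axis independence, the function names (`student_t_logpdf`, `mixture_nll`), and all parameter values are assumptions chosen for illustration only.

```python
import math

def student_t_logpdf(x, mu, sigma, nu):
    """Log-density of a univariate location-scale Student-t distribution."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

def gaussian_logpdf(x, mu, sigma):
    """Log-density of a univariate Gaussian distribution."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def mixture_nll(point, components, logpdf):
    """Negative log-likelihood of a 2D point under a mixture.

    Assumption: the x and y axes are independent within each component.
    components: list of (weight, (mu_x, mu_y), (sigma_x, sigma_y), extra_args),
    where extra_args is passed on to logpdf (e.g. the degrees of freedom).
    """
    terms = []
    for weight, mu, sigma, extra in components:
        log_density = sum(logpdf(p, m, s, *extra)
                          for p, m, s in zip(point, mu, sigma))
        terms.append(math.log(weight) + log_density)
    return -logsumexp(terms)

# Hypothetical two-component prediction (values are illustrative only).
t_components = [
    (0.7, (0.0, 0.0), (0.5, 0.5), (3.0,)),   # nu = 3 gives heavy tails
    (0.3, (1.0, 1.0), (0.5, 0.5), (3.0,)),
]
gauss_components = [(w, mu, sigma, ()) for w, mu, sigma, _ in t_components]

# An abrupt manoeuvre lands the true position far from both component means.
outlier = (5.0, 5.0)
nll_t = mixture_nll(outlier, t_components, student_t_logpdf)
nll_gauss = mixture_nll(outlier, gauss_components, gaussian_logpdf)
print(f"Student-t NLL: {nll_t:.2f}, Gaussian NLL: {nll_gauss:.2f}")
```

In this toy setting the Gaussian mixture assigns the outlier a far larger penalty than the Student-t mixture, because the Gaussian log-density falls off quadratically with distance while the Student-t log-density falls off only logarithmically. This is the intuition behind pairing a heavy-tailed head with abrupt flight manoeuvres; it does not reproduce the NLL figures reported above.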
Song et al. (Sun,) studied this question.