What question did this study set out to answer?

The aim is to assess how model-based data integration can enhance species distribution models for hummingbirds with limited occurrence data.

March 26, 2026 · Open Access

Model‐Based Data Integration Improves Species Distribution Models for Data‐Deficient and Narrow‐Ranged Hummingbird Species

Key Points

Applied a data integration technique for modeling distributions of 98 hummingbird species.
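Fitted species distribution models using either presence-absence data from eBird or presence-only data from eBird and GBIF, and compared them with integrated models that use both data types.
Fitted generalized linear mixed-effects models and validated them with spatial block cross-validation and expert range map adjusted validation.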
Conducted an experiment with thinned datasets of 47 abundant species to evaluate model performance under varying data deficiency levels.
Compared with presence-absence models, data integration improved model performance for small-ranging and data-deficient species whose presence-absence data poorly covered the environmental conditions in the study area.
Even so, the difference from presence-absence models was very small.
In the thinning experiment, even a small amount of presence-only data improved the predictive accuracy of integrated models relative to presence-absence models.
Compared with presence-only models, data integration improved models across all species, especially data-rich species with large geographical ranges.

Abstract

Aim: For species with narrow ranges or low population sizes, a deficiency of species occurrence records can limit the capacity to build accurate species distribution models (SDMs). Model‐based integration of data from multiple sources has been offered as a solution to improve predictions of species' distributions at large scales, especially for data‐deficient species, but clear empirical demonstrations of this are lacking.

Location: South and Central America.

Methods: We applied a state‐of‐the‐art data integration technique to model the distributions of 98 hummingbird species. We fitted SDMs using either presence‐absence (PA) data from eBird or presence‐only (PO) data from eBird and the Global Biodiversity Information Facility (GBIF) and compared them to integrated SDMs, which utilise both PA and PO data. We fitted generalised linear mixed‐effects models and validated them with spatial block cross‐validation and expert range map adjusted validation. We also conducted an experiment using artificially thinned datasets of 47 sufficiently abundant species to assess model performance under different levels of data deficiency.

Results: Data integration improved model performance compared with PA models for small‐ranging and data‐deficient species for which PA data poorly covered the environmental conditions in the study area. Still, the difference from PA models was very small. The thinning experiment showed that even a small amount of PO data in data integration improved the predictive accuracy in comparison to PA models. In comparison to PO models, data integration improved models over all species, but especially for data‐rich species with large geographical ranges.

Main Conclusions: Overall, data integration enables a more comprehensive capture of available species information and can improve range predictions in comparison to conventional modelling methods.
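
The abstract names spatial block cross‐validation without describing its set-up. As a rough illustration of the general idea only, the sketch below groups occurrence points into square grid blocks and assigns whole blocks to folds, so that nearby points never end up split between training and test data. The function name, the 5-degree block size, the number of folds and the coordinates are all assumptions made for this example and are not taken from the study.

```python
import math
import random


def spatial_block_folds(coords, block_size_deg=5.0, n_folds=5, seed=0):
    """Assign each (lon, lat) point to a cross-validation fold via grid blocks.

    Hypothetical helper: block size and fold count are illustrative
    assumptions, not the settings used in the study.
    """
    # Map each point to the grid cell (block) that contains it.
    block_ids = [
        (math.floor(lon / block_size_deg), math.floor(lat / block_size_deg))
        for lon, lat in coords
    ]
    # Shuffle the distinct blocks, then deal them out to folds round-robin,
    # so that all points in one block always share a fold.
    blocks = sorted(set(block_ids))
    random.Random(seed).shuffle(blocks)
    fold_of_block = {block: i % n_folds for i, block in enumerate(blocks)}
    return [fold_of_block[b] for b in block_ids]


# Made-up occurrence coordinates (lon, lat), for illustration only.
points = [(-75.2, 4.1), (-75.9, 4.8), (-60.3, -3.2), (-61.0, -3.9), (-48.5, -15.7)]
folds = spatial_block_folds(points, block_size_deg=5.0, n_folds=3)
```

Holding out one fold at a time then tests a model on regions it has not seen, which tends to give a more conservative estimate of predictive accuracy than a random split when occurrence records are spatially clustered.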
