What question did this study set out to answer?

The aim is to improve forecasting of count data by addressing limitations of existing models related to varying dispersion.

February 19, 2026 · Open Access

Forecasting Count Data With Varying Dispersion: A Latent‐Variable Approach

Key Points

Surveys existing approaches to modeling count data with varying dispersion.
Develops a latent-variable model based on a discrete log-normal distribution.
Provides scalable algorithms for fitting the model efficiently.
Reduces computation time by 98% in simulation relative to a leading competitor, a Conway–Maxwell–Poisson regression model.
Achieves lower squared prediction error than the other methods in a grocery sales forecasting case study.
Produces prediction intervals that are 24% narrower on average than those of a negative binomial model, without a commensurate decrease in coverage.

Abstract

Count data, such as product sales and disease case counts, are common in business forecasting and many areas of science. Although the Poisson distribution is the best known model for such data, its use is severely limited by its assumption that the dispersion is a fixed function of the mean, which rarely holds in real‐world scenarios. This assumption is especially problematic in forecasting because it compromises risk assessment with unreliable estimates of predictive uncertainty. While various alternatives exist, they have significant limitations of their own: They may work only in specific settings (such as overdispersed data), lack interpretability, or require extensive computation, rendering them unusable on data sets of moderate to large size. In this paper, we survey the current landscape of count data modeling in the presence of varying dispersion, develop extensions to previously proposed methods, and introduce a computationally efficient latent‐variable approach based on a discrete log‐normal distribution. The method allows for unconstrained dispersion, ranging from underdispersion to overdispersion (relative to the Poisson model), that may vary as a function of covariates. We develop scalable algorithms for fitting this model, resulting in a 98% reduction in computation time in our simulation relative to a leading competitor: a Conway–Maxwell–Poisson regression model. We compare the methods in simulation and a case study in the area of grocery sales forecasting. In the case study, the discrete log‐normal model outperforms the other methods in terms of squared prediction error. Compared to a negative binomial model, it produces prediction intervals that are 24% narrower on average without a commensurate decrease in coverage. In practice, these improvements would enable more precise risk assessment and resource allocation in business, epidemiology, and other domains.
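To make the idea of unconstrained dispersion concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes one plausible construction of a discrete log-normal count: a latent normal variable is exponentiated and rounded down to an integer. The floor-based discretization, the function names `sample_discrete_lognormal` and `dispersion_index`, and the parameter values are all assumptions chosen for illustration. The variance-to-mean ratio, which is 1 for Poisson data, shows how the latent scale alone can move the counts from underdispersion to overdispersion.

```python
import numpy as np


def sample_discrete_lognormal(mu, sigma, size, rng):
    """Draw counts Y = floor(exp(Z)) with latent Z ~ Normal(mu, sigma^2).

    The floor-based discretization is an assumption made for illustration;
    the paper's exact construction may differ.
    """
    z = rng.normal(loc=mu, scale=sigma, size=size)
    return np.floor(np.exp(z)).astype(np.int64)


def dispersion_index(y):
    """Variance-to-mean ratio; a Poisson sample has a ratio close to 1."""
    return y.var() / y.mean()


rng = np.random.default_rng(0)
mu = np.log(20.0)  # latent location, so counts are centered near 20

# A small latent scale gives counts that vary less than a Poisson would.
under = sample_discrete_lognormal(mu, 0.1, 100_000, rng)
# A large latent scale gives counts that vary more than a Poisson would.
over = sample_discrete_lognormal(mu, 0.6, 100_000, rng)

print("dispersion index, sigma=0.1:", dispersion_index(under))
print("dispersion index, sigma=0.6:", dispersion_index(over))
```

In a regression setting, both `mu` and `sigma` could be made functions of covariates, which is one way dispersion might vary across observations as the abstract describes.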
