Accurate monthly sales forecasting is a critical challenge for retail businesses worldwide, yet most organisations still rely on manual spreadsheet methods. This study presents a forecasting model that combines time series decomposition, comprehensive feature engineering, and gradient boosting. The model was developed and validated on real operational monthly sales data from a Kazakhstani retail business, a market that has received virtually no attention in the academic forecasting literature. A two-phase methodology is adopted. First, Pearson and Spearman correlation tests characterise the temporal structure of the data. Second, a feature engineering framework is built and evaluated on a held-out test set; it combines sales history lags, rolling statistics, cyclic calendar encoding, price and promotional variables, inventory features, and locally specific indicators (including Ramadan). LightGBM achieves the strongest performance, explaining 81% of total sales variance (R² = 0.81, RMSE = 1,996) and substantially outperforming all classical approaches. Feature importance analysis reveals that incoming stock volume, product category, and selling price are more predictive than purely temporal features. A key design principle is adaptability: the feature set can be extended to any country or region by incorporating locally relevant calendar events, holidays, and cultural demand drivers. The authors plan to release the model as a publicly available forecasting application suitable for businesses globally.
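To make three of the feature families named above concrete (sales history lags, rolling statistics, and cyclic calendar encoding), the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the lag set (1, 2, 3), the 3-month window, and the toy sales figures are all assumptions chosen for illustration. The price, promotion, inventory, and Ramadan features would be added as further columns in the same way.

```python
import math
from statistics import mean


def cyclic_month(month):
    """Map a calendar month (1..12) to (sin, cos) on the unit circle,
    so that December and January end up as neighbours."""
    angle = 2 * math.pi * (month - 1) / 12
    return math.sin(angle), math.cos(angle)


def build_features(sales, months, lags=(1, 2, 3), window=3):
    """Build one feature row per month that has enough history.

    sales  -- monthly sales totals in chronological order
    months -- calendar month (1..12) of each entry in `sales`
    Lag set and window size are illustrative assumptions.
    """
    start = max(max(lags), window)
    rows = []
    for t in range(start, len(sales)):
        # Lag features: sales observed k months before month t.
        row = {f"lag_{k}": sales[t - k] for k in lags}
        # Rolling mean over the months strictly before t, so the
        # target value never leaks into its own features.
        row[f"roll_mean_{window}"] = mean(sales[t - window:t])
        row["month_sin"], row["month_cos"] = cyclic_month(months[t])
        row["target"] = sales[t]
        rows.append(row)
    return rows


# Hypothetical toy series spanning a year boundary (October to March).
sales = [100, 120, 90, 110, 130, 150]
months = [10, 11, 12, 1, 2, 3]
rows = build_features(sales, months)

for row in rows:
    print(row)
```

Each resulting row could then be passed to a gradient boosting regressor such as LightGBM. The sine and cosine pair is used instead of the raw month number because it keeps the distance between December and January equal to the distance between any other pair of adjacent months.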
Bibinur et al. (Fri,) studied this question.