What question did this study set out to answer?

The research aims to develop predictive models for estimating SCA scores based on processing and production data in specialty coffee.

March 3, 2026 · Open Access

Prediction of SCA Scores in Specialty Coffee Using Machine Learning.

Key Points

Developed regression models using the Random Forest (RF) and XGBoost (XGB) algorithms
Collected processing and production data between 2019 and 2023
Compared three approaches: the complete variable set, Principal Component Analysis (PCA), and selection of the seven most relevant variables
The RF model with all variables performed best (MAE = 0.80, RMSE = 1.03, R² = 0.53)
Models with only seven predictors performed nearly as well (MAE = 0.81, RMSE = 1.06, R² = 0.50)
PCA-based models performed worse than variable-selection models
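
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how MAE, RMSE, and R² are computed for a regression model. The score values are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math

def mae(y_true, y_pred):
    # Mean absolute error: average size of the prediction error, in SCA points.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root mean squared error: like MAE, but penalizes large misses more heavily.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    # Coefficient of determination: share of score variance explained by the model.
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical cupping scores and model predictions (not from the paper).
cupped = [84.0, 86.0, 82.0, 88.0]
predicted = [85.0, 85.0, 83.0, 87.0]

print(mae(cupped, predicted), rmse(cupped, predicted), r2(cupped, predicted))
```

Read this way, an MAE of about 0.80 means predictions were off by roughly 0.8 SCA points on average, and an R² of about 0.5 means the model explained roughly half of the variance in the scores.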

Abstract

Coffee is a major global commodity, with specialty coffees valued for their quality, assessed through standardized sensory protocols. The SCA (Specialty Coffee Association) score is a key indicator of commercial value, but sensory evaluation is resource-intensive and subject to variability. This study developed predictive models to estimate SCA scores from processing and production-related variables collected between 2019 and 2023, covering reception, fermentation, pulping, washing, drying, storage, and contextual production information. Random Forest (RF) and XGBoost (XGB) regression algorithms were applied using three approaches: complete variable set, Principal Component Analysis (PCA), and selection of the seven most relevant variables. The RF model with all variables achieved the best performance (MAE = 0.80; RMSE = 1.03; R² = 0.53). However, models using only seven predictors achieved nearly equivalent results (MAE = 0.81; RMSE = 1.06; R² = 0.50), with RF and XGB showing RMSE around 1.05 and R² above 0.50. PCA-based models performed worse. In conclusion, variable selection proved more efficient and robust than PCA, enabling moderate but practically relevant prediction of SCA scores with reduced model complexity in specialty coffee production.

PRACTICAL APPLICATIONS: This research shows that machine learning models can help predict coffee quality scores using processing data. Such tools may support producers and cooperatives in monitoring quality earlier and more efficiently, reducing reliance on extensive sensory tests and improving decision-making in specialty coffee production.
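
The three approaches named in the abstract (complete variable set, PCA, and a seven-variable subset) can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the 75/25 split and hyperparameters are arbitrary, the seven PCA components and the use of Random Forest impurity importances to rank variables are assumptions (the summary does not say how the seven variables were chosen), and XGBoost is omitted to keep the dependencies small.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the processing/production table (NOT the study's data):
# 20 numeric variables, of which only the first five influence the score.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))
weights = np.array([1.0, 0.8, 0.6, 0.4, 0.2])
y = 84.0 + X[:, :5] @ weights + rng.normal(scale=1.0, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

def make_rf():
    # Hyperparameters are illustrative assumptions.
    return RandomForestRegressor(n_estimators=200, random_state=0)

def evaluate(model, X_tr, X_te):
    # Fit on the training split and report the three metrics on the test split.
    model.fit(X_tr, y_train)
    pred = model.predict(X_te)
    return {
        "MAE": float(mean_absolute_error(y_test, pred)),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
        "R2": float(r2_score(y_test, pred)),
    }

results = {}

# Approach 1: complete variable set.
results["all_variables"] = evaluate(make_rf(), X_train, X_test)

# Approach 2: PCA on standardized variables, then RF on the components.
pca_rf = make_pipeline(StandardScaler(), PCA(n_components=7), make_rf())
results["pca"] = evaluate(pca_rf, X_train, X_test)

# Approach 3: keep the seven variables ranked highest by RF importance
# (ranking computed on the training split only, to avoid leakage).
ranker = make_rf().fit(X_train, y_train)
top7 = np.argsort(ranker.feature_importances_)[::-1][:7]
results["top7"] = evaluate(make_rf(), X_train[:, top7], X_test[:, top7])

for name, m in results.items():
    print(f"{name:>14}: MAE={m['MAE']:.2f} RMSE={m['RMSE']:.2f} R2={m['R2']:.2f}")
```

Because PCA builds components without reference to the target, it can discard variance that matters for prediction, whereas importance-based selection keeps the original variables most associated with the score. The numbers this sketch prints come from synthetic data and say nothing about the study's results.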


Cite This Study

Ferraz et al. studied this question.

synapsesocial.com/papers/69a67efaf353c071a6f0ab73
https://doi.org/10.1111/1750-3841.70946
