What question did this study set out to answer?

This study aims to develop a Green AI framework that balances machine learning accuracy with energy and environmental efficiency.

July 1, 2026 | Open Access

Beyond accuracy: a multi-dimensional green AI framework for sustainable machine learning—energy, carbon, and performance trade-offs in SMS spam detection

Key Points

Evaluated 10 machine learning models on the SMS Spam Collection dataset (5,169 messages) using the Multidimensional Green AI Framework.
Assessed models across three dimensions: classification performance, operational efficiency, and environmental sustainability.
Measured metrics included MCC, F1-score, inference latency, RAM usage, model size, energy consumption, and carbon footprint.
DistilBERT achieved the highest performance (99.13% accuracy), but at a substantially higher resource cost.
Naive Bayes and Logistic Regression delivered competitive performance with significantly lower resource consumption, while XGBoost offered a balanced trade-off between accuracy and efficiency.
Training carbon emissions for DistilBERT were approximately 1,000 times higher than those of the most efficient models.
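
The performance and latency metrics named in the key points can be computed directly from confusion counts and per-request timings. The following is a minimal Python sketch, not the study's own code: the function names are hypothetical, and the nearest-rank percentile is an assumed convention, since the summary does not say how p95 latency was computed.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1-score (harmonic mean of precision and recall) for the positive class."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def p95_latency(latencies_ms: list[float]) -> float:
    """95th-percentile latency using the nearest-rank method (assumed convention)."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n avoids floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

MCC uses all four cells of the confusion matrix, which is why it is often preferred over accuracy on imbalanced data such as spam collections.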

Abstract

Until recently, machine learning research has primarily focused on prediction accuracy, often neglecting computational efficiency and environmental impact. In this study, 10 models encompassing classical machine learning, ensemble learning, and deep learning methods were evaluated on the SMS Spam Collection dataset (5,169 messages) using the proposed Multidimensional Green AI Framework. This framework assesses models in three dimensions: (i) classification performance (MCC and F1-score), (ii) operational efficiency (p95 inference latency, RAM usage, and model size), and (iii) environmental sustainability (energy consumption in Wh and carbon footprint in kg CO₂). The results reveal a clear accuracy–sustainability trade-off. Although DistilBERT achieved the highest performance (99.13% accuracy, 0.9603 MCC), its marginal gains over simpler models come at a substantial environmental and computational cost. Its total pipeline time is approximately 1,720 times longer than that of Naive Bayes, and its training carbon emissions are approximately 1,000 times higher than those of the most efficient models. Furthermore, its CO₂ per one million inferences is 88 times greater than that of Logistic Regression. In contrast, classical models such as Naive Bayes and Logistic Regression demonstrated competitive performance with significantly lower resource consumption, while ensemble methods, particularly XGBoost, offered a balanced trade-off between accuracy and efficiency. These findings highlight that model selection should not rely solely on accuracy, but must also consider efficiency and environmental impact. Accordingly, this study proposes a practical and environmentally aware Green AI framework for use-specific model selection, supporting classical models for mobile/IoT environments, ensemble methods for edge computing, and deep learning approaches for cloud-based systems where higher resource consumption is acceptable.
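
The use-specific selection idea described in the abstract can be illustrated with a small sketch: convert measured energy to a carbon estimate, discard models that exceed a deployment budget, and keep the best-performing survivor. This is one possible reading under stated assumptions, not the paper's procedure. The `Candidate` fields, the `select` function, the budget parameters, and the grid carbon intensity argument are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """Measured profile of one model (hypothetical structure)."""
    name: str
    mcc: float
    p95_latency_ms: float
    model_size_mb: float
    energy_wh_per_million: float  # energy for one million inferences, in Wh


def carbon_kg(energy_wh: float, grid_kg_per_kwh: float) -> float:
    """Convert energy in Wh to kg CO2 using an assumed grid carbon intensity."""
    return (energy_wh / 1000.0) * grid_kg_per_kwh


def select(
    candidates: Iterable[Candidate],
    max_latency_ms: float,
    max_size_mb: float,
    max_carbon_kg_per_million: float,
    grid_kg_per_kwh: float,
) -> Optional[Candidate]:
    """Return the highest-MCC candidate that fits every budget, or None."""
    feasible = [
        c
        for c in candidates
        if c.p95_latency_ms <= max_latency_ms
        and c.model_size_mb <= max_size_mb
        and carbon_kg(c.energy_wh_per_million, grid_kg_per_kwh)
        <= max_carbon_kg_per_million
    ]
    return max(feasible, key=lambda c: c.mcc, default=None)
```

Tight budgets, such as those of a mobile or IoT target, would tend to leave only lightweight models in the feasible set, while loose cloud-style budgets would let the most accurate model win. The budgets themselves are deployment decisions and are not specified in the source.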
