What question did this study set out to answer?

The study set out to evaluate how accurately different predictive models, from naive temporal baselines to machine learning and classical time-series methods, forecast data center power consumption.

April 4, 2026 | Open Access

Comparative Evaluation of Machine Learning and Temporal Models for Data Center Power Prediction

Key Points

Goal: evaluate how accurately different predictive models forecast data center power consumption.
Used real-world Google cluster trace data: 8,927 temporally ordered samples aggregated at 5-minute intervals.
Compared naive temporal baselines, linear regression, Random Forest, and classical time-series models (AR(1) and ARIMA).
Employed a time-based train-test split to preserve temporal dependencies and avoid data leakage.
Evaluated model performance using Root Mean Squared Error (RMSE).
The naive temporal model achieved the lowest prediction error, outperforming both machine learning and classical time-series approaches.
Linear regression performed comparably to the naive temporal model.
Random Forest slightly underperformed linear regression.
Classical time-series models and CPU-only models showed significantly higher error, indicating limited predictive capability.
Temporal stability plays a dominant role in power behavior, particularly in low-utilization environments; workload features alone provide limited predictive value.
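
The evaluation setup in the points above (time-based split, naive temporal baseline, RMSE) can be sketched as follows. This is a minimal illustration, not the study's code: it assumes the naive temporal model is a persistence forecast (each interval is predicted as the previous interval's power), the 80/20 split fraction is an assumption, and the power values are made up for the example.

```python
import math


def time_based_split(series, train_fraction=0.8):
    """Split an ordered series at a cut point, without shuffling.

    The train_fraction default is an assumption for illustration only.
    """
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def naive_forecast(last_train_value, test):
    """Persistence forecast: predict each value as the one observed just before it."""
    return [last_train_value] + list(test[:-1])


# Hypothetical 5-minute power readings (arbitrary units), in time order.
power = [100.0, 101.0, 101.0, 102.0, 101.0, 101.0, 100.0, 100.0, 101.0, 101.0]

train, test = time_based_split(power, train_fraction=0.8)
predictions = naive_forecast(train[-1], test)
naive_rmse = rmse(test, predictions)
print(f"naive RMSE: {naive_rmse:.4f}")
```

Because the test segment always comes after the training segment in time, no future observation can influence a prediction. Any more complex model would be scored with the same rmse function on the same test segment, so the naive figure acts as the reference to beat.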

Abstract

This study presents a comparative evaluation of machine learning and temporal models for predicting power consumption in data centers. Using real-world Google cluster trace data comprising 8,927 temporally ordered samples, we investigate the effectiveness of multiple predictive approaches, including naive temporal baselines, linear regression, Random Forest, and classical time-series models such as AR(1) and ARIMA. The dataset is constructed using aggregated 5-minute intervals with features including CPU utilization, lagged CPU values, lagged power, and derived feature transformations. A time-based train-test split is employed to preserve temporal dependencies and avoid data leakage. Model performance is evaluated using Root Mean Squared Error (RMSE). Experimental results show that a simple naive temporal model achieves the lowest prediction error, outperforming both machine learning and classical time-series approaches. Linear regression demonstrates comparable performance, while Random Forest slightly underperforms. Traditional time-series models and CPU-only models exhibit significantly higher error, indicating limited predictive capability. These findings suggest that temporal stability plays a dominant role in data center power behavior, particularly in low-utilization environments, and that workload features alone provide limited predictive value. The study highlights the importance of baseline temporal patterns over model complexity for accurate power prediction.
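
The feature construction described in the abstract (current CPU utilization, lagged CPU, lagged power) might look like the sketch below. It is a hedged illustration under stated assumptions: the function name build_lagged_rows, the feature names, the single-lag default, and the sample values are all hypothetical, and the "derived feature transformations" are omitted because the abstract does not specify them.

```python
def build_lagged_rows(cpu, power, n_lags=1):
    """Build (features, target) pairs from two aligned, time-ordered series.

    Each target power[t] is paired with cpu[t] and with values from earlier
    intervals only, so the target itself never appears among the features.
    """
    if len(cpu) != len(power):
        raise ValueError("cpu and power must have the same length")
    rows = []
    # The first n_lags intervals are skipped because they lack a full history.
    for t in range(n_lags, len(power)):
        features = {"cpu": cpu[t]}
        for k in range(1, n_lags + 1):
            features[f"cpu_lag{k}"] = cpu[t - k]
            features[f"power_lag{k}"] = power[t - k]
        rows.append((features, power[t]))
    return rows


# Hypothetical aggregated 5-minute samples (CPU as a fraction, power in arbitrary units).
cpu = [0.10, 0.12, 0.11, 0.15, 0.14]
power = [100.0, 101.0, 100.5, 102.0, 101.5]

rows = build_lagged_rows(cpu, power, n_lags=1)
for features, target in rows:
    print(features, "->", target)
```

Note that with a single lag, the power_lag1 feature is exactly the persistence forecast, which is one way to see why a regression on these features could end up close to the naive model.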
