What question did this study set out to answer?

To quantify the energy consumption of inference in large language models (LLMs).

January 14, 2026

Measuring Energy Consumption of LLMs Inferences

Key Points

Set out to quantify the energy consumption of large language model (LLM) inference.
Measured energy use across various transformer model architectures.
Utilized high-performance computing resources for accurate assessments.
Applied state-of-the-art inference frameworks and high-end GPUs.
Quantified substantial energy consumption associated with LLM inference.
Noted that the environmental impact of inference may exceed that of model training.

Abstract

Large Language Models (LLMs) offer unprecedented language understanding and generation capabilities. However, these advancements come at a cost. LLMs rely heavily on high-performance computing, not only for training but also for inference. This dependence translates into substantial energy consumption, with inference potentially exceeding the already significant environmental impact of training, which can generate thousands of tons of CO2eq. Therefore, quantifying the energy consumption of LLM inference is crucial. This research focuses on measuring this energy use across a wide range of transformer models, from smaller architectures to cutting-edge models like DeepSeek V3/R1. We utilize state-of-the-art inference frameworks and high-end GPUs to ensure accurate assessments.
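The abstract does not describe the measurement procedure itself. As an illustration only, one common way to turn GPU power readings into an energy figure is to sample power at known timestamps and integrate over the duration of the inference run. The sketch below assumes this approach; the function names, the sample values, and the token count are hypothetical and are not taken from the study.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate sampled power (watts) over time (seconds) using the trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        # Area of the trapezoid between two consecutive power samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def joules_per_token(energy_j: float, generated_tokens: int) -> float:
    """Normalize total energy by the number of generated tokens."""
    if generated_tokens <= 0:
        raise ValueError("generated_tokens must be positive")
    return energy_j / generated_tokens


# Hypothetical power trace for one inference request (not measured data).
timestamps = [0.0, 1.0, 2.0, 3.0, 4.0]        # seconds
power = [300.0, 400.0, 400.0, 400.0, 300.0]   # watts

energy = energy_joules(timestamps, power)      # joules
energy_wh = energy / 3600.0                    # watt-hours
per_token = joules_per_token(energy, 100)      # assuming 100 generated tokens

print(f"{energy:.1f} J, {energy_wh:.4f} Wh, {per_token:.1f} J/token")
```

Normalizing by generated tokens is one possible way to compare models of different sizes on the same footing. Other normalizations, such as per request or per prompt token, are equally plausible.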
