Large Language Models (LLMs) offer unprecedented language understanding and generation capabilities. However, these advancements come at a cost. LLMs rely heavily on high-performance computing, not only for training but also for inference. This dependence translates into substantial energy consumption, with inference potentially exceeding the already significant environmental impact of training, which can generate thousands of tons of CO2eq. Therefore, quantifying the energy consumption of LLM inference is crucial. This research focuses on measuring this energy use across a wide range of transformer models, from smaller architectures to cutting-edge models like DeepSeek V3/R1. We utilize state-of-the-art inference frameworks and high-end GPUs to ensure accurate assessments.
Francisco Caravaca
ACM SIGMETRICS Performance Evaluation Review
Universidad Carlos III de Madrid
Francisco Caravaca (Fri,) studied this question.
www.synapsesocial.com/papers/6966e74713bf7a6f02c000aa — DOI: https://doi.org/10.1145/3788882.3788890