March 3, 2026
Open Access

GreenAI: A Comparative Analysis of Environmental Efficiency in LLM-Generated Code

Key Points

This comparative analysis finds that the environmental efficiency of generated code varies across large language models.
The GreenAI Efficiency Score ranks models by execution time, peak memory usage, energy consumption, and carbon footprint.
A multi-metric framework is used, with the TOPSIS method combining the metrics for model comparison.
The findings underscore the importance of selecting environmentally efficient models for software development.

Abstract

The increasing use of large language models for code generation raises concerns about their computational costs and ecological impact. This study evaluates the environmental efficiency of several cutting-edge large language models, including ChatGPT, Claude, Copilot, DeepSeek, Gemini, Mistral, and Qwen, on algorithm and data structure tasks in Python, C++, and Java. The tasks are selected from HackerRank to ensure practical relevance. A multi-metric, sustainability-focused evaluation framework is proposed that measures execution time, peak memory usage, energy consumption, and carbon footprint. The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is applied to combine the algorithm and data structure metrics into a score for each programming language. These per-language scores are then normalized across models and averaged across languages to compute the GreenAI Efficiency Score. This unified score enables a fair, comprehensive ranking of models and promotes environmentally responsible AI selection in software development.
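
The scoring pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the abstract does not give the criterion weights, the normalization used inside TOPSIS, or the method used to normalize across models. Equal weights, vector normalization, and min-max scaling are therefore assumptions here. The function names, the model names `model_a` to `model_c`, and all numbers are hypothetical placeholders rather than measured results. Each row holds cost-type metrics (lower is better); metrics for algorithm and data structure tasks could simply be supplied as additional columns.

```python
import math


def topsis(matrix, weights=None):
    """Return TOPSIS closeness scores in [0, 1]; higher is better.

    matrix: one row per model, one column per cost criterion
    (lower is better), e.g. time, peak memory, energy, carbon.
    """
    n_cols = len(matrix[0])
    if weights is None:
        weights = [1.0 / n_cols] * n_cols  # assumption: equal weights

    # Vector-normalize each column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_cols)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
         for row in matrix]

    # For cost criteria the ideal point is the column minimum
    # and the anti-ideal point is the column maximum.
    ideal = [min(row[j] for row in v) for j in range(n_cols)]
    anti = [max(row[j] for row in v) for j in range(n_cols)]

    scores = []
    for row in v:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.5)
    return scores


def min_max(values):
    """Rescale values to [0, 1] (assumed cross-model normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(x - lo) / (hi - lo) for x in values]


def green_scores(measurements):
    """measurements: {language: {model: [cost metrics...]}}.

    Returns {model: score}: per-language TOPSIS scores, normalized
    across models, then averaged across languages.
    """
    models = sorted(next(iter(measurements.values())))
    per_language = []
    for by_model in measurements.values():
        raw = topsis([by_model[m] for m in models])
        per_language.append(min_max(raw))
    return {m: sum(col[i] for col in per_language) / len(per_language)
            for i, m in enumerate(models)}


# Hypothetical placeholder data: [time_s, peak_mem_mb, energy_j, carbon_g]
measurements = {
    "python": {
        "model_a": [1.0, 50.0, 10.0, 0.5],
        "model_b": [1.5, 60.0, 14.0, 0.7],
        "model_c": [2.0, 80.0, 20.0, 1.0],
    },
    "java": {
        "model_a": [0.5, 90.0, 6.0, 0.3],
        "model_b": [0.8, 95.0, 9.0, 0.4],
        "model_c": [1.2, 120.0, 15.0, 0.8],
    },
}

scores = green_scores(measurements)
for model, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {score:.3f}")
```

In this placeholder data `model_a` has the lowest cost on every metric and `model_c` the highest, so they receive the best and worst scores respectively. With real measurements the ranking would depend on the trade-offs between criteria and on the weights chosen.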
