May 24, 2026 · Open Access

Exploring the role of Large Language Models in High-Performance Computing programming: A survey

Strahinja Ljaljevic (Universitat Oberta de Catalunya), Josep Jorba (Universitat Oberta de Catalunya), Sergio Iserte (Barcelona Supercomputing Center)

Key Points

This research aims to assess the role of large language models in high-performance computing programming and their effectiveness across various applications.
Systematic review of LLM applications in HPC, organized into five categories: code generation, parallelization and optimization, frameworks and architectures, evaluation and benchmarking, and broader challenges.
Comparison of domain-specialized models with general-purpose LLMs, based on their performance across programming paradigms such as OpenMP and MPI.
General-purpose LLMs perform reasonably well on serial and OpenMP-like tasks but fall short in distributed paradigms such as MPI, where correctness and scalability are critical.
Domain-specialized models achieve higher accuracy, but their scope remains narrow and their evaluations are largely limited to benchmarks and micro-kernels.
LLMs can lower barriers to entry, accelerate prototyping, and support code modernization, yet they are not ready to replace HPC experts because correctness, performance portability, and scalability remain concerns.

Abstract

Large Language Models (LLMs) are emerging as promising assistants in High-Performance Computing (HPC), where programming remains complex and expertise-intensive. This survey systematically reviews their application across five categories: code generation, parallelization and optimization, frameworks and architectures, evaluation and benchmarking, and broader challenges. The analysis highlights both opportunities and limitations: while general-purpose LLMs perform reasonably well on serial and OpenMP-like tasks, they fall short in distributed paradigms such as MPI, where correctness and scalability are critical. Domain-specialized models (e.g., HPC-Coder, HPC-GPT, chatHPC) achieve higher accuracy through fine-tuning, curated datasets, and retrieval-augmented generation (RAG), yet their scope remains narrow and their evaluations largely limited to benchmarks or micro-kernels. The broader picture is one of dual potential and fragility: LLMs can lower barriers to entry, accelerate prototyping, and support code modernization, but they remain brittle under production-level requirements where correctness, performance portability, and scaling cannot be compromised. We conclude that LLMs are unlikely to replace HPC experts in the near term but are positioned to become powerful collaborators in the software development pipeline. Their effective deployment will require richer datasets, integration with performance analysis and schedulers, rigorous evaluation frameworks, and governance structures that ensure transparency and trust. The convergence of AI and HPC should therefore be understood as a long-term, co-evolutionary process—where each advance uncovers new challenges and opportunities for reshaping scientific software development.
