What question did this study set out to answer?

This review surveys current large language models, covering their architectures, fine-tuning methods, and applications.

April 22, 2026 | Open Access

Large Language Models: Architectures, Fine-Tuning, and Retrieval-Augmented Generation — A Comprehensive Review

Key Points

Surveys transformer-based architectures, including BERT, GPT, and T5.
Discusses fine-tuning techniques and retrieval-augmented generation strategies.
Outlines open challenges and future research directions.
Highlights advancements in model architectures and pre-training techniques.
Identifies key challenges like hallucination and computational cost.
Examines implications of alignment techniques and parameter-efficient fine-tuning.

Abstract

Large Language Models (LLMs) have emerged as the dominant paradigm for natural language understanding and generation, progressing from encoder-only and encoder–decoder transformers to frontier decoder-only models with tens to hundreds of billions of parameters. This review presents a comprehensive survey of modern LLMs, organised around five interrelated themes: (i) transformer-based architectures and major model families including BERT, GPT, T5, LLaMA, Mistral, Claude, and Gemini; (ii) pre-training paradigms, scaling laws, and data curation practices; (iii) fine-tuning strategies with particular emphasis on parameter-efficient methods such as LoRA, QLoRA, adapters, and prefix tuning; (iv) retrieval-augmented generation (RAG) pipelines that ground LLM outputs in external knowledge; and (v) alignment techniques including supervised fine-tuning, reinforcement learning from human feedback (RLHF), and direct preference optimisation (DPO). We additionally cover inference-time efficiency, evaluation benchmarks, and real-world applications. The review concludes with key challenges such as hallucination, safety, reasoning limits, and computational cost, and highlights future research directions including mixture-of-experts, long-context modelling, and multimodal extensions.
