What question did this study set out to answer?

This work aims to map current training methods for large language models and identify challenges in the field.

February 21, 2026 · Open Access

Training Methods for Large Language Models: Current Approaches and Challenges

Key Points

Conducted a systematic mapping study of contemporary training methodologies for LLMs.
Emphasized transformer-based architectures, optimization objectives, and data curation strategies.
Analyzed parameter-efficient fine-tuning, retrieval-augmented generation frameworks, and multimodal training techniques.
Described key training methodologies and organized them into a comparative taxonomy.
Identified challenges like scalability, bias amplification, and hallucination.
Highlighted emerging directions, including reasoning-centric training strategies.

Abstract

Large Language Models (LLMs) have emerged as a dominant paradigm in natural language processing, demonstrating strong performance across a wide range of generation and reasoning tasks. These systems depend on multi-stage training pipelines that integrate large-scale self-supervised pre-training, supervised fine-tuning, and alignment techniques. This paper presents a systematic mapping study of contemporary LLM training methodologies, emphasizing transformer-based architectures, optimization objectives, and data curation strategies, as well as emerging sparse architectures such as Mixture-of-Experts (MoE) models. We analyze parameter-efficient fine-tuning approaches, retrieval-augmented generation frameworks, and multimodal training techniques, which we organize into a unified comparative taxonomy. We discuss key technical challenges, including scalability constraints, hallucination, bias amplification, and alignment–capability tradeoffs, and then identify emerging research directions such as reasoning-centric training. This work provides a concise technical reference for researchers and practitioners working on scalable and reliable language model training.
