What question did this study set out to answer?

This research aims to unify and systematically review prompt and context engineering techniques in large language models (LLMs).

June 18, 2026 · Open Access

A comprehensive survey of prompt engineering and context engineering techniques in large language models

Key Points

Conducted a comprehensive systematic literature review on prompt engineering and context engineering techniques.
Developed a structured taxonomy to categorize foundational and advanced strategies for prompt management.
Evaluated techniques based on their effectiveness across different LLM architectures.
Identified foundational approaches like zero-shot and few-shot learning alongside advanced strategies such as Chain-of-Thought and Retrieval-Augmented Generation.
Emphasized the effectiveness of these techniques in reducing hallucinations and improving output consistency.
Outlined persistent limitations such as context dilution and proposed emerging research trajectories for automated prompt optimization.

Abstract

The rapid advancement of Large Language Models (LLMs) has enabled artificial intelligence to solve complex problems. However, their performance remains highly sensitive to input formulation and context management. Despite the critical role of prompt engineering, it is frequently treated as an empirical, ad hoc practice lacking a standardized theoretical framework, leading to challenges such as output inconsistency, prompt sensitivity, and hallucinations. This study presents a comprehensive systematic literature review of prompt and context engineering techniques, unifying fragmented paradigms into a cohesive analytical framework. We introduce a structured taxonomy spanning the full prompt lifecycle, categorizing foundational approaches—such as zero-shot and few-shot learning—alongside advanced strategies including Chain-of-Thought (CoT), Retrieval-Augmented Generation (RAG), and multi-step reasoning architectures (e.g., Tree-of-Thoughts, Graph-of-Thoughts). Furthermore, we critically evaluate these techniques across different LLM architectures, emphasizing their effectiveness in mitigating hallucinations, ensuring factual consistency, and aligning outputs with human intent. By analyzing the shift from input-level linguistic control to system-level contextual orchestration, we identify persistent limitations, such as context dilution and computational overhead. Finally, we outline emerging research trajectories in automated prompt optimization, reliability-aware prompting, and agentic systems. This survey serves as a foundational reference and decision-making framework for researchers and practitioners aiming to build robust, interpretable, and domain-adaptive AI systems.
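As a rough illustration of the input-level techniques the abstract names, the sketch below builds zero-shot, few-shot, Chain-of-Thought, and retrieval-augmented prompts as plain strings. The function names and the `Q:`/`A:` templates are hypothetical conventions chosen for this example, not the survey's taxonomy or any library's API; the CoT variant uses the widely cited "Let's think step by step" trigger phrase, and the retrieval step itself is assumed to have already produced the passages.

```python
from typing import List, Tuple


def zero_shot(question: str) -> str:
    """Zero-shot: the question alone, with no worked examples."""
    return f"Q: {question}\nA:"


def few_shot(question: str, examples: List[Tuple[str, str]]) -> str:
    """Few-shot: prepend solved (question, answer) pairs as demonstrations."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\nQ: {question}\nA:"


def chain_of_thought(question: str) -> str:
    """Zero-shot CoT: append a trigger phrase that invites stepwise reasoning."""
    return f"Q: {question}\nA: Let's think step by step."


def retrieval_augmented(question: str, passages: List[str]) -> str:
    """RAG-style prompt: number the retrieved passages and place them before the question.

    The passages are assumed to come from an external retriever (not shown).
    """
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Q: {question}\nA:"
    )


if __name__ == "__main__":
    demos = [("What is 1 + 1?", "2")]
    print(few_shot("What is 3 + 3?", demos))
    print()
    print(retrieval_augmented("Who wrote the report?", ["The report was written by the audit team."]))
```

Each function only formats text; sending the prompt to a model and parsing the reply are left out, since those steps depend on the specific LLM and client being used.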
