The rapid advancement of Large Language Models (LLMs) has enabled artificial intelligence to solve complex problems, yet their performance remains highly sensitive to input formulation and context management. Despite its critical role, prompt engineering is frequently treated as an empirical, ad hoc practice that lacks a standardized theoretical framework, which leads to challenges such as output inconsistency, prompt sensitivity, and hallucinations. This study presents a systematic literature review of prompt and context engineering techniques, unifying fragmented paradigms into a cohesive analytical framework. We introduce a structured taxonomy spanning the full prompt lifecycle, categorizing foundational approaches, such as zero-shot and few-shot learning, alongside advanced strategies including Chain-of-Thought (CoT), Retrieval-Augmented Generation (RAG), and multi-step reasoning architectures (e.g., Tree-of-Thoughts, Graph-of-Thoughts). Furthermore, we critically evaluate these techniques across different LLM architectures, emphasizing their effectiveness in mitigating hallucinations, ensuring factual consistency, and aligning outputs with human intent. By analyzing the shift from input-level linguistic control to system-level contextual orchestration, we identify persistent limitations, such as context dilution and computational overhead. Finally, we outline emerging research trajectories in automated prompt optimization, reliability-aware prompting, and agentic systems. This survey is intended as a foundational reference and decision-making framework for researchers and practitioners aiming to build robust, interpretable, and domain-adaptive AI systems.
Debnath et al. (Mon,) studied this question.