What question did this study set out to answer?

This survey aims to explore decoding methods for improving alignment of outputs from large language and vision-language models with user intent.

June 19, 2026

Beyond Tokens: A Survey on Decoding Methods for Large Language and Vision-Language Models

Key Points

Conducted a systematic review of recent works on decoding methods.
Identified three emerging paradigms in decoding approaches.
Discussed ongoing challenges and future research directions.
Highlighted the efficiency and effectiveness of decoding methods in model generation.
Emphasized that effective decoding techniques improve the alignment of model outputs with user intent.
Provided resources for further exploration of decoding methods.

Abstract

Large language models (LLMs) and large vision-language models (LVLMs) have demonstrated impressive generative capabilities, yet ensuring their outputs align with user intent is still challenging. While most existing approaches address this issue at the training stage, inference-time approaches like decoding methods offer a more efficient and scalable solution. Decoding methods control model generation by guiding token-level selection, performing sequence-level generation, or generating tokens in parallel to accelerate the process. In this survey, we identify three emerging paradigms from recent works on decoding methods for LLMs and LVLMs, provide a systematic review of these methods, highlight ongoing challenges, and discuss potential future research directions. Our goal is to underscore the efficiency and effectiveness of decoding methods and offer a practical view of their applications. Paper lists and more resources on decoding methods for LLMs and LVLMs can be found at https://github.com/wang2226/Awesome-LLM-Decoding.
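
To make the idea of token-level selection concrete, the following is a minimal sketch of two widely known decoding rules, greedy selection and top-k sampling, applied to a single decoding step. It is not taken from the survey: the vocabulary, the scores, and the function names (`softmax`, `greedy_pick`, `top_k_pick`) are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Token-level selection: always take the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k_pick(logits, k, rng):
    """Token-level selection: sample among the k highest-scoring tokens."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])  # renormalize over the kept tokens
    return rng.choices(top, weights=probs, k=1)[0]


# Hypothetical toy vocabulary and next-token scores (assumptions, not real model output).
vocab = ["cat", "dog", "car", "sky"]
toy_logits = [2.0, 1.5, 0.1, -1.0]

rng = random.Random(0)  # fixed seed so the sampling step is repeatable
greedy_token = vocab[greedy_pick(toy_logits)]
sampled_token = vocab[top_k_pick(toy_logits, k=2, rng=rng)]
```

In this sketch, greedy selection always returns the single best-scoring token, while top-k sampling can return any of the k best candidates. In a real system the scores would come from a model's forward pass at each step, and the chosen token would be appended to the context before the next step.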
