What type of study is this?

This is a literature review.

October 20, 2025 · Open Access

Efficient Reasoning Models: A Survey

Key Points

Shorter reasoning chains can significantly improve the efficiency of reasoning models by reducing computational overhead.
Knowledge distillation and model compression techniques enhance the reasoning capabilities of smaller language models.
Efficient decoding strategies are crucial for accelerating the inference process of reasoning models.
This survey organizes existing work into three directions (shorter, smaller, and faster) to provide a comprehensive overview of efficient reasoning.

Abstract

Reasoning models have demonstrated remarkable progress in solving complex and logic-intensive tasks by generating extended Chain-of-Thoughts (CoTs) prior to arriving at a final answer. Yet this "slow-thinking" paradigm, in which numerous tokens are generated in sequence, inevitably introduces substantial computational overhead, creating an urgent need for effective acceleration. This survey aims to provide a comprehensive overview of recent advances in efficient reasoning. It categorizes existing work into three key directions: (1) shorter: compressing lengthy CoTs into concise yet effective reasoning chains; (2) smaller: developing compact language models with strong reasoning capabilities through knowledge distillation, other model compression techniques, and reinforcement learning; and (3) faster: designing efficient decoding strategies to accelerate inference of reasoning models. A curated collection of the papers discussed in this survey is available in our GitHub repository: https://github.com/fscdc/Awesome-Efficient-Reasoning-Models.
