What type of study is this?

This is a Literature Review study.

September 17, 2025 | Open Access

Large Language Models in Document Intelligence: A Comprehensive Survey, Recent Advances, Challenges and Future Trends

Key Points

Large language models have transformed document intelligence, improving accuracy in document processing.
The survey analyzes approximately 300 papers published between 2021 and mid-2025.
Key topics include retrieval-augmented generation (RAG), long context processing, and fine-tuning LLMs for document comprehension.
Essential datasets, practical applications, current challenges, and future research directions are highlighted for both researchers and industry practitioners.

Abstract

The rapid proliferation of documents has made document intelligence increasingly critical across various industries. In recent years, large language models (LLMs) have dramatically transformed the field of document intelligence, allowing for more advanced and accurate document processing solutions. Despite these advancements, most existing surveys have failed to focus on these breakthroughs, instead concentrating on traditional methods and earlier machine learning techniques. This survey seeks to fill that gap by offering an in-depth analysis of approximately 300 papers published between 2021 and mid-2025, thus providing a comprehensive overview of the impact of LLMs in document intelligence. The key topics explored include retrieval-augmented generation (RAG), long context processing, and fine-tuning LLMs for document comprehension. Furthermore, the survey highlights essential datasets, practical applications, current challenges, and future research directions, offering critical insights for both researchers and industry practitioners looking to advance the field.

Authors

Wenjun Ke

Southeast University

Yifan Zheng

Edinburgh College

Youlan Li

Hanyang University

Journals

ACM Transactions on Office Information Systems

Institutions

Southeast University

University of Macau

Metacomp Technologies (United States)
