Research on large language models (LLMs) has grown rapidly, along with increasing concern about their limitations. In this survey, we conduct a data-driven, semi-automated review of research on the limitations of LLMs (LLLMs) from 2022 to early 2025 using a bottom-up approach. From a corpus of 250,000 ACL and arXiv papers, we identify 14,648 relevant papers using keyword filtering, LLM-based classification validated against expert labels, and topic clustering (via two approaches, HDBSCAN+BERTopic and LlooM). We find that the share of LLM-related papers increases over fivefold in ACL and nearly eightfold in arXiv between 2022 and 2025. Since 2022, LLLMs research has grown even faster, reaching over 30% of LLM papers by 2025. Reasoning remains the most studied limitation, followed by generalization, hallucination, bias, and security. The distribution of topics in the ACL dataset stays relatively stable over time, while arXiv shifts toward security risks, alignment, hallucinations, knowledge editing, and multimodality. We offer a quantitative view of trends in LLLMs research and release a dataset of annotated abstracts and a validated methodology, available at: github.com/a-kostikova/LLLMs-Survey.
Kostikova et al. studied this question.