Abstract

Clinical diagnosis in the real world often begins with ambiguous patient complaints that require iterative reasoning and testing. While large language models (LLMs) increasingly assist with specific medical queries, they cannot yet drive the entire diagnostic workflow autonomously, which limits how much they can reduce physician workload. Here we present DxDirector-7B, an agentic LLM designed to navigate the full diagnostic process through advanced slow-thinking capabilities. Unlike existing assistants, our model autonomously determines the diagnostic strategy and requests physician intervention only for necessary clinical operations. In evaluations spanning rare diseases and complex real-world cases, DxDirector-7B achieves higher diagnostic accuracy than state-of-the-art medical and general-purpose LLMs with significantly more parameters. Crucially, it substantially reduces physician involvement while maintaining a robust safety and accountability framework for high-risk conditions. These results demonstrate a paradigm shift in which AI effectively leads clinical reasoning, offering a scalable solution to enhance diagnostic efficiency and accessibility.
Xu et al. studied this question.