Clinical data governance is the cornerstone of reliable intelligent healthcare systems. However, real-world clinical records frequently suffer from complex data quality issues whose resolution demands both high semantic fidelity and processing efficiency. Existing section identification and fragmented standardization methods either fail to address these intricate anomalies or inadvertently sacrifice semantic integrity. Meanwhile, directly deploying Large Language Models (LLMs) as free-form text generators for this task introduces hallucinations and computational bottlenecks. To bridge these gaps, we propose GovernAgent, an LLM-driven framework built on two core designs. First, inspired by the intrinsic structure of clinical records, it introduces a hierarchical governance mechanism. By employing cascading Note-Level and Section-Level agents, it constrains the governance space in a top-down manner, systematically disentangling these anomalies into resolvable, multi-level quality issues. Second, the framework employs a Constrained Action Planning mechanism. By restricting the LLM to a hybrid "Copy-Generate" action space rather than free-text generation, it maximizes reuse of the original text, thereby mitigating hallucinations, guaranteeing medical provenance, and ensuring high efficiency. Evaluations on real-world hospital datasets show that GovernAgent improves governance accuracy and efficiency, minimizes hallucinations, adapts well to practical settings, and empowers downstream clinical applications. Code: https://github.com/kaiyinzhou/GovernAgent.
Zhou et al. studied this question.
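To make the idea of a hybrid "Copy-Generate" action space concrete, the following is a minimal sketch of how such a plan might be represented and executed. It is not taken from the GovernAgent code: the names `Copy`, `Generate`, `apply_plan`, and `copy_ratio`, the character-offset span format, and the sample note are all hypothetical, chosen only to illustrate how restricting output to copied spans plus short generated fragments keeps most of the result traceable to the source text.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical action types; the real GovernAgent action format is not
# specified in the abstract.
@dataclass(frozen=True)
class Copy:
    """Reuse source characters [start, end) verbatim."""
    start: int
    end: int


@dataclass(frozen=True)
class Generate:
    """Emit new text that does not come from the source."""
    text: str


Action = Union[Copy, Generate]


def apply_plan(source: str, plan: List[Action]) -> str:
    """Build the governed text by executing each action in order."""
    parts = []
    for action in plan:
        if isinstance(action, Copy):
            # Reject spans that point outside the source record.
            if not (0 <= action.start <= action.end <= len(source)):
                raise ValueError(f"copy span out of range: {action}")
            parts.append(source[action.start:action.end])
        else:
            parts.append(action.text)
    return "".join(parts)


def copy_ratio(source: str, plan: List[Action]) -> float:
    """Fraction of output characters copied verbatim from the source."""
    copied = sum(a.end - a.start for a in plan if isinstance(a, Copy))
    total = len(apply_plan(source, plan))
    return copied / total if total else 1.0


# Illustrative usage on a made-up note fragment.
source = "cc: chest pain x3d. hx: HTN."
plan = [
    Generate("Chief complaint: "),
    Copy(4, 19),
    Generate("\nHistory: "),
    Copy(24, 28),
]
governed = apply_plan(source, plan)
```

In this sketch every clinical statement in `governed` comes from a `Copy` span, so it can be traced back to an offset in the original record, while `Generate` actions are confined to structural text such as section labels. A metric like `copy_ratio` is one assumed way to quantify how much of the output is reused rather than generated.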