March 3, 2026 | Open Access

Evaluation of electronic health record-integrated artificial intelligence chart review

Key Points

Ten physicians provided free-text feedback on 147 AI-generated chart summaries; impressions were positive overall despite some concerns.
Positive feedback was the most common category (n = 71), but users also reported omissions (n = 46), token limitations (n = 27), confusing content (n = 20), hallucinations (n = 5), and bias (n = 1).
Cohen's Kappa for the feedback analysis was 0.64, indicating substantial agreement among reviewers.
AI-assisted chart review tools are not infallible, but physicians found this one acceptable for use in clinical workflows.

Abstract

This study evaluates the quality of artificial intelligence (AI) clinical note summarization by analyzing physician qualitative feedback on a large language model (LLM) chart review tool integrated into the electronic health record (EHR). Physicians provided free-text feedback on AI-generated chart summaries, which physician informaticists analyzed using MAXQDA. Feedback from 10 physicians was collected on 147 AI-generated summaries. Positive feedback was common (n = 71), but users identified omissions (n = 46), confusing content (n = 20), token limitations (n = 27), hallucinations (n = 5), and bias (n = 1). Cohen’s Kappa was 0.64, indicating substantial reviewer agreement. Physician feedback on the tool revealed overall positive impressions, though omissions raised concerns about summary completeness. AI-assisted chart review technology is not infallible, but physicians found this tool acceptable for use in clinical workflows.
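
For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's Kappa can be computed for two reviewers who each assign one category to every item. It is not the study's analysis pipeline (the authors used MAXQDA); the function name `cohens_kappa`, the category labels, and the ten-item toy data are all hypothetical and chosen only for illustration. Kappa compares observed agreement with the agreement expected by chance from each reviewer's category frequencies; values from 0.61 to 0.80 are conventionally labeled "substantial" on the Landis and Koch scale.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one label per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over all categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical toy data: two reviewers coding ten feedback comments.
coder_a = ["positive"] * 5 + ["omission"] * 3 + ["confusing"] * 2
coder_b = ["positive"] * 4 + ["omission"] * 3 + ["confusing"] * 3

kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")
```

In this toy example the reviewers agree on 8 of 10 items (observed agreement 0.80) and chance agreement is 0.35, giving a kappa of roughly 0.69. The study's actual coding scheme and item counts differ.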
