Background and Objectives: Medical diagnosis documents often exhibit diverse layouts and formats, posing significant challenges for automated information extraction. Ensuring the privacy of sensitive medical data further complicates the development of effective analysis systems. This study aims to develop a robust and privacy-compliant system for analyzing medical diagnosis documents.

Methods: We designed an integrated Optical Character Recognition (OCR) system that processes medical documents regardless of their layout or format. The system first converts bitmap images into machine-readable text using OCR. A document-understanding model is then applied to identify and extract key information. To improve adaptability and accuracy, we employed a mutual learning approach. To address privacy concerns, we generated training data using generative techniques, ensuring compliance with privacy regulations while maintaining dataset quality.

Results: The proposed system demonstrated strong performance across a wide variety of document layouts, effectively extracting critical information while adhering to privacy requirements.

Conclusions: Our approach offers a practical and efficient solution for processing complex medical diagnosis documents, advancing the field of medical informatics while safeguarding patient privacy.
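The idea of training on generated rather than real patient documents can be illustrated with a small sketch. The abstract does not say which generative techniques were used, so the example below swaps in simple template-based generation purely for illustration: the field names, value pools, and the two layouts are all hypothetical, and the function names are not from the paper. What it shows is that a synthetic document arrives with its ground-truth annotations already known, across varying layouts, without touching any real record.

```python
import random

# Hypothetical fields and value pools; none of these come from the paper.
FIELDS = {
    "patient_name": ["Patient A", "Patient B", "Patient C"],
    "diagnosis": ["Acute bronchitis", "Type 2 diabetes", "Hypertension"],
    "exam_date": ["2024-01-15", "2024-03-02", "2024-07-21"],
}

def render_inline(label, value):
    """Layout 1 (assumed): label and value share one line."""
    return [f"{label}: {value}"]

def render_stacked(label, value):
    """Layout 2 (assumed): label on one line, value on the next."""
    return [label.upper(), value]

LAYOUTS = [render_inline, render_stacked]

def make_synthetic_document(rng):
    """Return (lines, annotations) for one synthetic document.

    annotations maps each field name to the index of the line that
    holds its value, so labels exist without manual annotation.
    """
    layout = rng.choice(LAYOUTS)
    names = list(FIELDS)
    rng.shuffle(names)  # vary field order to imitate layout diversity
    lines, annotations = [], {}
    for name in names:
        value = rng.choice(FIELDS[name])
        rendered = layout(name, value)
        # The value is always on the last line a layout emits.
        annotations[name] = len(lines) + len(rendered) - 1
        lines.extend(rendered)
    return lines, annotations

if __name__ == "__main__":
    doc_lines, doc_labels = make_synthetic_document(random.Random(0))
    print("\n".join(doc_lines))
    print(doc_labels)
```

In a real pipeline the rendered lines would presumably be rasterized into images before OCR; that step is omitted here to keep the sketch dependency-free.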
Hung-Jen Tu
Jia-Lien Hsu
Diagnostics
Fu Jen Catholic University
Tu et al. studied this question.
synapsesocial.com/papers/6930e8d7ea1aef094cca3b94 — DOI: https://doi.org/10.3390/diagnostics15233039