Objective: This study explored how hybrid paper-to-digital approaches could reduce the time and effort needed for data capture and reporting in settings with low electronic health record (EHR) adoption, while increasing the volume and diversity of routine patient-level clinical data within health information systems.
Methods and analysis: We used human-centred design to co-design two components with nurses, doctors and records officers from newborn units in eight geographically dispersed Kenyan hospitals: (1) machine-readable versions of routinely used patient transfer, admission and discharge forms (February–October 2024) and (2) a minimum viable paper-to-digital clinical data pipeline (November 2024–July 2025). Participating hospitals piloted the machine-readable paper forms in routine clinical settings between November 2024 and May 2025; all newborn unit admissions were eligible for inclusion. We used artificial intelligence (AI) models within the co-designed pipeline to auto-extract clinical data from the forms, and continuously evaluated and updated the models.
Results: Of the 7118 patients admitted to newborn units across participating hospitals, 6482 (91.1%) had at least one of the co-designed forms in their patient files, and 1615 clinicians of different cadres used the forms. The clinical data pipeline extracted checkbox fields with aggregate scores of 99.25% (95% CI 98.98% to 99.45%) for the area under the receiver operating characteristic curve (AUCROC), 98.91% (95% CI 98.49% to 99.21%) for F1-score, 98.85% (95% CI 98.32% to 99.22%) for positive predictive value and 99.02% (95% CI 98.62% to 99.30%) for sensitivity. For the free-text fields, the aggregate scores for exact accuracy, character error rate and word error rate across all hospitals were 95.66% (95% CI 94.24% to 96.74%), 1.66% (95% CI 1.21% to 2.26%) and 4.34% (95% CI 3.26% to 5.76%), respectively. The hybrid approach reduced the human time required for manual data extraction by 50% and the forms requiring data correction by 60%.
Conclusion: Our novel, co-designed paper-to-digital pipeline with AI-based extraction from routine paper records achieved high accuracy and substantial efficiency gains, offering a scalable solution for generating patient-level data in routine clinical settings in sub-Saharan Africa where EHRs remain out of reach or are difficult to implement.
Tuti et al. studied this question.
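The extraction metrics reported in the Results can be illustrated with a minimal Python sketch. It assumes the conventional definitions: positive predictive value, sensitivity and F1-score from true/false positive counts for checkbox fields, and character and word error rates as Levenshtein edit distance divided by reference length for free-text fields. The abstract does not state how the authors computed or pooled these scores, so the function names, the pooling across fields and the sample values below are all hypothetical.

```python
from typing import Callable, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: Sequence[str], hyps: Sequence[str],
               tokenize: Callable[[str], list]) -> float:
    """Total edit distance over total reference length (assumed pooling)."""
    edits = sum(edit_distance(tokenize(r), tokenize(h))
                for r, h in zip(refs, hyps))
    length = sum(len(tokenize(r)) for r in refs)
    return edits / length


def free_text_metrics(refs: Sequence[str], hyps: Sequence[str]) -> dict:
    """Exact accuracy, character error rate and word error rate."""
    exact = sum(1 for r, h in zip(refs, hyps) if r == h) / len(refs)
    return {
        "exact_accuracy": exact,
        "cer": error_rate(refs, hyps, list),       # character tokens
        "wer": error_rate(refs, hyps, str.split),  # whitespace-split words
    }


def checkbox_metrics(truth: Sequence[bool], pred: Sequence[bool]) -> dict:
    """Positive predictive value, sensitivity and F1 for ticked/unticked boxes."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)
    ppv = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {"ppv": ppv, "sensitivity": sensitivity, "f1": f1}


# Hypothetical sample data, not taken from the study.
refs = ["preterm birth", "neonatal sepsis", "jaundice"]
hyps = ["preterm birth", "neonatal sepsls", "jaundice"]
free_text = free_text_metrics(refs, hyps)

truth = [True, True, False, False, True]
pred = [True, False, False, True, True]
checkbox = checkbox_metrics(truth, pred)

print(free_text)
print(checkbox)
```

The sketch omits guards for empty inputs and zero denominators, and it does not reproduce the confidence intervals or the AUCROC, which would need the models' predicted probabilities and an interval method that the abstract does not name.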