Abstract

Key Information Extraction (KIE) systems based on Deep Learning achieve strong token-level performance but offer no formal guarantees on prediction reliability, limiting their adoption in business-critical document workflows. In this work, we introduce a post hoc Uncertainty Quantification framework for KIE using Split Conformal Prediction (CP). After fine-tuning multimodal transformer models on a challenging receipt dataset, we reserve a held-out calibration set to derive nonconformity scores and construct entity-level prediction sets that satisfy a user-specified error rate. On unseen receipts, CP achieves tight marginal coverage (98.3% for α = 0.02), with 70% of predictions being high-confidence singletons. A detailed analysis shows that highly structured fields such as dates and prices yield small, singleton sets with near-perfect reliability, whereas rare or semantically ambiguous fields such as tips or generic keywords produce larger sets and lower coverage. By exposing positional biases and common label confusions that standard F1-scores and document-accuracy metrics overlook, CP reveals critical risk areas for downstream automation. Finally, we demonstrate how calibrated prediction-set sizes can drive risk-aware workflows by automatically processing high-confidence extractions and flagging uncertain cases for human review, thereby enhancing the efficiency, trustworthiness, and operational feasibility of real-world document-processing systems.
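The split conformal procedure described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the nonconformity score is one minus the softmax probability of the true label, and the finite-sample quantile correction standard for split CP; the function name and array layout are hypothetical.

```python
import numpy as np

def split_conformal(cal_probs, cal_labels, test_probs, alpha=0.02):
    """Split conformal prediction sets from held-out calibration data.

    cal_probs:  (n, K) calibration softmax probabilities
    cal_labels: (n,)   true calibration labels
    test_probs: (m, K) test softmax probabilities
    Returns a boolean (m, K) mask whose rows are prediction sets with
    marginal coverage >= 1 - alpha (under exchangeability).
    """
    n = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true label.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile level ceil((n+1)(1-alpha))/n.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    qhat = np.quantile(scores, q_level, method="higher")
    # Include every label whose nonconformity does not exceed the threshold.
    return (1.0 - test_probs) <= qhat
```

Singleton sets (exactly one `True` per row) correspond to the high-confidence extractions that the abstract proposes to process automatically; larger sets are flagged for human review.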
Alexander Rombach
Nijat Mehdiyev
International Journal on Document Analysis and Recognition (IJDAR)
www.synapsesocial.com/papers/69b25aca96eeacc4fcec8cdf — DOI: https://doi.org/10.1007/s10032-026-00572-y