Background/Objectives: Free-text surgical pathology reports hinder automated cancer registry entry and secondary analytics. This study introduces a clinically governed schema layer for interoperability and tests whether a locally deployable Large Language Model (LLM) pipeline can deliver robust, registry-grade extraction across institutions. Methods: We developed a College of American Pathologists (CAP)-aligned clinical ontology encompassing 10 cancer types, 192 per-organ scalar fields, key biomarkers, and nested structures for lymph nodes and margins. The ontology was encoded as Declarative Self-improving Python (DSPy) signatures (DSPy v3.2.1) with grammar-constrained decoding, yielding a model-agnostic pipeline that was benchmarked on 893 internal reports against a pathologist-adjudicated gold standard. External validation used 242 reports from The Cancer Genome Atlas (TCGA). Hardware feasibility was confirmed on a single 48-gigabyte (GB) Graphics Processing Unit (GPU), supporting privacy-preserving on-premises deployment. Results: Using the gpt-oss-20b model, the framework achieved 92.0% macro-mean exact-match accuracy on internal data, with near-perfect run-to-run reliability. Critical prognostic indicators, including breast estrogen receptor/progesterone receptor (ER/PR) status (98.7%) and margin positivity (>93%), were extracted with high fidelity. On the external TCGA cohort, accuracy was 77.5%, rising to 88.0% after excluding structurally silent fields absent in older narratives. Operationally, the model processed reports in 40–70 s, balancing speed and accuracy. Conclusions: This schema-first abstraction layer decouples clinical logic from any specific Artificial Intelligence (AI) model. By reliably transforming narrative reports into machine-readable structures, it establishes a portable, privacy-preserving foundation for automated cancer surveillance, institutional data reuse, and future multimodal clinical systems.
Chow et al. (Wed,) studied this question.
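
The headline metric in the abstract, macro-mean exact-match accuracy, and the effect of excluding structurally silent fields can be illustrated with a minimal sketch. This is not the study's scoring code. It assumes that the macro mean is taken over fields, that a field counts as "silent" for a report when its gold value is `None`, and that string values are compared after trimming and lowercasing. The function names and the field names (`er_status`, `margin_involved`, `lvi`) are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

# One report = mapping from field name to extracted value (None = not stated).
Record = Dict[str, Optional[Any]]


def normalize(value: Optional[Any]) -> Optional[Any]:
    """Assumed normalization: trim and lowercase strings, leave other types alone."""
    if isinstance(value, str):
        return value.strip().lower()
    return value


def macro_exact_match(
    gold: List[Record], pred: List[Record], skip_silent: bool = False
) -> float:
    """Average per-field exact-match accuracy over all scorable fields.

    With skip_silent=True, (report, field) pairs whose gold value is None
    are left out of the count (assumed meaning of "structurally silent").
    """
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for gold_report, pred_report in zip(gold, pred):
        for field, gold_value in gold_report.items():
            if skip_silent and gold_value is None:
                continue
            totals[field] += 1
            if normalize(pred_report.get(field)) == normalize(gold_value):
                hits[field] += 1
    if not totals:
        raise ValueError("no scorable fields")
    per_field = [hits[name] / totals[name] for name in totals]
    return sum(per_field) / len(per_field)


# Toy data with hypothetical field names, for illustration only.
gold = [
    {"er_status": "Positive", "margin_involved": False, "lvi": None},
    {"er_status": "Negative", "margin_involved": True, "lvi": None},
]
pred = [
    {"er_status": "positive", "margin_involved": False, "lvi": "present"},
    {"er_status": "Negative", "margin_involved": False, "lvi": None},
]

print(macro_exact_match(gold, pred))
print(macro_exact_match(gold, pred, skip_silent=True))
```

In the toy data the `lvi` field is never stated in the gold standard, so dropping it changes the score. This mirrors, under the stated assumptions, why the external accuracy in the abstract rises once silent fields are excluded.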