Abstract
Objective To determine whether large language models (LLMs) can automatically extract organ-level disease involvement to populate the Surgical Findings section of the European Society of Gynaecological Oncology (ESGO) Operative Report for advanced ovarian cancer.
Methods We retrospectively collected 300 operative notes from cytoreductive surgeries performed at a tertiary ESGO-accredited center. Each note was interrogated to identify disease involvement across 35 predefined ESGO anatomical sites. Operative notes were converted into sets of binary (yes/no) questions, one for each anatomical site, and for each site the LLMs were tasked with classifying whether disease was present. Four modern models were selected based on their state-of-the-art performance and suitability for clinical text interpretation. Models were tested both in their basic form and after targeted enhancement strategies to reduce common errors: adding a clinical terminology list, providing clearer task instructions, and showing a small number of examples. Model accuracy was compared with expert annotations using F1 scores.
Results The models showed good baseline accuracy, with the two top-performing systems achieving F1 scores of 0.851 (95% CI: 0.841–0.861) and 0.864 (95% CI: 0.854–0.873). Following the optimization strategies, F1 scores increased further, reaching 0.897 (95% CI: 0.888–0.906) and 0.875 (95% CI: 0.866–0.884). Performance was highest for clinically key sites, including omentum, right diaphragm (95%), and ovaries (92%). Lower accuracy was observed for complex anatomical sites such as bowel (small 73%, large 61%) and peritoneal sites (pouch of Douglas 82%, abdominal wall 68%). Frequent errors involved laterality, overlapping anatomical regions, and ambiguous abbreviations. The optimization strategies improved the distinction between closely related sites (rectosigmoid vs large bowel/mesentery) and reduced left/right errors.
Conclusion With enhancement strategies, LLMs demonstrated near-human performance in extracting ESGO-compliant operative information. Integrating model-assisted extraction into surgical workflows may reduce reporting time, improve completeness, and help standardize operative documentation.
Laios et al. (Sun,) studied this question.
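The pipeline summarized in the Methods (one yes/no question per anatomical site, optional enhancements, and F1 scoring against expert labels) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and not the study's actual implementation: the `SITES` subset, the prompt wording, the `build_prompt`, `f1_score`, and `bootstrap_ci` helpers, the pooling of all note-site pairs into one F1 score, and the percentile bootstrap used for the 95% CI (the abstract does not state how the intervals were derived).

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical subset of the 35 ESGO anatomical sites, for illustration only.
SITES = ["omentum", "right diaphragm", "left ovary", "small bowel"]


def build_prompt(note: str, site: str,
                 glossary: Optional[Dict[str, str]] = None,
                 examples: Sequence[Tuple[str, str, str]] = ()) -> str:
    """Turn one (note, site) pair into a binary yes/no question.

    `glossary` and `examples` stand in for the terminology list and the
    few-shot examples; their real form in the study is not known.
    """
    parts = ["Answer strictly 'yes' or 'no'."]
    if glossary:
        terms = "; ".join(f"{abbr} = {meaning}" for abbr, meaning in glossary.items())
        parts.append(f"Terminology: {terms}")
    for ex_note, ex_site, ex_answer in examples:
        parts.append(f"Note: {ex_note}\nIs disease present at the {ex_site}? {ex_answer}")
    parts.append(f"Note: {note}\nIs disease present at the {site}?")
    return "\n\n".join(parts)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class ("disease present") over paired labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(y_true: Sequence[int], y_pred: Sequence[int],
                 n_boot: int = 2000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for F1 (an assumed CI method)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores: List[float] = []
    for _ in range(n_boot):
        # Resample note-site pairs with replacement and rescore.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_boot)]
    upper = scores[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

In use, each note would yield one prompt per entry in `SITES`, the model's yes/no answers would be mapped to 1/0, and `f1_score` and `bootstrap_ci` would be called on the pooled expert and model labels. Per-site scores, as reported in the Results, would follow from calling the same functions on the labels for a single site.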