This study develops and evaluates a system that automatically extracts the TNM classification of lung cancer (T: primary tumor, N: lymph node metastasis, M: distant metastasis) from radiological diagnosis reports. Initial experiments ran inference with `gemini-2.0-flash-thinking-exp-1219`. The approach incorporates explicit TNM classification criteria and unit specifications, both absent from conventional methods, and uses meta-prompting for error analysis and prompt improvement; after prompt modification, overall accuracy improved by approximately 15%. In the final evaluation with the `o1 2024-12-01-preview` model, we achieved approximately 70% joint accuracy (fine), 76% T accuracy, 93% N accuracy, and 95% M accuracy. This paper details the experimental procedure and the improvements made at each stage.
Okura et al. studied this question.
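To make the reported metrics concrete, the following minimal sketch shows one way per-component and joint accuracy could be computed. It assumes that "joint accuracy" counts a report as correct only when T, N, and M all match the reference label exactly; the abstract does not define the metric, so this is an assumption. The `TNM` type, the `tnm_accuracies` function, and the sample labels are hypothetical illustrations, not the authors' code or data.

```python
from typing import NamedTuple, Sequence


class TNM(NamedTuple):
    """One TNM label triple (hypothetical container, not from the paper)."""
    t: str
    n: str
    m: str


def tnm_accuracies(gold: Sequence[TNM], pred: Sequence[TNM]) -> dict[str, float]:
    """Return per-component and joint exact-match accuracy.

    Assumption: a prediction is jointly correct only if T, N, and M
    all equal the reference values.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and the same length")
    total = len(gold)
    pairs = list(zip(gold, pred))
    return {
        "T": sum(g.t == p.t for g, p in pairs) / total,
        "N": sum(g.n == p.n for g, p in pairs) / total,
        "M": sum(g.m == p.m for g, p in pairs) / total,
        "joint": sum(g == p for g, p in pairs) / total,
    }


# Made-up labels for illustration only.
gold = [TNM("T1b", "N0", "M0"), TNM("T2a", "N2", "M1b"),
        TNM("T3", "N1", "M0"), TNM("T4", "N3", "M1c")]
pred = [TNM("T1b", "N0", "M0"), TNM("T2b", "N2", "M1b"),
        TNM("T3", "N0", "M0"), TNM("T4", "N3", "M1c")]

scores = tnm_accuracies(gold, pred)
print(scores)
```

Under this definition, joint accuracy can never exceed the lowest component accuracy, which is consistent with the reported joint figure (about 70%) falling below the T accuracy (76%).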