What question did this study set out to answer?

The research aims to automate the staging of lung cancer by improving the analysis of radiology reports through NLP techniques.

April 1, 2026 | Open Access

NLI24 at the NTCIR-18 RadNLP

Key Points

Applied tailored NLP techniques to the processing of radiology reports
Segmented report text into eight key classes for the subtask
Built an ensemble of three fine-tuned, hyperparameter-optimized BERT-based medical language models for the subtask
Built separate T, N, and M staging pipelines for the main task, combining BERT-based models and LLMs in a multistage framework
Achieved a micro F2 score of 0.9433 on the subtask, ranking first in the competition
Obtained a joint accuracy of 0.5679 on TNM staging, finishing fourth overall
Aims to improve the efficiency and accuracy of the cancer staging process
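
The two headline metrics above can be illustrated with a minimal sketch. It assumes the subtask is scored as multi-label classification (each sentence carries a set of labels) and that joint accuracy counts a report as correct only when T, N, and M all match; the summary does not spell out either definition, and the label names and sample data below are hypothetical.

```python
from typing import Hashable, Sequence, Set, Tuple


def micro_fbeta(gold: Sequence[Set[Hashable]],
                pred: Sequence[Set[Hashable]],
                beta: float = 2.0) -> float:
    """Micro-averaged F-beta: pool TP/FP/FN over all sentences and labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # labels predicted and correct
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # labels predicted but wrong
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # labels missed
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def joint_accuracy(gold: Sequence[Tuple[str, str, str]],
                   pred: Sequence[Tuple[str, str, str]]) -> float:
    """Fraction of reports whose (T, N, M) triple is predicted exactly."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Hypothetical sentence labels: 2 true positives, 1 false positive, 1 false negative.
gold_labels = [{"class_a"}, {"class_b", "class_c"}, set()]
pred_labels = [{"class_a"}, {"class_b"}, {"class_c"}]
print(round(micro_fbeta(gold_labels, pred_labels), 4))  # 0.6667

# Hypothetical staging output: the second report has the wrong N category.
gold_tnm = [("T2", "N0", "M0"), ("T4", "N2", "M1")]
pred_tnm = [("T2", "N0", "M0"), ("T4", "N1", "M1")]
print(joint_accuracy(gold_tnm, pred_tnm))  # 0.5
```

With beta = 2, a missed label costs four times as much as a spurious one, so the score rewards recall. Joint accuracy is strict by construction: one wrong component makes the whole report count as an error, which is consistent with a joint score well below what a per-component score would suggest.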

Abstract

The management of lung cancer relies heavily on precise staging, which is traditionally derived from comprehensive radiology reports based on imaging techniques such as CT and MRI. However, these reports often lack explicit staging details, so healthcare professionals must extract the relevant information manually. To address this issue, we propose an automated solution as part of our submission to the RadNLP (Natural Language Processing for Radiology) shared task at the NTCIR-18 international conference. Our approach uses tailored Natural Language Processing (NLP) techniques to improve the processing of radiology reports. In this paper, we describe our methodology for the RadNLP subtask, which involves document segmentation to identify eight key classes within radiology reports, and for the main task, which focuses on the automated TNM staging of lung cancer. For the subtask, we employed an ensemble of three fine-tuned, hyperparameter-optimized BERT-based medical language models, which yielded an overall micro F2 score of 0.9433, securing the top rank in the competition. For the main task, we developed individual pipelines for T, N, and M staging, consisting of BERT-based models and LLMs in a multistage processing framework, resulting in a joint accuracy of 0.5679 and an overall 4th place finish in the competition. Our solution streamlines the extraction of critical information and aims to improve the accuracy and efficiency of cancer staging, ultimately supporting clinical decision-making and contributing to better patient outcomes.
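
The abstract does not say how the three models' outputs are combined. One common option, shown here purely as an assumption, is soft voting: average each label's predicted probability across the models and keep the labels whose mean reaches a threshold. The function name, label names, probabilities, and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, List, Set


def soft_vote(model_probs: List[Dict[str, float]], threshold: float = 0.5) -> Set[str]:
    """Average per-label probabilities across models, then threshold the mean.

    model_probs holds one dict per model, mapping label -> probability for a
    single sentence. All dicts are assumed to share the same label keys.
    """
    if not model_probs:
        raise ValueError("at least one model output is required")
    labels = model_probs[0].keys()
    mean = {
        label: sum(probs[label] for probs in model_probs) / len(model_probs)
        for label in labels
    }
    return {label for label, value in mean.items() if value >= threshold}


# Hypothetical outputs of three fine-tuned models for one sentence.
outputs = [
    {"class_a": 0.9, "class_b": 0.2},
    {"class_a": 0.6, "class_b": 0.4},
    {"class_a": 0.3, "class_b": 0.7},
]
print(soft_vote(outputs))  # {'class_a'}
```

Here class_a averages 0.6 and is kept, while class_b averages about 0.43 and is dropped even though one model favoured it. Majority voting over each model's hard predictions would be an equally plausible reading of "ensemble".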
