What type of study is this?

This is a quantitative study.

September 19, 2025

Abstract C036: From pixels to prognosis: Attention-based chain of thought reasoning for automated spine cancer image analysis

Key Points

Caption-generation accuracy for spine cancer images improved nearly threefold with a Chain-of-Thought process.
BLEU-4 scores for model outputs increased significantly, indicating improved alignment between AI captions and expert annotations.
Adopting a five-step Chain-of-Thought reasoning approach allows for transparent AI processing, boosting clinician trust in image evaluations.
Attention mechanisms provide insights into AI decision-making, allowing for traceability of errors and reinforcing regulatory compliance.
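For readers unfamiliar with the metric named above, BLEU-4 scores a generated caption by its 1- to 4-gram overlap with a reference caption, scaled by a brevity penalty. The sketch below is a minimal, unsmoothed, single-reference version for illustration only. The function name `bleu4` and the whitespace tokenization are assumptions, not the evaluation code used in the study, which the abstract does not describe.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Unsmoothed sentence-level BLEU-4 against one reference (illustrative)."""
    cand = candidate.lower().split()  # assumption: simple whitespace tokens
    ref = reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates that are shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / 4)
```

A caption identical to its reference scores 1.0, and a caption sharing no words with it scores 0.0. Production evaluations typically add smoothing and corpus-level aggregation, which this sketch omits.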

Abstract

Current vision-large language models (V-LLMs) for spinal oncology imaging take a black-box approach to generating a caption from an MRI or CT image, which conflicts with how a radiologist interprets the same images. These models take the pixels, traverse a billion-plus-parameter latent space, and generate a caption in one shot. This “all-at-once” approach ignores the multi-pass workflow that expert radiologists follow: checking image quality and vertebral levels, mapping baseline anatomy, surveying for disease, measuring epidural tumor extension and spinal-canal compromise, and then integrating these findings into a structured report that drives surgical or radiotherapy decisions. Because the vision model’s intermediate reasoning remains hidden, clinicians cannot confirm that the AI model has examined every clinically critical cue, nor can they trace the origin of errors when the model misses a small sacral lesion or overstates the degree of canal stenosis. This opacity limits trust, complicates regulatory review, and ultimately slows the adoption of AI in oncologic imaging, including clinical adoption at scale. A total of 1978 expert-captioned studies (radiographs, CT, and MRI) were collected from the public ROCO-v2 corpus, which includes spine images. The proposed solution is an inference-time, five-step Chain-of-Thought (CoT) reasoning procedure that guides a fine-tuned 7-billion-parameter vision-language model to generate the following: Image-quality

2025 Sep 18-21; Baltimore, MD. Philadelphia (PA): AACR; Cancer Epidemiol Biomarkers Prev 2025;34(9 Suppl):Abstract nr C036.
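The staged, inference-time prompting described in the abstract can be sketched as a loop in which each step sees the outputs of the steps before it, so the intermediate reasoning stays inspectable. This is a hedged illustration rather than the authors' implementation. The abstract's own list of five outputs is cut off, so the stage names below are an assumption borrowed from the radiologist workflow it describes, and `generate` is a hypothetical callable standing in for whatever vision-language model is used.

```python
from typing import Callable, List, Tuple

# Assumption: stages paraphrase the radiologist workflow in the abstract;
# the study's actual five CoT steps are not fully listed in the source.
STAGES: List[Tuple[str, str]] = [
    ("quality_and_levels", "Assess image quality and identify the vertebral levels shown."),
    ("baseline_anatomy", "Describe the baseline anatomy."),
    ("disease_survey", "Survey the image for disease."),
    ("measurements", "Estimate epidural tumor extension and spinal-canal compromise."),
    ("structured_report", "Integrate the findings above into a structured report."),
]


def run_chain(image: object, generate: Callable[[object, str], str]) -> List[Tuple[str, str]]:
    """Run the stages in order, feeding earlier outputs into later prompts.

    `generate(image, prompt)` is a hypothetical model call, not a real API.
    Returns the full trace as (stage_name, output) pairs for later review.
    """
    trace: List[Tuple[str, str]] = []
    for name, instruction in STAGES:
        # Earlier stage outputs become context for the current stage.
        context = "\n".join(f"{n}: {out}" for n, out in trace)
        prompt = f"{context}\n\nStep: {instruction}" if context else f"Step: {instruction}"
        trace.append((name, generate(image, prompt)))
    return trace
```

Keeping the whole trace, rather than only the final report, is what would let a reviewer see at which stage a missed lesion or an overstated stenosis first appeared.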
