What is the clinical evidence from this study?

Study design: Cross-Sectional. Population: Adult cancer patients and survivors (n=1,642). Intervention: LLM-driven framework (RAG architecture with Mistral 7B) vs. quantitative analyses of structured burden ratings. Primary outcome: Analytical concordance with quantitative analyses across 23 hypotheses.

What question did this study set out to answer?

The study evaluated whether an LLM-driven framework can assess multidimensional cancer burden from patients' free-text narratives.

May 29, 2026

Representing people's experience of cancer and its treatment (ResPECT) pilot study: A global collaborative study to improve cancer care and outcomes as part of the Lancet Commission on Cancer and Health Systems.

Key Result

An LLM-driven framework analyzing unstructured patient narratives reproduced quantitative findings with 91.3% concordance, 90% contextual correctness, and a 3% hallucination rate.

Key Points

The study aims to evaluate the effectiveness of an AI-driven framework in analyzing cancer burden from patient narratives.
Participants provided both structured burden ratings and open-text narratives regarding cancer impacts.
A retrieval-augmented generation (RAG) architecture was used to analyze the unstructured text.
Performance evaluation focused on concordance, contextual correctness, and hallucination rate.
The LLM-based analysis matched quantitative findings in 21 of 23 hypotheses, achieving 91.3% concordance.
Impact on psychological well-being was rated significantly higher than impact on physical, social, or financial well-being (all p≤0.001).
Contextual correctness was 90%, with hallucinations appearing in only 3% of outputs.
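
The three headline percentages follow directly from the counts reported in the abstract (21 of 23 hypotheses, 90 of 100 and 3 of 100 responses). The short check below is an illustrative sketch only; the helper name `rate` is hypothetical and not part of the study's code.

```python
def rate(count: int, total: int) -> float:
    """Return count/total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)

# Counts as reported in the abstract.
concordance = rate(21, 23)              # hypotheses reproduced by the LLM-based analysis
contextual_correctness = rate(90, 100)  # responses judged contextually correct
hallucination_rate = rate(3, 100)       # responses containing a hallucination

print(concordance, contextual_correctness, hallucination_rate)
```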

Study Design

Type

Cross-Sectional (n=1,642)

Multicenter

Yes

Structured PICO

Can an LLM-driven framework reliably assess multidimensional cancer burden from unstructured patient narratives compared to structured quantitative ratings?

Population

1,642 adult cancer patients and survivors, 66.9% female, 22.0% identifying as part of an underrepresented group, from 23 countries.

Intervention

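LLM-driven framework using a retrieval-augmented generation (RAG) architecture to analyze unstructured patient narratives: dense semantic embeddings (Nomic Embed), an open-weight LLM (Mistral 7B), and an ensemble retriever combining vector similarity search with BM25 retrieval.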
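
To make the ensemble retriever idea concrete, here is a minimal, dependency-free sketch of combining a BM25 ranking with a vector-similarity ranking. It rests on several assumptions that are not taken from the study: the two rankings are merged with reciprocal rank fusion (the abstract does not say how they were combined), a bag-of-words cosine similarity stands in for the Nomic Embed dense embeddings, the generation step with Mistral 7B is omitted, and the example texts are invented placeholders rather than patient data. All function names are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a standard BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def cosine_scores(query, docs):
    """Assumption: word-count vectors stand in for dense embeddings."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(x * x for x in q.values()))
    scores = []
    for d in docs:
        v = Counter(tokenize(d))
        v_norm = math.sqrt(sum(x * x for x in v.values()))
        dot = sum(q[t] * v[t] for t in q)
        scores.append(dot / (q_norm * v_norm) if q_norm and v_norm else 0.0)
    return scores

def ranking(scores):
    """Document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def ensemble_retrieve(query, docs, top_k=2, c=60):
    """Assumption: merge the two rankings with reciprocal rank fusion."""
    fused = [0.0] * len(docs)
    for scores in (bm25_scores(query, docs), cosine_scores(query, docs)):
        for rank, idx in enumerate(ranking(scores), start=1):
            fused[idx] += 1.0 / (c + rank)
    return [docs[i] for i in ranking(fused)[:top_k]]

# Invented placeholder texts, not study data.
docs = [
    "placeholder text about the financial cost of treatment",
    "placeholder text about anxiety and emotional stress",
    "placeholder text about fatigue after chemotherapy",
]
top = ensemble_retrieve("emotional stress", docs, top_k=1)
print(top)
```

In a full RAG pipeline, the retrieved passages would then be passed to the language model as context for answering each question; that step is outside the scope of this sketch.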

Comparator

Quantitative analyses of structured burden ratings (Likert scale, 1-10).

Outcome

Analytical concordance with quantitative analyses across 23 hypotheses; contextual correctness and hallucination rate over 50 questions, assessed by two independent reviewers.

An LLM-driven framework can reliably extract and scale qualitative insights regarding cancer burden from unstructured patient narratives, achieving high concordance with structured quantitative assessments.

Limitations

The current pilot is not representative of cancer patients worldwide.

Abstract

1592

Background: Measurement of cancer burden increasingly informs survivorship research, health system design, and policy. Existing approaches rely on structured patient-reported outcome measures, which are costly to administer, culturally constrained, and difficult to scale globally. Advances in artificial intelligence (AI), particularly large language models (LLMs), offer the opportunity to systematically analyze unstructured patient narratives at scale. The ResPECT pilot study evaluates whether an LLM-driven framework can assess multidimensional cancer burden from patient-reported free text.

Methods: Adult cancer patients and survivors provided structured burden ratings (Likert scale, 1–10) and open-text narratives describing physical, psychological, social, financial, and other impacts of cancer and its treatment. Unstructured text was analyzed using a retrieval augmented generation (RAG) architecture combining dense semantic embeddings (Nomic Embed), an open-weight LLM (Mistral 7B), and an ensemble retriever integrating vector similarity search and BM25 retrieval. Performance was evaluated across three predefined domains: (1) analytical concordance with quantitative analyses across 23 hypotheses as well as (2) contextual correctness and (3) hallucination rate over 50 questions, assessed by two independent reviewers.

Results: A total of 1,642 participants from 23 countries were analyzed; 66.9% identified as female and 22.0% identified themselves to be part of an underrepresented group. Overall, impact on psychological well-being was ranked significantly higher compared to physical, social, or financial well-being (all p≤0.001). Younger participants and individuals identifying with underrepresented groups reported significantly higher overall and domain-specific burden, while U.S. participants reported greater financial burden than UK participants (p≤0.001). The LLM-based analyses reproduced quantitative findings in 21 of 23 hypotheses (91.3% concordance). Contextual correctness was 90% (90 of 100), and hallucinations occurred in 3% (3 of 100) of generated responses, predominantly involving minor paraphrasing of patient quotations. The LLM consistently identified key themes, particularly related to psychological well-being, including emotional stress, depression, relationship strain, isolation, guilt, and hope.

Conclusions: The ResPECT initiative demonstrates that an LLM-driven framework can reliably provide and effectively scale qualitative insights into cancer burden from a patient perspective. While we acknowledge that the current pilot is not representative of cancer patients worldwide, the ResPECT approach may be scalable to derive comparable and actionable insights to inform investment in cancer services around the world.
