What question did this study set out to answer?

The study aimed to create a unified model for interpreting gene expression and histology in lung cancer.

April 5, 2026

Abstract 2754: VGL: Vision-Gene-Language multimodal LLM integrating histopathology and gene expression for cell type classification in lung cancer.

Key Points

The aim is to create a unified model for interpreting gene expression and histology in lung cancer.
Developed a multimodal LLM, built on MedGemma-4b-it, within a Vision-Gene-Language (VGL) framework.
Trained on 5.2 million multimodal samples including histological patches and gene expression profiles.
Utilized multi-task learning for image-to-gene, gene-to-cell type, and related tasks.
Employed QLoRA for parameter-efficient training on four H100 GPUs.
Achieved 70.07% accuracy for cell type classification, compared with 16.32% for a naive model.
Validation showed 69.85% accuracy, indicating strong generalization.
Demonstrated effective use of cross-modal learning and consistent biological outputs.
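
The accuracy figures above can be put in context with a little arithmetic: the reported 70.07% is 53.75 percentage points above the naive model's 16.32%, and differs from the 69.85% validation figure by 0.22 points. The sketch below shows how such a comparison is typically computed. It assumes a majority-class predictor as the naive baseline, which is only one common choice; the abstract does not say how its naive model was defined. The toy labels and the function names are invented for illustration and are not taken from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must be non-empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_train, n):
    """Predict the most frequent training label for each of n samples.

    Assumption: this stands in for a "naive model"; the study's own
    baseline may have been defined differently.
    """
    most_common_label, _ = Counter(y_train).most_common(1)[0]
    return [most_common_label] * n


# Toy labels, invented purely for illustration (not data from the study).
y_true = ["tumor", "T cell", "tumor", "fibroblast", "tumor", "B cell"]
y_model = ["tumor", "T cell", "tumor", "tumor", "tumor", "B cell"]
y_naive = majority_baseline(y_true, len(y_true))

model_acc = accuracy(y_true, y_model)  # 5 of 6 correct
naive_acc = accuracy(y_true, y_naive)  # 3 of 6 correct

# Figures reported in the Key Points, in percent.
reported_model, reported_naive, reported_val = 70.07, 16.32, 69.85
gain_points = reported_model - reported_naive     # gap to the naive model
val_gap_points = reported_model - reported_val    # gap to validation

print(f"toy model accuracy: {model_acc:.3f}, toy baseline: {naive_acc:.3f}")
print(f"reported gain: {gain_points:.2f} points, "
      f"validation gap: {val_gap_points:.2f} points")
```

A small gap between the two reported accuracies is what the Key Points mean by generalization; the toy example only illustrates the mechanics of the comparison.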

Abstract

Background: Understanding the tumor microenvironment requires models that resolve cellular heterogeneity across molecular and spatial modalities. With the expansion of spatial transcriptomics, single-cell RNA-seq, and high-resolution histopathology imaging, there is a need for a unified foundation model that jointly interprets gene expression, spatial context, and visual tissue features. We developed a multimodal large language model (LLM) that integrates these modalities into a single adaptive framework handling heterogeneous inputs (including gene expression profiles, spatial transcriptomics spots, single-cell measurements, and histology patches) while generating harmonized outputs such as genes, cell types, and image-derived descriptors.

Method: We built a multimodal LLM within a Vision-Gene-Language (VGL) framework that integrates gene expression, histology images, and biological language representations. The model is based on MedGemma-4b-it and was fine-tuned using QLoRA for parameter-efficient training. Training used 5.2 million multimodal samples of H [...]

Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 2754.
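
To illustrate why QLoRA, as named in the Method, counts as parameter-efficient: it keeps the base model's weights frozen in 4-bit form and trains only small low-rank adapter matrices. The sketch below works through the arithmetic for a single weight matrix. The layer size and rank are hypothetical values chosen for illustration, not the study's configuration, and the 4-billion parameter count is only the nominal size suggested by the model name. The memory estimate ignores quantization constants, activations, and optimizer state.

```python
def full_param_count(d_in, d_out):
    """Trainable parameters when a d_out x d_in weight matrix is fully fine-tuned."""
    return d_in * d_out


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters for a low-rank adapter on the same matrix.

    The update is B @ A, with A of shape (rank, d_in) and B of shape
    (d_out, rank), so only rank * (d_in + d_out) values are trained.
    """
    return rank * (d_in + d_out)


def weight_bytes(n_params, bits_per_param):
    """Approximate storage for the weights alone at a given precision."""
    return n_params * bits_per_param // 8


# Hypothetical layer size and adapter rank (assumptions, not from the abstract).
d_in = d_out = 4096
rank = 16

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
fraction = lora / full

# Nominal 4-billion-parameter base model (assumption based on the "4b" name).
n_base = 4_000_000_000
bytes_16bit = weight_bytes(n_base, 16)
bytes_4bit = weight_bytes(n_base, 4)

print(f"full fine-tuning: {full:,} trainable parameters in this layer")
print(f"rank-{rank} adapter: {lora:,} trainable parameters ({fraction:.2%})")
print(f"base weights: {bytes_16bit / 1e9:.0f} GB at 16-bit, "
      f"{bytes_4bit / 1e9:.0f} GB at 4-bit")
```

Under these assumed sizes, the adapter trains under one percent of the layer's parameters, and 4-bit storage cuts the frozen weights to a quarter of their 16-bit footprint.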
