Integrating whole-slide images (WSIs) and bulk transcriptomics for predicting patient survival can improve our understanding of patient prognosis. However, this multimodal task is particularly challenging because the two data types differ in nature: WSIs represent a very high-dimensional spatial description of a tumor, while bulk transcriptomics represent a global description of gene expression levels within that tumor. In this context, our work aims to address two key challenges: (1) how can we tokenize transcriptomics in a semantically meaningful and interpretable way, and (2) how can we capture dense multimodal interactions between these two modalities? Here, we propose to learn biological pathway tokens from transcriptomics that can encode specific cellular functions. Together with histology patch tokens that encode the slide morphology, we argue that they form appropriate reasoning units for interpretability. We fuse both modalities using a memory-efficient multimodal Transformer that can model interactions between pathway and histology patch tokens. Our model, SurvPath, achieves state-of-the-art performance when evaluated against unimodal and multimodal baselines on five datasets from The Cancer Genome Atlas. Our interpretability framework identifies key multimodal prognostic factors and, as such, can provide valuable insights into the interaction between genotype and phenotype. Code available at https://github.com/mahmoodlab/SurvPath.
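To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a pathway token is a linear projection of the expression values of the genes in that pathway, and that fusion can be illustrated with a single cross-attention step in which pathway tokens attend to histology patch tokens. The function names (`tokenize_pathways`, `cross_attention`), the gene groupings, the random weights, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def tokenize_pathways(expression, pathways, weights):
    """Build one token per pathway from a bulk expression profile.

    expression: dict mapping gene name -> expression value
    pathways:   dict mapping pathway name -> list of gene names
    weights:    dict mapping pathway name -> (n_genes, dim) projection matrix
    """
    tokens = []
    for name, genes in pathways.items():
        x = np.array([expression[g] for g in genes])  # genes of this pathway
        tokens.append(x @ weights[name])              # project to a dim-sized token
    return np.stack(tokens)


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys, values):
    """Scaled dot-product attention from queries to keys/values."""
    dim = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(dim)
    attn = softmax(scores, axis=-1)
    return attn @ values, attn


# Toy inputs (assumed values, not real data).
rng = np.random.default_rng(0)
dim = 4
expression = {"TP53": 2.1, "MDM2": 0.4, "EGFR": 1.3, "KRAS": 0.9, "BRAF": 0.2}
pathways = {
    "p53_signaling": ["TP53", "MDM2"],
    "mapk_signaling": ["EGFR", "KRAS", "BRAF"],
}
weights = {name: rng.normal(size=(len(genes), dim)) for name, genes in pathways.items()}
patch_tokens = rng.normal(size=(6, dim))  # stand-in for histology patch embeddings

pathway_tokens = tokenize_pathways(expression, pathways, weights)
fused, attn = cross_attention(pathway_tokens, patch_tokens, patch_tokens)
```

In this sketch the attention matrix has shape (number of pathways, number of patches), so each row can be read as how strongly one pathway attends to each patch. Because a slide yields far more patches than there are pathways, this matrix is much smaller than a full patch-to-patch attention matrix, which is one plausible reading of "memory-efficient" here.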
Guillaume Jaume
Anurag Vaidya
Richard J. Chen
Mass General Brigham
Jaume et al. (Sun,) studied this question.
www.synapsesocial.com/papers/6a0cad9b6ee14e9a1e88689c — DOI: https://doi.org/10.1109/cvpr52733.2024.01100