March 3, 2026 · Open Access

SCANBIT facilitates identification of tumor cell populations in scRNAseq data using pseudobulked SNV calls

Key Points

Variant calling from scRNAseq data improves identification of tumor cells, supporting a better understanding of tumor biology.
Pseudobulking data from transcriptionally similar clusters resulted in high-quality variant calls and robust analyses.
Hierarchical clustering with bootstrapping quantified confidence in the genetic divergence between tumor and normal cell populations.
Application of this technique to human samples highlights its potential to reveal biological vulnerabilities for treatment development.

Abstract

We characterized the limitations inherent to calling variants from scRNAseq data, quantifying how data sparsity precludes genetic distance calculation between single cells. As a novel workaround, we pooled data from transcriptionally similar cell clusters to call high-quality variants and then calculated pairwise differences between cell populations and performed hierarchical clustering. We quantified confidence in genetic divergence between tumor and normal cell populations using bootstrapping. We performed extensive validation to assess accurate identification of tumor cells using ground-truth datasets. Application of our method to human scRNAseq samples highlighted the utility of our approach and revealed how mutational burden influences successful tumor cell identification.

Improved cell type assignment in scRNAseq data will facilitate analysis of tumor samples and, in turn, accelerate our understanding of the mechanisms underlying tumor progression and reveal potential biological vulnerabilities that can be exploited to develop improved treatment options.
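The workflow described in the abstract (pairwise differences between pseudobulked cell populations, hierarchical clustering, and bootstrap confidence) can be illustrated with a minimal Python sketch. This is not SCANBIT's implementation: the function names, the toy genotype matrix, the mean-absolute-difference distance, and the average-linkage choice are all assumptions made for illustration. Each row is one transcriptionally defined cluster, each column is one variant site, values are alternate-allele fractions, and NaN marks a site without a call.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def pairwise_distance(geno):
    """Mean absolute genotype difference over sites called in both clusters."""
    n = geno.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            shared = ~np.isnan(geno[i]) & ~np.isnan(geno[j])
            # Assumption: with no shared sites, fall back to the maximum distance.
            d = np.abs(geno[i, shared] - geno[j, shared]).mean() if shared.any() else 1.0
            dist[i, j] = dist[j, i] = d
    return dist


def two_groups(geno):
    """Average-linkage hierarchical clustering, cut into at most two groups."""
    condensed = squareform(pairwise_distance(geno), checks=False)
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=2, criterion="maxclust")


def same_partition(a, b):
    """Compare groupings by co-membership, since cluster labels are arbitrary."""
    return np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :])


def bootstrap_support(geno, n_boot=200, seed=0):
    """Fraction of site-resampled replicates that reproduce the observed split."""
    rng = np.random.default_rng(seed)
    observed = two_groups(geno)
    n_sites = geno.shape[1]
    hits = 0
    for _ in range(n_boot):
        cols = rng.integers(0, n_sites, size=n_sites)  # resample sites with replacement
        hits += same_partition(observed, two_groups(geno[:, cols]))
    return observed, hits / n_boot


# Toy data (hypothetical): 3 normal and 2 tumor clusters, 40 variant sites.
geno = np.zeros((5, 40))
geno[:, :20] = 0.5    # germline heterozygous sites shared by every cluster
geno[3:, 20:] = 0.5   # somatic sites present only in the two tumor clusters
geno[0, 5] = np.nan   # a couple of missing calls
geno[4, 30] = np.nan

labels, support = bootstrap_support(geno)
print(labels, support)
```

In this toy example the clusters that share somatic variants group together, and the bootstrap fraction plays the role of a confidence value for that split. With real data, a low mutational burden would leave few informative sites, which is one way to picture why burden affects successful tumor cell identification.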

Cite This Study

Cannon et al. studied this question.

synapsesocial.com/papers/69a75dd4c6e9836116a28139 https://doi.org/10.64898/2026.01.27.701834
