What question did this study set out to answer?

This research aims to develop a standardized framework for assessing the performance of single-cell perturbation models.

May 14, 2026 | Open Access

scArchon: a scalable benchmarking framework for assessing single-cell perturbation models

Key Points

Developed scArchon, a modular benchmarking platform using Snakemake.
Evaluated multiple perturbation response prediction tools across six RNA-seq datasets.
Assessed model performance with a mix of statistical and biological metrics.
trVAE, scGen, scPRAM, and scVIDR showed robust performance across datasets.
Some methods underperformed compared to linear or control baselines.
Quantitative scores did not always correlate with retention of biological perturbation signatures.

Abstract

Background: The accurate prediction of cellular responses to perturbations, such as drug treatments, remains a pivotal challenge in single-cell transcriptomics. While numerous deep learning tools have been developed for this task, their systematic benchmarking across diverse datasets and performance metrics has been limited.

Results: Here, we present scArchon, a reproducible, modular benchmarking platform built on Snakemake. It is designed to evaluate perturbation response prediction tools in an unbiased and extensible manner. Employing six representative single-cell RNA-seq datasets, we compare leading methods such as scGen, CPA, trVAE, scPRAM, scVIDR, scDisInFact, SCREEN, scPreGAN, and CellOT against baselines. We assess model performance using a composite of statistical and biological metrics. Our analysis reveals heterogeneous performance. While methods like trVAE, scGen, scPRAM, and scVIDR achieve robust results across multiple datasets, other tools occasionally underperform even compared to linear or control baselines. Notably, models with favorable quantitative scores may fail to retain key biological perturbation signatures, underscoring the need for gene-level evaluation.

Conclusions: scArchon provides a unified, extensible foundation for large-scale, standardized benchmarking of perturbation prediction tools, facilitating methodological transparency and accelerating development in this rapidly evolving field. We encourage adoption of scArchon and sharing of containerized tools to drive progress in single-cell perturbation modeling.
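The gap between aggregate scores and gene-level signatures noted in the Results can be made concrete with a toy example. The sketch below is not taken from scArchon: the two metrics (squared Pearson correlation of mean expression, and overlap of the top-k most shifted genes) and all names and numbers are assumptions chosen for illustration. It builds a synthetic prediction that shifts the wrong genes and compares both scores.

```python
import numpy as np

def mean_r2(pred_mean, true_mean):
    """Squared Pearson correlation between per-gene mean expression vectors."""
    r = np.corrcoef(pred_mean, true_mean)[0, 1]
    return r ** 2

def top_k_overlap(pred_mean, true_mean, ctrl_mean, k):
    """Fraction of the k most shifted genes (vs. control) shared by prediction and truth."""
    true_top = set(np.argsort(-np.abs(true_mean - ctrl_mean))[:k])
    pred_top = set(np.argsort(-np.abs(pred_mean - ctrl_mean))[:k])
    return len(true_top & pred_top) / k

# Hypothetical toy data: 100 genes with a wide range of baseline expression.
ctrl_mean = np.linspace(0.0, 10.0, 100)

# Assumed ground truth: the perturbation shifts genes 0-4.
true_mean = ctrl_mean.copy()
true_mean[:5] += 2.0

# Prediction that shifts the wrong genes (50-54).
pred_wrong = ctrl_mean.copy()
pred_wrong[50:55] += 2.0

# Prediction that shifts the right genes, with a smaller magnitude.
pred_good = ctrl_mean.copy()
pred_good[:5] += 1.5

for name, pred in [("wrong genes", pred_wrong), ("right genes", pred_good)]:
    print(name,
          "R2 =", round(mean_r2(pred, true_mean), 3),
          "top-5 overlap =", top_k_overlap(pred, true_mean, ctrl_mean, 5))
```

In this toy setting both predictions correlate strongly with the true means, because baseline expression dominates the variance across genes, yet only one of them recovers any of the genes that actually responded. That is the kind of discrepancy a gene-level check is meant to expose.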
