Abstract

The Single-cell Pediatric Cancer Atlas (ScPCA) Portal (https://scpca.alexslemonade.org/), developed and maintained by the Childhood Cancer Data Lab, is a data resource for uniformly processed single-cell and single-nuclei RNA sequencing data, as well as de-identified metadata from pediatric tumor samples. Originally composed of data from 10 projects funded by Alex’s Lemonade Stand Foundation (ALSF), the Portal currently contains summarized gene expression data for over 700 samples across more than 50 cancer types drawn from ALSF-funded and community-contributed datasets. Downloads include gene expression data as SingleCellExperiment or AnnData objects containing raw and normalized counts, PCA and UMAP coordinates, and summary reports. Some samples have additional data from bulk RNA-seq, spatial transcriptomics, and/or feature barcoding (e.g., CITE-seq and cell hashing) included in the download. All data on the Portal were uniformly processed using scpca-nf, an efficient and open-source Nextflow workflow written and maintained by the Data Lab, which utilizes alevin-fry to quantify gene expression. Since presenting the ScPCA Portal at the 2024 AACR Annual Meeting, several new features have been added to the available data. Automated cell type annotation is now performed using three methods: SingleR, CellAssign, and SCimilarity. If at least two of the three methods agree, an ontology-aware consensus cell type label is assigned. The individual annotations and the consensus cell types are included in the cell metadata of the downloaded objects. Some projects also include manually curated cell type annotations generated as part of the OpenScPCA project (https://openscpca.readthedocs.io). In addition, copy-number variation (CNV) inference is now performed on each sample using the InferCNV package, specifying the i6 HMM to quantify specific CNV events.
Since InferCNV quantifies CNV events using a designated set of normal, or non-malignant, reference cells, consensus cell types are used to identify a diagnosis-appropriate normal cell reference for each sample. The total CNVs observed and the full HMM metadata table are stored in the processed SingleCellExperiment and AnnData objects. The updated cell type annotation and implementation of InferCNV are included as part of the open-source workflow, scpca-nf. The workflow and associated documentation are freely available at https://github.com/AlexsLemonade/scpca-nf. Finally, the ScPCA Portal hosts an instance of the UCSC Cell Browser, enabling users to visualize and interact with the gene expression data for all samples without needing to download the data. Comprehensive documentation about data processing and the contents of files on the Portal, including a guide to getting started working with an ScPCA dataset, can be found at https://scpca.readthedocs.io.

Citation Format: Allegra G. Hawkins, Joshua A. Shapiro, Stephanie J. Spielman, David S. Mejia, Deepashree Venkatesh Prasad, Nozomi Ichihara, Arkadii Yakovets, Avrohom M. Gottlieb, Kurt G. Wheeler, Chanté J. Bethell, Steven M. Foltz, Jennifer O'Malley, Casey S. Greene, Jaclyn N. Taroni. Improving the utility of the single-cell pediatric cancer atlas through updated cell type annotations, CNV inference, and visualization tools [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 3498.
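
Illustrative note: the two-of-three agreement rule for consensus cell types described in the abstract can be sketched in Python as a simple majority vote. This is a minimal sketch, not the scpca-nf implementation. It assumes agreement means an exact match between label strings, whereas the workflow's consensus is ontology-aware and may reconcile labels in ways this sketch does not attempt. The function name `consensus_label` and the `"Unknown"` fallback value are hypothetical.

```python
from collections import Counter
from typing import Optional


def consensus_label(
    singler: Optional[str],
    cellassign: Optional[str],
    scimilarity: Optional[str],
    fallback: str = "Unknown",  # hypothetical placeholder for "no consensus"
) -> str:
    """Return the label shared by at least two of the three methods.

    Assumption: labels agree only when the strings are identical.
    An ontology-aware comparison is NOT reproduced here.
    """
    # Ignore methods that produced no annotation for this cell.
    labels = [lab for lab in (singler, cellassign, scimilarity) if lab is not None]
    if not labels:
        return fallback
    # Most frequent label and the number of methods that reported it.
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes >= 2 else fallback


# Two methods agree, so their shared label is returned.
print(consensus_label("T cell", "T cell", "B cell"))  # → T cell
# No two methods agree, so the fallback is returned.
print(consensus_label("T cell", "B cell", "macrophage"))  # → Unknown
```

Treating a missing annotation as a non-vote is a design choice made for this sketch only; the abstract does not state how missing annotations are handled.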