This deposition contains the analysis code and supporting data used in the manuscript "Chemically-induced skin tumors arise from long-lived stem cells of the upper hair follicle" by Kandyba et al. (Science, provisionally accepted; manuscript ID adv8291).

The repository is organized by analysis/figure, with each subfolder containing code from a specific contributor along with a README describing inputs, outputs, and how to run the code:

• HNSCC_analysis/: single-cell RNA-seq analysis of head and neck squamous cell carcinoma (GEO: GSE181919, Choi et al.). Standard Seurat workflow projecting the mouse Lgr6 carcinoma metagene onto HNSCC patients via mouse-to-human ortholog conversion (MGI), with metagene scoring by AddModuleScore. Used in Fig. S5. (R; author: Andrea Curtabbi)

• human_cSCC_analysis/: Monocle3 analysis of human cutaneous squamous cell carcinoma (GEO: GSE144236, Ji et al.) and the matched mouse cSCC dataset (GEO: GSE261766). PCA (100 components), UMAP (cosine; min_dist=0.1, n_neighbors=15), Leiden clustering, patient-level alignment via align_cds, and projection of the Lgr6 carcinoma metagene and the Tsk signature from Ji et al. Used in Fig. S5. (R; author: Mark Taylor)

• cell_of_origin_scRNAseq/: single-cell RNA-seq analysis of mouse Lgr6CreER-eGFP backskin (GEO: GSE261766) generated for this study, used to identify the cell of origin. CellRanger 3.1 + Seurat 3.2 pipeline with QC filtering (200–3000 genes/cell, <5% mitochondrial), yielding 3,163 cells × 12,768 genes; HVG selection, PCA, UMAP (top 15 PCs), and Louvain clustering. Used in Fig. 4 and Figs. S9–S11, S17–S18. (R; author: Yun Rose Li)

• figure_6B_heatmap/: self-contained reproducibility package for Fig. 6B: a log-scale heatmap of mutated cells per million across treatment groups, broken down by RAS mutation (Hras/Kras/Nras at codons 12, 13, and 61). Includes the visualization script, input data files (per-sample mutation calls and sequencing depths produced by deepUMIcaller and deepCSA), and reference output figures. (Python; author: Ferriol Calvet)

Software requirements: R ≥ 4.1 with Seurat, Monocle 3, and supporting packages for the R analyses (cell-of-origin scripts originally run with Seurat 3.2 specifically); Python ≥ 3.9 with pandas, matplotlib, seaborn, and numpy for the Fig. 6B heatmap. See the top-level README.md for the complete dependency list and per-folder READMEs for contribution-specific notes.

The deposition is released under the Creative Commons Attribution 4.0 International (CC-BY-4.0) license.
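For orientation, the step from per-sample mutation calls and sequencing depths to the log-scale values shown in a heatmap like Fig. 6B might look roughly like the sketch below. It is not the script shipped in the figure folder. The record layout, the group names, the counts, and the functions per_million and log_matrix are all hypothetical. It also assumes that the rate is simply mutant reads divided by depth, scaled to one million, which ignores any read-to-cell conversion the actual pipeline may apply. Only the Python standard library is used so the sketch runs without extra packages.

```python
import math
from collections import defaultdict

# Hypothetical records: (treatment_group, mutation, mutant_reads, depth).
# All names and numbers are invented for illustration only.
CALLS = [
    ("untreated", "Hras codon 61", 2, 1_000_000),
    ("untreated", "Kras codon 12", 0, 1_000_000),
    ("treated", "Hras codon 61", 150, 500_000),
    ("treated", "Kras codon 12", 5, 500_000),
    ("treated", "Hras codon 61", 50, 500_000),  # second sample, same group
]


def per_million(calls):
    """Pool reads and depth per (mutation, group), then scale to 1e6.

    Assumption: mutant reads / depth is used as a stand-in for the
    fraction of mutated cells.
    """
    alt = defaultdict(int)
    depth = defaultdict(int)
    for group, mutation, alt_reads, total_depth in calls:
        alt[(mutation, group)] += alt_reads
        depth[(mutation, group)] += total_depth
    return {k: 1e6 * alt[k] / depth[k] for k in alt if depth[k] > 0}


def log_matrix(rates, floor=0.1):
    """Return (mutations, groups, matrix) with log10 rates.

    Rows are mutations and columns are groups, both sorted by name.
    Rates below `floor` (including zero and missing entries) are
    clipped to `floor` so that log10 is always defined.
    """
    mutations = sorted({m for m, _ in rates})
    groups = sorted({g for _, g in rates})
    matrix = [
        [math.log10(max(rates.get((m, g), 0.0), floor)) for g in groups]
        for m in mutations
    ]
    return mutations, groups, matrix


if __name__ == "__main__":
    mutations, groups, matrix = log_matrix(per_million(CALLS))
    print("groups:", groups)
    for name, row in zip(mutations, matrix):
        print(name, [round(v, 2) for v in row])
```

The returned matrix could then be passed to a heatmap routine such as seaborn's heatmap function, with the row and column labels taken from the two returned lists. Clipping at a floor is one of several ways to handle zero counts on a log scale, and the floor value here is arbitrary.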