Abstract Objective: To build a reproducible digital slide archival pipeline and tissue image repository for genitourinary cancers, using archival slides from a large cancer research organization, for digital pathology applications and biomarker discovery. Background: Digital pathology applies computer vision to digitized H0.001% damage rate, (iii) high-fidelity scanning using hardware capable of penetrating dirt/film layers commonly found on aging slides, (iv) a real-time, scalable review process via the HistoWiz PathologyMap platform, and (v) optical character recognition (OCR) for handwritten slide labels. Results: Archival slides spanning more than 20 years of collection (2001 to 2025) were scanned at 900 slides per day on two high-throughput scanning clusters, with a 90% initial QC pass rate. A representative subset of images was independently evaluated by a pathologist for issues including tissue folds, cracking, air bubbles, blebs, ink annotations, and out-of-focus regions. We found that most QC failures were caused by four recurring issues: tissue folding from microtomy (30%), coverslip issues (10%), mounting media residue (2%), and handwritten annotations or processing artifacts (1.5-12.5%). Corrective measures, including rescanning, xylene- or alcohol-based coverslip cleaning, and recoverslipping, resolved 95% of cases with coverslip or mounting media issues, with limited tissue loss. Final slides were stored as pyramidized OME-TIFF files at 0.248 µm/pixel, averaging 2 GB per slide. In an offline Python pipeline, a Google Vision-derived OCR model yielded a superior barcode read rate and accuracy on histology slides, outperforming Microsoft’s OCR, Tesseract (v4+), and a Keras-based CRNN baseline. Conclusion: Our large-volume, high-throughput WSI archival workflow delivers a scalable imaging pipeline that digitized a 20-year academic tissue biobank and provides on-demand access to AI-ready, high-quality, full-resolution slide images.
This digital image repository will support large-scale digital pathology research to discover biomarkers in genitourinary cancer.

Citation Format: Faria Kabir, Mitra Shavakhi, Akash Parvatikar, Adrien Cesaire, Egypt Phillips, Rachel Trowbridge, Liliana Ascione, Pablo Barrios, Marc Eid, Jasmine Lee, Sabina Signoretti, Eliezer Van Allen, Atish Choudhury, Linh Hoang, Toni K. Choueiri, Jeremiah Wala. A high-throughput slide scanning pipeline for digitizing and standardizing legacy genitourinary cancer slides [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 1452.