What question did this study set out to answer?

The aim is to present pAnno, a workflow designed to discover novel coding regions through proteogenomic integration.

June 4, 2026 | Open Access

pAnno: a comprehensive, precise, and fast proteogenomic workflow for the discovery of novel coding regions

Key Points

The aim is to present pAnno, a workflow designed to discover novel coding regions through proteogenomic integration.
Developed an end-to-end workflow integrating genomic, transcriptomic, and proteomic data.
Employed a multi-stage iterative open search strategy for peptide identification.
Utilized an efficient peptide-to-coding sequence mapping algorithm.
Identified 1.73 times more novel proteins in Pyrus, with high sensitivity and accuracy.
Detected 34 times more non-canonical HLA peptides in lung cancer cases.
Achieved genomic localization of novel events with only ~3% additional processing time.

Abstract

Proteogenomics is a transformative approach for deciphering novel coding regions through integration of genomic, transcriptomic, and proteomic data. Here, we present pAnno, an end-to-end workflow designed to uncover hidden protein-coding elements with high precision and efficiency. pAnno generates customized protein databases by integrating multi-omic data, employs a multi-stage iterative open search strategy, and incorporates an efficient peptide-to-coding sequence mapping algorithm. Despite a 50-fold increase in database size, pAnno maintains high sensitivity and accuracy in peptide identification and achieves genomic localization of novel events with only ~3% additional processing time, delivering unprecedented resolution and speed in proteogenomic analysis. By detecting splicing, mutations, and novel protein isoforms, pAnno supports various downstream applications and reveals overlooked events, identifying 1.73 times more novel proteins in Pyrus and 34 times more non-canonical HLA peptides in lung cancer. These capabilities position pAnno as a gold-standard proteogenomic workflow, excelling in non-canonical coding discovery and large-scale database processing.
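The abstract does not spell out how pAnno maps peptides to coding sequences, so the following is only a generic illustration of what "genomic localization" of a peptide involves, not pAnno's algorithm. It is a minimal sketch that assumes the coding sequence is given as 0-based, half-open genomic intervals and that the peptide's offset within the translated protein is already known. The function name `peptide_to_genome` and its parameters are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end) on the genome


def peptide_to_genome(cds_segments: List[Interval], strand: str,
                      aa_start: int, aa_len: int) -> List[Interval]:
    """Map a peptide (offset and length in amino acids) to genomic intervals.

    Hypothetical helper for illustration; assumes cds_segments are the
    non-overlapping coding parts of the exons of a single transcript.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")

    # Walk the segments in transcription order (descending on the minus strand).
    ordered = sorted(cds_segments, reverse=(strand == "-"))

    # Each amino acid covers three nucleotides of the coding sequence.
    nt_start, nt_end = 3 * aa_start, 3 * (aa_start + aa_len)
    cds_length = sum(end - start for start, end in ordered)
    if aa_start < 0 or aa_len <= 0 or nt_end > cds_length:
        raise ValueError("peptide lies outside the coding sequence")

    hits: List[Interval] = []
    offset = 0  # CDS coordinate of the first transcribed base of the segment
    for seg_start, seg_end in ordered:
        seg_len = seg_end - seg_start
        lo = max(nt_start, offset)
        hi = min(nt_end, offset + seg_len)
        if lo < hi:  # the peptide overlaps this segment
            if strand == "+":
                hits.append((seg_start + lo - offset, seg_start + hi - offset))
            else:
                # On the minus strand, CDS offsets count down from seg_end.
                hits.append((seg_end - (hi - offset), seg_end - (lo - offset)))
        offset += seg_len
    return sorted(hits)


# A peptide spanning a splice junction yields one interval per exon.
segments = [(100, 110), (200, 220)]
print(peptide_to_genome(segments, "+", aa_start=2, aa_len=3))
```

A peptide that returns more than one interval spans a splice junction, which is one way a mapping step can expose splicing-related events of the kind the abstract mentions.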
