Introduction
N6-methyladenosine (m6A) is a pivotal RNA modification involved in diverse biological and pathological processes. Compared with m6A detection methods based on second-generation sequencing, Nanopore direct RNA sequencing (DRS) offers the unique advantage of capturing native modifications.

Methods
Here, we present Nanopore-m6A-Finder (NP-mFinder), a reference-free computational framework for m6A prediction that employs an XGBoost model in the mRNA exonic region and a hard-voting ensemble of XGBoost and random forest models in the poly(A) region.

Results and discussion
NP-mFinder can determine m6A sites as well as estimate their methylation levels from Guppy-basecalled DRS data. After training with DRS data of in vitro-transcribed RNA, NP-mFinder achieved high performance on held-out test datasets (area under the curve (AUC) ≈0.90; accuracy, precision, recall, and F1-score 0.80). Compared with canonical m6A detection methods, it recovered 20% of meRIP-seq-defined m6A sites in yeast, and 27% of our HEK293 site predictions overlapped with miCLIP calls. Although single-base overlap with the existing DRS-based tools EpiNano and mAFiA was limited, 73% of our identified m6A-containing genes were validated by at least one of them. Benchmarking our method against GLORI v2.0 revealed concordance of 28% at the site level and 85% at the gene level, as well as a mild correlation in estimated m6A levels. Notably, NP-mFinder achieved 93% precision in detecting m6A within the "AAAAA" sequence context in the mRNA exonic region of HEK293T DRS data when compared with the high-confidence m6A site annotation in GLORI v2.0, demonstrating good performance in regions containing a stretch of consecutive adenosines. Moreover, our method predicted that m6A might exist in the human HEK293 poly(A) region, suggesting a possibly conserved phenomenon of a modified poly(A) tail beyond the previously reported T. brucei variant surface glycoprotein (VSG) transcripts. Together, these results establish NP-mFinder as a robust and versatile tool for transcriptome-wide m6A profiling with DRS data at single-read resolution.
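To make the hard-voting idea concrete, the following is a minimal sketch and not NP-mFinder's actual implementation. It assumes each model emits a binary per-read call (1 = m6A, 0 = unmodified). With only two voters, hard voting needs a tie-break rule that the abstract does not specify, so the sketch assumes a conservative default in which disagreements are called unmodified. It also assumes the site-level methylation level is the fraction of reads called modified. The function names, the `tie_label` parameter, and the example data are all hypothetical.

```python
from collections import defaultdict


def hard_vote(labels_a, labels_b, tie_label=0):
    """Combine two models' binary per-read calls by hard voting.

    With two voters, agreement decides the label; a disagreement is a tie
    and falls back to tie_label (an assumption of this sketch).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both models must score the same reads")
    return [a if a == b else tie_label for a, b in zip(labels_a, labels_b)]


def methylation_levels(site_ids, read_calls):
    """Estimate each site's methylation level as the fraction of its reads
    called modified (assumed definition, for illustration only)."""
    modified = defaultdict(int)
    total = defaultdict(int)
    for site, call in zip(site_ids, read_calls):
        total[site] += 1
        modified[site] += call
    return {site: modified[site] / total[site] for site in total}


# Hypothetical per-read calls from the two models for six reads over two sites.
xgb_calls = [1, 1, 0, 1, 0, 0]
rf_calls = [1, 0, 0, 1, 1, 0]
sites = ["s1", "s1", "s1", "s2", "s2", "s2"]

votes = hard_vote(xgb_calls, rf_calls)
levels = methylation_levels(sites, votes)
print(votes)
print(levels)
```

Setting `tie_label=1` would instead favor recall over precision; which trade-off suits the poly(A) region is a design decision that this sketch leaves open.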
Yang et al. studied this question.