Binning DNA contigs is necessary for reconstructing metagenome-assembled genomes. Current metagenomic contig binning methods often leverage coverage profiles across multiple related metagenomes and have demonstrated strong performance on co-assembled contigs. However, in single-sample scenarios, where coverage information is scarce, their performance drops significantly, which limits in-depth metagenomic analysis at the level of individual samples. To address this issue, we propose DCVBin, a novel single-sample metagenomic contig binning method that incorporates semantic features extracted from a DNA language model. Specifically, our approach continues pretraining a DNA language model to capture more domain-specific semantic representations, which are then integrated with 4-mer frequencies using a variational autoencoder. Clustering is subsequently performed with the k-means algorithm, with the number of clusters determined from single-copy genes. Experimental results on six publicly available datasets demonstrate that DCVBin achieves high-accuracy single-sample metagenomic binning and outperforms other state-of-the-art methods. Furthermore, DCVBin is incorporated into a disease diagnostic framework that is evaluated on a cohort of gut metagenomes from people with colorectal cancer and healthy individuals. The framework accurately predicts colorectal cancer from gut metagenomes and identifies a list of potential microbial biomarkers.
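The 4-mer frequency feature mentioned in the abstract can be illustrated with a minimal sketch. This is not the DCVBin implementation: the function name `kmer_frequencies`, the decision to skip windows containing ambiguous bases, and the use of all 256 4-mers (without merging reverse complements) are assumptions made for illustration only.

```python
from itertools import product

K = 4  # k-mer length named in the abstract (4-mers)

# All 4^K possible k-mers over the DNA alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def kmer_frequencies(seq):
    """Return the normalized 4-mer frequency vector of a contig.

    Assumption: windows containing characters other than A/C/G/T
    (for example N) are skipped rather than imputed.
    """
    counts = [0] * len(KMERS)
    seq = seq.upper()
    for i in range(len(seq) - K + 1):
        idx = INDEX.get(seq[i:i + K])
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        # Contig shorter than K or made only of ambiguous bases.
        return [0.0] * len(KMERS)
    return [c / total for c in counts]


if __name__ == "__main__":
    vec = kmer_frequencies("ACGTACGT")
    print(len(vec), vec[INDEX["ACGT"]])
```

A vector of this kind is what a composition-based binner would feed, alongside other features, into a downstream encoder and clustering step; how DCVBin combines it with language-model embeddings is described only at the level of the abstract above.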
Yì Wáng
University of Stuttgart
Yifan Liu
Jilin University
F M Liu
Union Hospital
Briefings in Bioinformatics
Jilin University
Jilin Jianzhu University
Jilin Engineering Normal University
Wáng et al. (Fri,) studied this question.
synapsesocial.com/papers/6a0ea17cbe05d6e3efb60294 — DOI: https://doi.org/10.1093/bib/bbag241