What question did this study set out to answer?

This study asked whether baseline machine learning models can classify peptide sequences as antimicrobial or non-antimicrobial using k-mer sequence features alone.

February 14, 2026 | Open Access

Machine Learning Models for Antimicrobial Peptide Classification Using k-mer Sequence Representations

Key Points

The study assessed how well baseline machine learning models classify peptide sequences as antimicrobial or non-antimicrobial.
Peptide sequences were converted into numerical features using a k-mer bag-of-words representation.
Logistic regression and random forest classifiers were trained on these features.
Model performance was evaluated on a held-out test set using confusion matrix analysis and ROC curves.
Both models performed close to random guessing, with accuracy near 50%.
ROC area under the curve (AUC) values were close to 0.5.
These results indicate that k-mer sequence features alone are inadequate for reliable classification of antimicrobial peptides.
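
As an illustration of the k-mer bag-of-words representation named in the key points, the sketch below counts overlapping k-mers in each peptide and places the counts in a fixed-length vector. It is a minimal example under stated assumptions, not the study's code: the value k=2, the 20-letter standard amino acid alphabet, the function names, and the two example sequences are all hypothetical choices made here for illustration.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids are used as the alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_counts(sequence, k=2):
    """Count overlapping k-mers in a single peptide sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def build_vocabulary(k=2):
    """Map every possible k-mer over the alphabet to a column index."""
    kmers = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return {kmer: idx for idx, kmer in enumerate(kmers)}


def featurize(sequences, k=2):
    """Turn peptide sequences into fixed-length k-mer count vectors."""
    vocab = build_vocabulary(k)
    rows = []
    for seq in sequences:
        row = [0] * len(vocab)
        for kmer, count in kmer_counts(seq, k).items():
            if kmer in vocab:  # skip k-mers containing non-standard residues
                row[vocab[kmer]] = count
        rows.append(row)
    return rows, vocab


# Toy sequences for illustration only; they are not data from the study.
rows, vocab = featurize(["KKLL", "GIGK"], k=2)
```

With k=2 each peptide becomes a vector of 20^2 = 400 counts, most of them zero for short sequences. The vectors keep no information about where in the peptide a k-mer occurs.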

Abstract

Antimicrobial peptides (AMPs) are short amino acid sequences that play a critical role in immune defenses and have gained attention as potential alternatives to traditional antibiotics. Due to the difficulty of experimentally identifying new AMPs, machine learning approaches have been explored as a method for predicting antimicrobial activity from peptide sequences. In this study, baseline machine learning models were applied to classify peptide sequences as antimicrobial or non-antimicrobial using simple sequence-based feature representations. Peptide sequences were converted into numerical features using a k-mer bag-of-words approach and used to train logistic regression and random forest classifiers. Model performance was evaluated on a held-out test set using confusion matrix analysis and receiver operating characteristic (ROC) curves. Both models demonstrated performance close to random guessing, with accuracy values near 50% and ROC area under the curve values close to 0.5. These results indicate that baseline machine learning models using k-mer sequence features alone are insufficient for reliably predicting antimicrobial peptides. This study highlights the need for more advanced feature representations and modeling approaches to improve predictive performance in antimicrobial peptide classification.
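
To make the reported metrics concrete, the sketch below computes a confusion matrix, accuracy, and ROC AUC for a binary classifier in plain Python. It is an illustrative sketch, not the study's evaluation code: the function names are hypothetical, labels are assumed to be 1 for antimicrobial and 0 for non-antimicrobial, and the toy labels and scores are invented for the example. AUC is computed here as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half, which is equivalent to the area under the ROC curve.

```python
def confusion_matrix(y_true, y_pred):
    """Return counts of true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }


def accuracy(cm):
    """Fraction of predictions that match the true label."""
    total = cm["tp"] + cm["fp"] + cm["fn"] + cm["tn"]
    return (cm["tp"] + cm["tn"]) / total


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and predictions for illustration only; not results from the study.
y_true = [1, 0, 1, 0]
y_pred = [1, 1, 0, 0]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
```

On a balanced test set, a classifier whose scores carry no information about the label is expected to score about 50% accuracy and an AUC of about 0.5, which is the pattern the abstract describes as close to random guessing.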
