One of the critical prerequisites for efficient PCR-based diagnosis in a molecular laboratory is accurate primer design. Conventional primer design relies on rule-based algorithms that report individual primer features but do not capture the complex interactions among those properties that determine primer functionality. Manual inspection of each feature is also difficult and time-consuming. Here, we present the Primer Assessment & Scoring Tool (PrimerAST), a pipeline with a machine learning model for automatically designing primers and evaluating their functional efficiency for PCR applications. The dataset (N = 316) was created by integrating experimentally designed primers (labelled as verified) with synthetically generated primer pairs (labelled as predicted to fail). A total of 16 features generated during primer design were used as engineered features. Four supervised machine learning models were examined with 10-fold cross-validation. After training and testing, the support vector machine and gradient boosting models showed the highest performance on six evaluation metrics on independent testing datasets, with mean area-under-the-curve (AUC) values ranging from 0.96 to 0.99 across folds. This suggests that PrimerAST robustly captures the interactions between primer features to predict the efficiency of a primer design. PrimerAST is available as an online tool (https://primerast.streamlit.app/) that can significantly enhance PCR primer design by providing machine-learning-guided evaluation of several design features.
Al-Mahrami et al. studied this question.
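The evaluation protocol described in the abstract (10-fold cross-validation scored by AUC) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the data are random synthetic values shaped to match only the reported sample size (316) and feature count (16), the balanced class split is a guess, and the nearest-centroid scorer is a simple stand-in, not the support vector machine or gradient boosting models evaluated for PrimerAST. The function names (`roc_auc`, `stratified_folds`, `cross_validated_auc`, `make_toy_data`) are hypothetical.

```python
import random
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=10, seed=0):
    """Deal the shuffled indices of each class round-robin into k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def centroid(rows):
    """Per-feature mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_validated_auc(X, y, k=10, seed=0):
    """Return one AUC per fold, using a nearest-centroid score as a stand-in model."""
    aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        # Fit on the training folds only, so the held-out fold stays unseen.
        c_pos = centroid([X[i] for i in train_idx if y[i] == 1])
        c_neg = centroid([X[i] for i in train_idx if y[i] == 0])
        # Higher score means closer to the positive ("verified") centroid.
        scores = [sq_dist(X[i], c_neg) - sq_dist(X[i], c_pos) for i in test_idx]
        aucs.append(roc_auc([y[i] for i in test_idx], scores))
    return aucs


def make_toy_data(n=316, n_features=16, seed=0):
    """Synthetic stand-in data: two Gaussian classes, alternating labels (assumed balance)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        shift = 0.5 if label == 1 else -0.5
        X.append([rng.gauss(shift, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    aucs = cross_validated_auc(X, y)
    print(f"mean AUC over {len(aucs)} folds: {mean(aucs):.3f}")
```

Stratifying the folds keeps both classes present in every held-out fold, which AUC requires. The numbers this sketch prints describe the toy data only and say nothing about PrimerAST's reported performance.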