Abstract: Machine learning (ML) is increasingly applied in cheminformatics to predict physico-chemical and biological properties from structural descriptors. In this work, eight ML algorithms (Linear Regression, Random Forest, XGBoost, LightGBM, CatBoost, K-Nearest Neighbours (KNN), Support Vector Regression and Gradient Boosting) were benchmarked for predicting binding free energies of compounds targeting transforming growth factor-β receptor I (TGFβR1). Among them, KNN achieved the best predictive accuracy (root mean square error, RMSE = 1.023; R = 0.704; mean absolute error = 0.773) and was employed for virtual screening of the Endophytic Microorganism Natural Product Database. Fourteen candidates with predicted binding free energies below −12 kcal mol⁻¹ were prioritized. Docking validation confirmed the reliability of the docking protocol (root mean square deviation 2 Å), and five predicted hits displayed favourable docking scores. Molecular dynamics (MD) simulations over 100 ns revealed stable protein–ligand complexes, with interactions involving LYS232, LEU340, VAL219 and ILE211. To improve energy prediction, steered MD (SMD) was applied, yielding a stronger correlation with experimental data (R = −0.7201) than docking (R = 0.5404). Nigrosporolide showed the greatest mechanical stability (ΔG_SMD = −8.125 kcal mol⁻¹, ⟨F_max⟩ = 852.78 pN). Toxicity profiling predicted median lethal dose (LD50) values from 340 to 9000 mg kg⁻¹, with variable organ-specific risks. Overall, integrating ML, docking, SMD and toxicity analysis enabled effective identification of candidate TGFβR1 inhibitors, highlighting nigrosporolide and trichocadinin E as promising leads for further study.
Xuan et al. (Wed,) studied this question.
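
The abstract ranks the eight regressors by RMSE, R and mean absolute error. As a reading aid, the following is a minimal sketch of how these three metrics can be computed from observed and predicted binding free energies. It is not the authors' code: the function name `regression_metrics` is hypothetical, the sample values are made up for illustration, and it assumes that R denotes the Pearson correlation coefficient, which the abstract does not state explicitly.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and Pearson R for two equal-length sequences.

    Hypothetical helper for illustration; assumes R is the Pearson
    correlation coefficient (the abstract does not say which R is meant).
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Root mean square error and mean absolute error of the predictions.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # Pearson correlation between observed and predicted values.
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        pearson_r = float("nan")  # undefined when either series is constant
    else:
        pearson_r = cov / math.sqrt(var_t * var_p)

    return {"rmse": rmse, "mae": mae, "r": pearson_r}


# Made-up values for illustration only (kcal/mol); not data from the study.
observed = [-9.0, -10.0, -11.0, -12.0]
predicted = [-9.5, -9.5, -11.5, -11.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

RMSE and mean absolute error are reported in the units of the target (here kcal mol⁻¹) and are better when lower, whereas R is unitless and measures only how well predictions track the observed ranking and trend. This is why a model comparison of this kind typically reports the error metrics and the correlation together.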