Introduction: Identifying post-translational modification (PTM) sites in Tripterygium wilfordii proteins is essential for understanding their pharmacological mechanisms. Traditional biological methods for this identification are time-consuming and labor-intensive, whereas recent advances in deep learning make it possible to develop computational approaches that are both time-efficient and cost-effective. To exploit the diverse information in protein sequence data, new computational methods are needed that can handle complex and imbalanced datasets. Method: We combined the DAAM attention mechanism with a convolutional neural network (CNN) to build TriPTMDAAMCNN, a PTM site prediction model capable of dynamically adjusting feature information. In this process, we unfolded the one-dimensional sequence information into images for fitting. Results: The model was applied to nine feature datasets and compared with advanced methods. On the BLOSUM features of Tripterygium wilfordii proteins, the proposed TriPTMDAAMCNN classification model outperformed other classical network models, achieving an accuracy of 84.65%, a Matthews correlation coefficient of 0.6646, and an F1 score of 0.7824. In the ablation experiments, the convolutional neural network extracted local features effectively, and the DAAM attention mechanism adapted well to the PTM site prediction task. In experiments on the effect of increasing the number of convolutional channels, several models exhibited a pattern of initial decrease followed by an increase, which we hypothesize is related to how protein sequence feature information flows through the network. Discussion: Our model achieved this goal by constructing multiple base models, and the overall approach could be divided into two parts. 
One part involved combining attention mechanisms with different functions with a CNN model, using the CNN's capacity to extract rich local information while employing the attention mechanisms to eliminate redundant information. Conclusion: TriPTMDAAMCNN highlights the importance of dynamically integrating local and global structural information when predicting PTM sites, and it supports multi-parameter tuning to assess the stability of the model.
Tang et al. (Wed,) studied this question.
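The three metrics reported in the Results (accuracy, Matthews correlation coefficient, and F1 score) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, F1, and MCC for binary confusion-matrix counts.

    Hypothetical helper for illustration; it uses the standard textbook
    definitions and returns 0.0 wherever a denominator would be zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Made-up counts for an imbalanced test set (50 positives, 150 negatives).
metrics = binary_metrics(tp=40, fp=10, tn=140, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts, accuracy (0.90) is noticeably higher than MCC (about 0.73), which illustrates why MCC and F1 are commonly reported alongside accuracy when positive and negative sites are imbalanced.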