This work presented the first systematic investigation of data leakage and generalization in mRNA-protein interaction prediction, demonstrating that most reported near-perfect performance is driven by RBP overlap between training and test sets. By introducing an RBP-aware evaluation framework and a benchmark dataset, we revealed that most sequence-based models fail to generalize to unseen RBPs, even when enhanced with protein language model-derived and structure-aware encodings. Our study established a more rigorous evaluation standard for the task and highlighted the critical need for greater protein diversity and beyond-sequence features to make such predictions reliable.
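To illustrate what an RBP-aware evaluation means in practice, the following is a minimal sketch of a split in which no RBP appears in both the training and the test set. It is not the framework or benchmark construction used in this work. The record layout `(mrna_id, rbp_id, label)`, the function names `rbp_aware_split` and `rbp_overlap`, and the toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def rbp_aware_split(records, test_fraction=0.2, seed=0):
    """Split (mrna_id, rbp_id, label) records so that no RBP is shared
    between the training and test sets (hypothetical helper)."""
    # Group every interaction record under its RBP identifier.
    by_rbp = defaultdict(list)
    for record in records:
        by_rbp[record[1]].append(record)

    rbps = sorted(by_rbp)
    if len(rbps) < 2:
        raise ValueError("need at least two distinct RBPs to hold one out")

    # Shuffle RBPs (not individual pairs) so whole proteins are held out.
    random.Random(seed).shuffle(rbps)
    n_test = min(max(1, round(len(rbps) * test_fraction)), len(rbps) - 1)

    held_out = set(rbps[:n_test])
    test = [r for rbp in rbps[:n_test] for r in by_rbp[rbp]]
    train = [r for rbp in rbps[n_test:] for r in by_rbp[rbp]]
    return train, test, held_out


def rbp_overlap(train, test):
    """Return the set of RBP identifiers present in both splits."""
    return {r[1] for r in train} & {r[1] for r in test}


# Toy data (assumed): 20 pairs spread evenly over 5 RBPs.
records = [(f"mRNA{i}", f"RBP{i % 5}", i % 2) for i in range(20)]

train, test, held_out = rbp_aware_split(records, test_fraction=0.2, seed=0)
print("held-out RBPs:", sorted(held_out))
print("shared RBPs after RBP-aware split:", rbp_overlap(train, test))

# For contrast, a pair-level random split usually leaves RBPs shared.
shuffled = records[:]
random.Random(0).shuffle(shuffled)
print("shared RBPs after random split:", rbp_overlap(shuffled[4:], shuffled[:4]))
```

Shuffling at the level of RBPs rather than individual pairs is the design choice that matters here: a model scored on the resulting test set sees only proteins that were absent during training, which is the setting in which the leakage described above cannot inflate performance.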
Yu et al. (Tue,) studied this question.