Many protein–protein interactions (PPIs) are mediated by the binding of short linear motifs (SLiMs) to peptide recognition domains (PRDs). Here, we describe PrePPI-SLiM, a proteome-scale computational pipeline that leverages data from the Eukaryotic Linear Motif (ELM) database to predict whether two proteins will form a peptide-mediated complex. The ELM database defines classes of protein–peptide interactions, with SLiMs represented by sequence motifs and PRDs represented by Pfam domains. PrePPI-SLiM systematically evaluates all pairwise combinations of proteins within a proteome and identifies PRD–SLiM pairs that occur in the same ELM class. This evidence is integrated with disorder prediction and sequence conservation of the motif in a naïve Bayes framework to assign a likelihood of complex formation. To obtain potential PDB templates for atomistic models of PrePPI-SLiM interactions, we associate individual PPI predictions with homologous PDB complexes involving the same PRD Pfam domain and SLiM, and obtain PDB templates for 92% of our high-confidence predictions. Moreover, studies with AF3Complex suggest that prior knowledge of the interacting PRD and SLiM, as provided here, is a critical starting point for creating a 3D model of the specific sequences of the PRD and SLiM query proteins. Finally, we demonstrate that clustering of the high-confidence PrePPI-SLiM interactome yields functionally coherent PPI networks that reveal mechanistic insights into cellular processes. The PrePPI webserver provides convenient access to high-confidence PrePPI-SLiM predictions, PDB templates for modeling, and functional networks.
Saha et al. studied this question.
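The core matching and scoring steps described in the abstract can be illustrated with a minimal Python sketch. This is not the PrePPI-SLiM implementation: the class and function names (`ElmClass`, `Protein`, `find_prd_slim_matches`, `naive_bayes_lr`), the toy motif pattern, the sequences, and the likelihood-ratio values are all hypothetical and chosen only for illustration. The sketch assumes that each ELM class can be reduced to a regular expression plus a set of Pfam accessions, and that the naïve Bayes combination amounts to multiplying per-evidence likelihood ratios under an independence assumption.

```python
import re
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class ElmClass:
    """Toy stand-in for an ELM class: a motif regex plus the PRD Pfam IDs."""
    name: str
    motif_regex: str
    prd_pfam_ids: frozenset


@dataclass(frozen=True)
class Protein:
    """Toy protein record: sequence plus annotated Pfam domain accessions."""
    name: str
    sequence: str
    pfam_ids: frozenset


def find_prd_slim_matches(prd_protein, slim_protein, elm_classes):
    """Return (class name, start, end) for each ELM class whose PRD is
    annotated on prd_protein and whose motif regex matches slim_protein."""
    matches = []
    for elm in elm_classes:
        # Skip classes whose recognition domain is absent from the first protein.
        if not (prd_protein.pfam_ids & elm.prd_pfam_ids):
            continue
        # re.finditer reports non-overlapping matches only.
        for m in re.finditer(elm.motif_regex, slim_protein.sequence):
            matches.append((elm.name, m.start(), m.end()))
    return matches


def naive_bayes_lr(likelihood_ratios):
    """Combine per-evidence likelihood ratios by multiplication, which is
    what a naive Bayes model does when evidence is treated as independent."""
    return prod(likelihood_ratios)


# Hypothetical inputs: a simplified proline-rich pattern and made-up sequences.
toy_class = ElmClass("TOY_SH3_LIGAND", r"P..P", frozenset({"PF00018"}))
domain_protein = Protein("protA", "MKTAYIAKQR", frozenset({"PF00018"}))
motif_protein = Protein("protB", "MSAPPLPPRKQ", frozenset())

hits = find_prd_slim_matches(domain_protein, motif_protein, [toy_class])

# Made-up likelihood ratios for ELM-class match, disorder, and conservation.
combined_lr = naive_bayes_lr([4.0, 2.0, 1.5])
```

In a proteome-scale setting, each protein pair would presumably be tested in both orientations (either protein may carry the PRD), and the per-evidence likelihood ratios would be estimated from reference sets of known interactions rather than fixed by hand.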