Extensive efforts have been devoted to determining the binding specificity of Src homology 3 (SH3) domains usually in a case-by-case manner. A generic structure-based model is necessary to decipher the protein recognition code of the entire domain family. In this study, we have developed a general framework that combines molecular modeling and a machine learning algorithm to capture the energetic characteristics of the domain-peptide interactions and predict the binding specificity of the SH3 domain family. Our model is not trained for individual SH3 domains; rather it is a generic model for the entire domain family. Our model not only achieved satisfactory prediction accuracy but also provided structural insights into which residues are important for the binding specificity. The success of our framework on SH3 domains suggests that it is possible to establish a theoretical model to decipher the protein recognition code of any modular domain.
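Prediction accuracy for a binder/non-binder classifier of this kind is commonly summarized by sensitivity (SE) and specificity (SP), computed from the counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The Python sketch below uses the standard textbook definitions, SE = TP / (TP + FN) and SP = TN / (TN + FP). The function names, the label encoding (1 = binder, 0 = non-binder), and the toy labels are illustrative assumptions, not the authors' code or data.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels (assumed: 1 = binder, 0 = non-binder)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred):
    """Return (SE, SP); a ratio is NaN when its denominator is zero."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    se = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return se, sp


# Hypothetical labels for eight peptides tested against one domain.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

se, sp = sensitivity_specificity(y_true, y_pred)
print(f"SE = {se:.2f}, SP = {sp:.2f}")  # SE = 0.75, SP = 0.75
```

Reporting both numbers matters when non-binders greatly outnumber binders, because overall accuracy alone can then look high even if few true binders are recovered.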
The abbreviations used are: SH3, Src homology 3; SVM, support vector machine; MIEC, molecular interaction energy component; MM/GBSA, molecular mechanics/generalized Born solvent area; MD, molecular dynamics; GB, generalized Born; TP, true positive; FP, false positive; TN, true negative; FN, false negative; MM/PBSA, molecular mechanics-Poisson-Boltzmann solvent area; PB, Poisson-Boltzmann; SASA, solvent-accessible surface area; RBF, radial basis function; SE, sensitivity; SP, specificity.
Protein-protein interactions play a central role in the cell and are often mediated by weak and transient interactions between peptides and modular domains (1 Kay B.K. Williamson M.P. Sudol P. The importance of being proline: the interaction of proline-rich motifs in signaling proteins with their cognate domains. FASEB J. 2000; 14: 231-241, 2 Pawson T. Nash P. Assembly of cell regulatory systems through protein interaction domains. Science. 2003; 300: 445-452, 3 Castagnoli L. Costantini A. Dall’armi C. Gonfloni S. Montecchi-Palazzi L. Panni S. Paoluzi S. Santonico E. Cesareni G. Selectivity and promiscuity in the interaction network mediated by protein recognition modules. FEBS Lett. 2004; 567: 74-79). The most abundant peptide recognition domain in the human proteome is the Src homology 3 (SH3) domain (4 Mayer B.J. SH3 domains: complexity in moderation. J. Cell Sci. 2001; 114: 1253-1263), which recognizes proline-rich peptides with a core motif of PXXP (P is a proline and X is any amino acid) (5 Ren R.B. Mayer B.J. Cicchetti P. Baltimore D. Identification of a 10-amino acid proline-rich SH3 binding site. Science. 1993; 259: 1157-1161, 6 Lim W.A. Richards F.M. Fox R.O. Structural determinants of peptide-binding orientation and of sequence specificity in SH3 domains. Nature. 1994; 372: 375-379). Peptides can bind to SH3 domains in two opposite orientations and are referred to as class I and class II peptides, which often contain +XXPXXP and PXXPX+ motifs, respectively (where X refers to any residue and + refers to a positively charged residue). The binding specificity of an SH3 domain is determined by the amino acids in the flanking regions of the core motif, which have been investigated extensively for individual domains. However, a universal model to decipher the protein recognition code of the SH3 domain family has been lacking. A generic model for the entire domain family needs to 1) provide a general framework to characterize the domain-peptide interaction and 2) reliably predict the binding specificity of each member of the domain family. Previous experimental and computational studies satisfy only one of these requirements. For example, peptide library and peptide or protein array technologies are commonly used to determine the peptide motifs recognized by a domain, often represented as a position-specific scoring matrix (7 Pisabarro M.T. Serrano L. Rational design of specific high-affinity peptide ligands for the Abl-SH3 domain. Biochemistry. 1996; 35: 10634-10640, 8 Rickles R.J. Botfield M.C. Weng Z.G. Taylor J.A. Green O.M. Brugge J.S. Zoller M.J. Identification of Src, Fyn, Lyn, PI3K and Abl SH3 domain ligands using phage display libraries. EMBO J.
1994; 13: 5598-5604Crossref PubMed Scopus (223) Google Scholar, 9Rickles R.J. Botfield M.C. Zhou X.M. Henry P.A. Brugge J.S. Zoller M.J. Phage display selection of ligand residues important for Src homology 3 domain binding specificity.Proc. Natl. Acad. Sci. U. S. A. 1995; 92: 10909-10913Crossref PubMed Scopus (175) Google Scholar, 10Sparks A.B. Rider J.E. Hoffman N.G. Fowlkes D.M. Quilliam L.A. Kay B.K. Distinct ligand preferences of Src homology 3 domains from Src, Yes, Abl, Cortactin, p53bp2, PLCγ, Crk, and Grb2.Proc. Natl. Acad. Sci. U. S. A. 1996; 93: 1540-1544Crossref PubMed Scopus (331) Google Scholar, 11Landgraf C. Panni S. Montecchi-Palazzi L. Castagnoli L. Schneider-Mergener J. Volkmer-Engert R. Cesareni G. Protein interaction networks by proteome peptide scanning.PLOS Biol. 2004; 2: 94-103Crossref Scopus (183) Google Scholar, 12Tong A.H.Y. Drees B. Nardelli G. Bader G.D. Brannetti B. Castagnoli L. Evangelista M. Ferracuti S. Nelson B. Paoluzi S. Quondam M. Zucconi A. Hogue C.W.V. Fields S. Boone C. Cesareni G. A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules.Science. 2002; 295: 321-324Crossref PubMed Scopus (581) Google Scholar, 13Stiffler M.A. Chen J.R. Grantcharova D. J.E. L.A. G. domain binding is the PubMed Scopus Google Scholar). have of the peptide the peptides in the usually only a of the possible peptides of a In the prediction of a sequence motif on of a domain is often that a of interaction L. C. J. The of recognition Biol. PubMed Scopus Google Scholar) also that a rather a of is to decipher the specificity of protein the as and by have been used to However, these often the weak and transient domain-peptide interactions T. T. M. for the in of the protein 2002; PubMed Scopus Google Scholar). computational have also been developed to predict the of modular domains prediction of cell signaling interactions using sequence 2003; PubMed Scopus Google Scholar, B. A. G. 
Cesareni G. an algorithm to predict ligands to of the SH3 Biol. 2000; PubMed Scopus Google Scholar, E. A. G. M. A structure-based for to the of SH3 domain PubMed Scopus Google Scholar, L. C. machine learning to protein for protein binding peptide PubMed Scopus Google Scholar, D. A model for the prediction of PubMed Scopus Google Scholar). For example, the a position-specific matrix on the in a of of and The matrix is used to the of a peptide binding to a specific SH3 domain. machine learning as network and support vector machine have been to predict binding peptides of SH3 domains on the in these usually a of interaction for SH3 domains the of possible of is In structural in the matrix is the of is not and the of are only by the amino acids into molecular modeling have been developed to the structural in a and the domain-peptide interaction on J.R. interaction of Biol. 2001; PubMed Scopus Google Scholar, Chen W.A. and prediction of the binding motif and protein of the Abl SH3 Biol. 2: Scopus Google Scholar, W.A. of binding of peptide recognition domains: an on and Biol. PubMed Scopus Google Scholar). structure-based usually not a of binding to the but the of the and the accuracy of the energy are for the success of these we have an that combines molecular modeling and to a model for the specificity of protein energy is the determining for an amino acid is a we used molecular interaction energy and between domain-peptide and residue to characterize the interaction of domain-peptide interaction a on the SH3 Biol. PubMed Scopus Google Scholar, J. of the using molecular interaction energy PubMed Scopus Google Scholar). 
each domain-peptide was from a by and this was using molecular the for and using molecular mechanics/generalized Born solvent The into a matrix that the energetic characteristics of the binding an was trained on the matrix to peptides into a or In the study, we this to predict the binding of SH3 domains that class I and experimental that our can establish a generic model of the protein recognition code of the SH3 domain not only the individual domain have SH3 domains that bind to class I peptides, Abl, Fyn, Lyn, Yes, and binding peptides for these SH3 domains in the R.J. Botfield M.C. Weng Z.G. Taylor J.A. Green O.M. Brugge J.S. Zoller M.J. Identification of Src, Fyn, Lyn, PI3K and Abl SH3 domain ligands using phage display libraries.EMBO J. 1994; 13: 5598-5604Crossref PubMed Scopus (223) Google Scholar, 9Rickles R.J. Botfield M.C. Zhou X.M. Henry P.A. Brugge J.S. Zoller M.J. Phage display selection of ligand residues important for Src homology 3 domain binding specificity.Proc. Natl. Acad. Sci. U. S. A. 1995; 92: 10909-10913Crossref PubMed Scopus (175) Google Scholar, 10Sparks A.B. Rider J.E. Hoffman N.G. Fowlkes D.M. Quilliam L.A. Kay B.K. Distinct ligand preferences of Src homology 3 domains from Src, Yes, Abl, Cortactin, p53bp2, PLCγ, Crk, and Grb2.Proc. Natl. Acad. Sci. U. S. A. 1996; 93: 1540-1544Crossref PubMed Scopus (331) Google Scholar, 11Landgraf C. Panni S. Montecchi-Palazzi L. Castagnoli L. Schneider-Mergener J. Volkmer-Engert R. Cesareni G. Protein interaction networks by proteome peptide scanning.PLOS Biol. 2004; 2: 94-103Crossref Scopus (183) Google Scholar, 12Tong A.H.Y. Drees B. Nardelli G. Bader G.D. Brannetti B. Castagnoli L. Evangelista M. Ferracuti S. Nelson B. Paoluzi S. Quondam M. Zucconi A. Hogue C.W.V. Fields S. Boone C. Cesareni G. A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules.Science. 2002; 295: 321-324Crossref PubMed Scopus (581) Google Scholar). 
of the peptides residues and these residues referred as to from the to a binding peptide only for the Abl SH3 domain, we peptides by amino acids to it residues that the residues not the binding specificity of these not the binding peptides in the that residues on the only a of PXXP peptides are true of a specific SH3 domain G. Panni S. Nardelli G. Castagnoli L. we peptide recognition specificity mediated by SH3 Lett. 2002; PubMed Scopus Google Scholar). the of in the we to the of to of the and SH3 domains and in the C. Panni S. Montecchi-Palazzi L. Castagnoli L. Schneider-Mergener J. Volkmer-Engert R. Cesareni G. Protein interaction networks by proteome peptide scanning.PLOS Biol. 2004; 2: 94-103Crossref Scopus (183) Google Scholar). For the SH3 we peptides that the PXXP motif as from the human proteome in the In and in the A is that the peptides from as true a of peptides that bind to a specific SH3 domain. we this study, only the Abl code M.T. Serrano L. M. of the Abl-SH3 domain with a high-affinity peptide for Biol. PubMed Scopus Google Scholar) and code A. M. M. of SH3 domains with proline-rich Biol. 1994; PubMed Scopus Google Scholar) I peptide in the Protein for the SH3 domains. The Protein are in only the of class II peptides to the SH3 domains of M. C. L. of the SH3 domain with a peptide from and and Biol. PubMed Scopus Google J. C. T. A. R. J.S. P.A. S. 2003; PubMed Scopus Google and C. R.J. interactions the proline-rich core of two of Src homology 3 Natl. Acad. Sci. U. S. A. 1995; 92: PubMed Scopus Google Scholar). For J. C. T. A. R. J.S. P.A. S. 2003; PubMed Scopus Google J. C. T. A. R. J.S. P.A. S. 2003; PubMed Scopus Google A. U. B. of in a of the SH3 domain of by PubMed Scopus Google and J. C. T. A. R. J.S. P.A. S. 2003; PubMed Scopus Google Scholar) of their SH3 domains only the binding for the SH3 and we their from sequence of the SH3 domain using sequence with accuracy and 2004; PubMed Scopus Google Scholar). 
The the SH3 domains and the SH3 domains that used to the model of the SH3 domain family J. B. S. T. S. M. A. R. A. and PubMed Scopus Google Scholar) The M.A. A. R. A. protein modeling of and 2000; PubMed Scopus Google Scholar) in was used to a homology model for each of the SH3 domains on the sequence The was on sequence the SH3 domains of sequence with the and only two and sequence and with the each was in an of and using the in the in For the SH3 domains I peptide we the or SH3 domain to the of Abl SH3 domain code the structural of the we the peptide in to the peptide to the SH3 domain using B. the accuracy of prediction for Biol. 2001; PubMed Scopus Google Scholar). The by of by molecular The using the T. R. A. C. B. R.J. The PubMed Scopus Google Scholar) and the C. S. M.C. R. P. R. T. J. P. A for molecular of proteins on 2003; PubMed Scopus Google Scholar). The domain-peptide was in a that from any of the SH3 domain on a on the to the was used to the interactions T. D. L. for in 1993; Scopus Google Scholar). The was used to G. of of of a with of Scopus Google and the was In the was from to the The was for and The of the was by of and the was used as the for modeling peptides in the with the SH3 domain. The peptide was to sequence using B. the accuracy of prediction for Biol. 2001; PubMed Scopus Google Scholar). was to that being that most of the as by the peptide was to the II and important between the domain and peptide In and the using the in and not of the of peptides we only each using the in T. R. A. C. B. R.J. The PubMed Scopus Google Scholar) and the C. S. M.C. R. P. R. T. J. P. A for molecular of proteins on 2003; PubMed Scopus Google Scholar). The solvent was using the generalized Born model 2) in G.D. of of on of from a 1996; Scopus Google Scholar). 
The of was to and the for the of the of the energy was The with the and the of the with the For each the was used to we the residues that of the binding peptide in any of the domain-peptide and as residues important for of the of the SH3 domain binding it is possible that residues important for one SH3 domain not important for SH3 domain this in SH3 domain. a generic model for we a of important from SH3 domains SH3 in a to interactions with the The of these residues was the of the SH3 domain, and these residues the entire peptide-binding surface The most with the PXXP motif, and the most in the the of the SH3 domains and important between the peptide residues and the important SH3 residues The important of the Abl SH3 domain are in as an SH3 domain contain in the sequence and we these for the binding specificity of the domain. the between the peptide residues and the in the SH3 domain to The for each using the in and for generic Sci. 2004; PubMed Scopus Google Scholar). The interaction interaction and to energy The for and was to A of was used to The used in the from the and from of acids with a generalized 2000; Scopus Google The of and in the to and respectively. In we also the for the residue between the residues of the peptides the of the For each + used for the The interaction between an SH3 domain and a peptide was represented by an vector The of X on which in the For example, only was the of X was the of X was The matrix was and used to the A. of Scholar, in Scholar) The of the was for a or for a The was used to the J. The entire was into with used for and the was used for was to the of the For each SVM, true false true and false of the The was by the of the + specificity + prediction accuracy for + prediction accuracy for + and 1) the of and a was to the class the for on the the binding energy for each peptide was using the and P.A. C. B. L. M. T. P. J. 
and of molecular and 2000; PubMed Scopus Google Scholar, in energy with a of molecular and 2: Scopus Google Scholar, P.A. in of and acid 2001; PubMed Scopus Google 2) is the of molecular energy peptide binding that and and are the and of the and is the of peptide which was not in this of the computational was using the in In was using the in to the The for the was In was using the model with the developed by and of acids with a generalized 2000; Scopus Google Scholar). The of the and to and respectively. was on the solvent-accessible surface as Abl SH3 domain SH3 in was as a protein in by in A a of was with the for The was with and SH3 was in proteins and Protein was determined using the The of the protein was by and The protein was also to by using a and the Peptides on an as using a the A was between the of the peptide and the The peptide with the peptide with SH3 a of in for with the was to a of in for by for with the developed using the a the peptide array was with the a generic model that the energetic of we the for the residue from the domain-peptide and the of SH3 domains the and the energetic of domain-peptide trained on to peptides into a or the of with trained only on of domain-peptide residue that and the two we only on using and we for the of The in domain-peptide and peptide residue and prediction accuracy was by the of and specificity model was used in the of the a this model the that only domain-peptide interactions in I and the in The between the peptide residues the preferences of the binding peptides was important in the binding specificity of we to that the we used to the was in our for in I and a was to the class the of the class of domain-peptide interaction a on the SH3 Biol. PubMed Scopus Google Scholar). The of on was investigated that was a for and prediction the model with was used of the on using the and the and are and to respectively. + and peptide and are and to respectively. + in a M.A. Chen J.R. Grantcharova D. J.E. L.A. G. 
domain binding is the PubMed Scopus Google Scholar) the binding specificity of domains in the using a protein on the and of a model position-specific matrix and the binding specificity of the domains. which was to the we used achieved and specificity of and with our of and respectively. was which is to using the model with and using the model with in our is that the to in their was which was the of used in our and in false to the of we used the to of the using the model with and false for of and and + and using the model with and false for of and and + Our is to establish a model to characterize the interaction specificity between SH3 domains and their binding the of our we an model was trained using the interaction of domains and the domain was used for interaction of the domain was used in the this was a and the II that the specificity for the domains was the was the specificity a satisfactory of prediction accuracy the to the that our prediction was a was satisfactory it for it was that the and the specificity of our model by M.A. Chen J.R. Grantcharova D. J.E. L.A. G. domain binding is the PubMed Scopus Google using the model in I in a the of our model with the energy and investigated the energy and peptides into a or we the binding energy for each peptide using a the as we used in the that the and of binding for most SH3 domains and between the two SH3 and it is to that between the two in most we trained an on the binding by the using the in the in the and specificity for the SH3 domains that of and respectively. 
However, false and was with the the and between and of protein and peptide of these for example, or of the by using a for the entire the as an to residue and that are most for as the of the interaction is by the and the is to and as in our is to a prediction accuracy or that in the or the was not it was to from the only from a which was not an example, the binding for the peptides of the Abl SH3 domain using the in the in The the or the was for in the of a of is the between two the of the of the vector was The in the binding and a used to the between the and a of which was a that only on the binding the for the binding specificity of SH3 domains B. A. G. Cesareni G. an algorithm to predict ligands to of the SH3 Biol. 2000; PubMed Scopus Google Scholar, E. A. G. M. A structure-based for to the of SH3 domain PubMed Scopus Google Scholar, L. C. machine learning to protein for protein binding peptide PubMed Scopus Google Scholar) and of are B. A. G. Cesareni G. an algorithm to predict ligands to of the SH3 Biol. 2000; PubMed Scopus Google Scholar, E. A. G. M. A structure-based for to the of SH3 domain PubMed Scopus Google Scholar). A.B. Rider J.E. Hoffman N.G. Fowlkes D.M. Quilliam L.A. Kay B.K. Distinct ligand preferences of Src homology 3 domains from Src, Yes, Abl, Cortactin, p53bp2, PLCγ, Crk, and Grb2.Proc. Natl. Acad. Sci. U. S. A. 1996; 93: 1540-1544Crossref PubMed Scopus (331) Google Scholar) interactions between peptides and SH3 domains which Src, Yes, Abl, and in our the of and our model on the interaction between the peptides and the SH3 domains. is not to these peptides in the of have a of our we these peptides from the and on domains. model was used to predict the binding specificity between the peptides and the SH3 domains The model achieved an accuracy of of the our and and respectively. the our and and respectively. in this peptide array to the of the we and the SH3 protein in E. 
The protein by an in not we the SH3 protein to an array peptides in The peptide array but one was as a by and Serrano (7Pisabarro M.T. Serrano L. Rational design of specific high-affinity peptide ligands for the Abl-SH3 domain.Biochemistry. 1996; 35: 10634-10640Crossref PubMed Scopus (110) Google but binding to the Abl SH3 domain was not also an array the peptides with the binding of the to the peptides was that the binding in was we the peptide array we the array the peptides and peptides the we peptides the motif that is recognized by the Abl SH3 domain. of the by our peptide in the the peptides as by their and the peptide was as a true by the model and our the peptides, the model trained from the SH3 domains peptides in the for the Abl SH3 domain, and and one of the by our peptide array we the peptide array as the the of the model on the peptides is as specificity prediction accuracy prediction accuracy and the peptide array that the prediction accuracy of our was to the we the of each in the protein or the peptide to the binding we a the that one protein or peptide from the and an was on the The of the was by the of the of the that of the peptides to the binding specificity. was the most important as by the of the the and the two in peptides, and of a the In SH3 to the most to the binding specificity and their are in and D. these of and interactions with the PXXP of the binding The and are in the regions of the SH3 domain and are in interactions with the residues of the binding and are not SH3 that important for the binding specificity. The binding specificity of modular domains has been extensively using experimental and computational For example, in a of the interaction between domains and peptides using a protein M.A. Chen J.R. Grantcharova D. J.E. L.A. G. 
domain binding is the PubMed Scopus Google Scholar) that the peptide recognized by domains not into rather are in the However, their on individual domains and not a model that the of the domain-peptide the of amino acids and their a peptide not the of peptides that bind to the domain family. a the binding specificity of a domain that was not in the not a general framework that can used to decipher the protein recognition code of the entire domain the energetic of the domain-peptide interaction with SVM, has a prediction for the binding specificity of SH3 domains. we only domains in this study, the that the model was to any SH3 domain. is is a and it not on amino acid as the to the binding energy is an amino acid is a peptide The of suggests that our a generic to In the in the of the SH3 domains and their with peptides, the peptide binding of SH3 domains to each the binding from SH3 domains into a structure-based prediction model can the prediction accuracy as as the of the with our has the between each individual peptide and domain is and that the is The between peptide residues also the of the the between residues is into by modeling and SVM, a the of the interaction it is to in modeling and energy with that peptides on binding the matrix used in as the matrix is a matrix the interactions between residue are represented by energy of amino acid For this matrix is and to or the In the satisfactory of our model in the of the in the and the between the prediction and the experimental that a to the recognition code of the SH3 domain family. Our the of to human and to the also experimental of the of Our a generic framework that can to or systems as this has the of the human that to and J. of the using molecular interaction energy PubMed Scopus Google Scholar). on the in the for the of are to J. and Taylor for peptide array as as the SH3 domain and on peptide array J. for to the and molecular
Tingjun Hou
Zhejiang Lab
Zheng Xu
Wei Zhang
Molecular & Cellular Proteomics
University of California, San Diego
Scripps Research Institute
Hou et al. (Thu,) studied this question.
synapsesocial.com/papers/6a1c9caf1b79c159c356ddae — DOI: https://doi.org/10.1074/mcp.m800450-mcp200