Humans have 437 catalytically competent protein kinase domains with a typical kinase fold resembling Protein Kinase A. The active form of a kinase must satisfy requirements for binding ATP, magnesium, and substrate. From structural bioinformatics analysis of 248 crystal structures of 54 kinase-substrate complexes, we derived structural criteria for the active form of typical protein kinases. We include well-known requirements on the DFG motif of the activation loop (ActLoop) and the N-terminal domain salt bridge, but also on substrate-compatible states of the ActLoop N-terminal and C-terminal segments. With these criteria, only 123 of the 437 human catalytic protein kinases (cPKs) have active forms in the Protein Data Bank. Because the active forms are needed for understanding substrate specificity and mutational effects on catalytic activity in cancer and other diseases, we used AlphaFold2 to produce active models of all 437 human cPKs. This was accomplished with PDB templates that resemble substrate-bound structures, shallow sequence alignments of close paralogs/orthologs, and application of the active-kinase criteria to the output models. We selected models for each kinase based on (intramolecular) ActLoop ipSAE scores and show that the highest-scoring models tend to have the lowest RMSD to substrate-bound PDB structures. In a benchmark of 117 kinases, 92% have a highest-scoring AlphaFold2 model with backbone RMSD < 2.0 Å to their benchmark active structure. Models for all 437 cPKs are available at https://dunbrack.fccc.edu/kincore/activemodels. We believe they may be useful for interpreting mutation-induced constitutive activity and as templates for modeling substrate and inhibitor binding to the active state.
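The selection step described above (apply the active-kinase criteria, keep the highest ActLoop ipSAE model per kinase, then report the fraction of benchmark kinases under the 2.0 Å RMSD cutoff) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' pipeline: the `Model` record, its field names, the helper functions, and the toy values are all hypothetical, and the active-state check is assumed to have been computed elsewhere and stored as a boolean.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Model:
    """Hypothetical record for one AlphaFold2 output model."""
    kinase: str
    model_id: str
    is_active: bool            # assumed: result of the active-kinase criteria
    actloop_ipsae: float       # assumed: intramolecular ActLoop ipSAE score
    rmsd_to_benchmark: Optional[float] = None  # backbone RMSD in angstroms, if known


def select_best_models(models: Iterable[Model]) -> Dict[str, Model]:
    """Keep, per kinase, the active model with the highest ActLoop ipSAE."""
    best: Dict[str, Model] = {}
    for m in models:
        if not m.is_active:
            continue  # models failing the active criteria are never selected
        current = best.get(m.kinase)
        if current is None or m.actloop_ipsae > current.actloop_ipsae:
            best[m.kinase] = m
    return best


def fraction_within(selected: Dict[str, Model], threshold: float = 2.0) -> float:
    """Fraction of selected models with benchmark RMSD below the threshold."""
    rmsds = [m.rmsd_to_benchmark for m in selected.values()
             if m.rmsd_to_benchmark is not None]
    if not rmsds:
        raise ValueError("no selected model has a benchmark RMSD")
    return sum(r < threshold for r in rmsds) / len(rmsds)


# Toy data only; these numbers are invented for illustration.
TOY_MODELS = [
    Model("KIN_A", "a1", True, 0.62, 1.1),
    Model("KIN_A", "a2", True, 0.81, 0.7),
    Model("KIN_A", "a3", False, 0.90, 3.5),  # highest score, but not active
    Model("KIN_B", "b1", True, 0.55, 2.6),
    Model("KIN_B", "b2", False, 0.70, 4.0),
]

best = select_best_models(TOY_MODELS)
success_rate = fraction_within(best, threshold=2.0)
```

Filtering on the active criteria before ranking by score keeps a high-scoring but inactive conformation (such as `a3` in the toy data) from being chosen over a lower-scoring active one.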
Gizzio et al. (Mon,) studied this question.