Zinc ions serve dual roles in proteins: as catalytic cofactors and as structural elements. Distinguishing these functional classes from sequence alone remains challenging because both share similar coordination geometries. Here, we demonstrate that ESM-2 embeddings encode sufficient information to classify catalytic versus structural zinc sites with high accuracy. On 73 sequence-diverse zinc proteins, machine learning classifiers achieve ROC-AUC values of 0.93 to 0.97, significantly outperforming a motif-based baseline (AUC = 0.759; p = 0.015). Attention analysis reveals that histidine ligands in catalytic sites attend 9.2-fold more strongly to second-shell carboxylate residues (the proton-shuttling machinery essential for catalysis) than to random positions, providing mechanistic interpretability. These findings suggest that evolutionary sequence patterns encode the extended hydrogen-bonding networks that distinguish catalytic from structural sites. This sequence-only approach complements structure-based methods for large-scale metalloproteome annotation.
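To make the fold-enrichment figure concrete, the following is a minimal sketch of how such a ratio could be computed from a single attention map. It is not the authors' procedure: the function name `attention_fold_enrichment`, the toy matrix, and the choice of background (every position other than the ligand itself and the target residues, standing in for "random positions") are assumptions made for illustration.

```python
from statistics import mean


def attention_fold_enrichment(attn, ligand_idx, target_idx):
    """Ratio of mean ligand-to-target attention to mean ligand-to-background attention.

    attn       : square list of lists; attn[i][j] is the attention from position i to j
    ligand_idx : positions of the zinc-binding ligands (e.g. histidines)
    target_idx : positions of the residues of interest (e.g. second-shell carboxylates)

    Assumption: the background is every position that is neither the ligand
    itself nor a target residue. The abstract only says "random positions".
    """
    targets = set(target_idx)
    target_vals = []
    background_vals = []
    for h in ligand_idx:
        for j, weight in enumerate(attn[h]):
            if j == h:
                continue  # skip self-attention
            if j in targets:
                target_vals.append(weight)
            else:
                background_vals.append(weight)
    if not target_vals or not background_vals:
        raise ValueError("need at least one target and one background position")
    background_mean = mean(background_vals)
    if background_mean == 0:
        raise ValueError("background attention is zero; ratio is undefined")
    return mean(target_vals) / background_mean


# Toy 4-residue attention map (hypothetical values, rows sum to 1).
toy_attn = [
    [0.10, 0.60, 0.10, 0.20],
    [0.25, 0.25, 0.25, 0.25],
    [0.25, 0.25, 0.25, 0.25],
    [0.25, 0.25, 0.25, 0.25],
]
fold = attention_fold_enrichment(toy_attn, ligand_idx=[0], target_idx=[1])
print(f"fold enrichment: {fold:.2f}")
```

In this toy case the ligand at position 0 places 0.60 of its attention on the target and an average of 0.15 on the two background positions, so the ratio is about 4. A real analysis would also need to choose which layers and heads to aggregate over, which the abstract does not specify.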
Karen Sargsyan
Institute of Sociology, Academia Sinica
Journal of Chemical Information and Modeling
Institute of Chemistry, Academia Sinica
Karen Sargsyan (Fri,) studied this question.
synapsesocial.com/papers/69a3d8a7ec16d51705d2fb01 — DOI: https://doi.org/10.1021/acs.jcim.5c03142