What question did this study set out to answer?

This research explores how the combination of domains within a protein influences that protein's function.

March 4, 2026 · Open Access

Vector Semantics of Multidomain Protein Architectures

Key Points

Used vector embeddings to model protein domain content.
Analyzed multidomain protein architectures for functional relationships.
Compared functional similarities based on domain content and contextual signals.
Semantically similar multidomain architectures share more functional attributes.
Identified high functional similarity in architecture pairs with no common domains.
Demonstrated the importance of context in understanding protein function evolution.

Abstract

Multidomain proteins are mosaics of domains, protein modules that are associated with a specific structure or function and are found in diverse combinations. This modular organization facilitates the evolution of novel protein functions, but the principles that govern the relationship between the domain content of a protein and its function are poorly understood. In particular, do domains always contribute the same function, or does the functional contribution of a domain depend on the neighboring domains in the protein? To answer this question, we used vector embeddings, which account for local contextual signals, to model the protein domain content of multidomain proteins. We observe that multidomain architectures that are semantically similar share more functional attributes than multidomain architectures selected based on domain content similarity alone, suggesting that context is important for understanding the relationship between domain content and protein function. Surprisingly, vector semantics also identified multidomain architecture pairs with significantly high functional similarity despite having no domains in common at all, suggesting that vector semantics may be discovering domain “synonyms”. Taken together, our results underscore the importance of contextual models for understanding the interplay between domain architecture evolution and functional innovation in multidomain proteins.
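The contrast the abstract draws between domain content similarity and context-aware similarity can be illustrated with a minimal sketch. This is not the authors' pipeline: it swaps in plain neighbor co-occurrence counts for a trained embedding model, and the domain names (`N1`, `A`, `C1`, and so on), the window size, and all function names are hypothetical choices made only for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(architectures, window=1):
    """Count, for each domain, which domains occur within `window` positions of it."""
    vecs = defaultdict(Counter)
    for arch in architectures:
        for i, dom in enumerate(arch):
            lo, hi = max(0, i - window), min(len(arch), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vecs[dom][arch[j]] += 1
    return vecs


def architecture_vector(arch, vecs):
    """Represent an architecture as the sum of its domains' context vectors."""
    total = Counter()
    for dom in arch:
        total.update(vecs[dom])
    return total


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(a, b):
    """Domain content similarity: shared domains over all domains, ignoring order."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Toy corpus (hypothetical): A and B appear in the same context, as do E and F.
corpus = [
    ["N1", "A", "C1"],
    ["N1", "B", "C1"],
    ["N2", "E", "C2"],
    ["N2", "F", "C2"],
]
vecs = context_vectors(corpus)

# Two query architectures that share no domains.
arch_1, arch_2 = ["A", "E"], ["B", "F"]
sim_content = jaccard(arch_1, arch_2)
sim_context = cosine(architecture_vector(arch_1, vecs),
                     architecture_vector(arch_2, vecs))
print(f"content similarity: {sim_content:.2f}, context similarity: {sim_context:.2f}")
```

In this toy setting the two query architectures have no domains in common, so their content similarity is zero, yet their context-based vectors coincide because each domain in one architecture occurs in the same neighborhoods as a counterpart in the other. That is the kind of "synonym" relationship the abstract describes, though a real analysis would use learned embeddings and functional annotations rather than raw counts.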

Cite This Study

Cui et al. studied this question.

synapsesocial.com/papers/69a7cd8cd48f933b5eeda044 https://doi.org/10.1093/bioadv/vbag037
