What question did this study set out to answer?

To evaluate whether amino acid physicochemical features improve homolog detection in protein structure prediction.

February 14, 2026 | Open Access

Enhancing protein structure prediction: evaluating the role of amino acid physicochemical features in homology search

Key Points

Evaluation of whether amino acid physicochemical features improve homolog detection in protein structure prediction.
Development of DIAFold as a prefiltering strategy for homology search.
Implementation of DIAMOND in a single-pass setting.
Comparison of performance on alignment quality and computational speed.
Achieved a 5.91× speedup in homology searches.
Reduced false positives by up to 37.7×.
Produced smaller yet higher-quality multiple sequence alignments.
Preserved or improved structure prediction accuracy, especially in low-homology situations.

Abstract

Computational models like AlphaFold2 have achieved high accuracy in protein structure prediction, but their homology search step, which is key to generating multiple sequence alignments (MSAs), remains computationally expensive and prone to introducing alignment noise. We propose DIAFold, which incorporates amino acid physicochemical properties as a cost-free prefiltering strategy for homolog detection. Rather than running exhaustive high-sensitivity searches, DIAFold prioritizes biologically meaningful MSAs and runs DIAMOND in a fast, single-pass setting. This yields a 5.91× speedup and reduces false positives by up to 37.7×, while producing smaller yet higher-quality MSAs and preserving or improving structure prediction accuracy, particularly in low-homology regimes. These gains translate to higher TM-scores in full-chain and domain-level predictions with fewer computational resources, highlighting the benefits of integrating physicochemical knowledge early in protein structure prediction pipelines.
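
The abstract does not say which physicochemical features DIAFold uses or how the prefilter scores candidates, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes three illustrative features (mean Kyte-Doolittle hydropathy, mean charge, and aromatic fraction) and an arbitrary distance cutoff. The names `features`, `distance`, `prefilter`, and `max_distance` are hypothetical.

```python
from math import sqrt

# Kyte-Doolittle hydropathy scale (a standard published scale).
# Using it here is an assumption; the source does not name its features.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Simplified side-chain charges at neutral pH (histidine treated as neutral).
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}
AROMATIC = set("FWY")


def features(seq):
    """Return (mean hydropathy, mean charge, aromatic fraction) for a sequence."""
    # Ignore gaps and non-standard residue codes.
    residues = [aa for aa in seq.upper() if aa in HYDROPATHY]
    if not residues:
        raise ValueError("sequence has no standard amino acids")
    n = len(residues)
    return (
        sum(HYDROPATHY[aa] for aa in residues) / n,
        sum(CHARGE.get(aa, 0.0) for aa in residues) / n,
        sum(aa in AROMATIC for aa in residues) / n,
    )


def distance(a, b):
    """Euclidean distance between two feature vectors (features are not rescaled)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def prefilter(query, candidates, max_distance=1.0):
    """Keep candidates whose features lie within max_distance of the query's.

    The cutoff is a hypothetical parameter chosen for illustration only.
    """
    q = features(query)
    return [c for c in candidates if distance(q, features(c)) <= max_distance]


if __name__ == "__main__":
    query = "MKVLAAGIVGL"
    database = ["MKVLAAGIVAL", "DDEEKKRRDDEE"]
    # Only the hydrophobic, query-like sequence survives the prefilter.
    print(prefilter(query, database))
```

In a pipeline of this shape, only the sequences that survive the prefilter would be passed on to the alignment search, which is one way a cheap feature comparison could shrink the search space before DIAMOND runs. Because the three features are on different scales, hydropathy dominates the distance here; a real filter would likely normalize or weight them.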
