The use of generative machine learning models, trained on the experimentally resolved structures deposited in the protein data bank, is an attractive approach to sampling conformational ensembles of proteins. However, the ensembles generated by these models lack time scale or causal information. We use the structural ensembles generated from AlphaFold2 at a range of MSA depths to parametrize the potential of mean force of an overdamped, memory-free, coarse-grained Langevin equation. This approach couples the AlphaFold2 ensembles to a causal model, allowing us to estimate the time scales spanned by the ensembles generated at each MSA depth. Performing this analysis on six variants of HIV-1 protease, we confirm an inverse relationship between MSA depth and the time scale of an ensemble's conformational fluctuations. The MSA depth essentially serves as a conformational restraint, and AlphaFold2 is generally able to probe time scales at or below those seen in microsecond-long, unbiased molecular dynamics simulations. We conclude by generalizing this approach to other generative structural ensemble-prediction methods as well as cofolding models, in this case, the biologically functional HIV-1 protease dimer.
Aranganathan et al. (Tue,) studied this question.