What question did this study set out to answer?

This study asks whether active learning can improve molecular generation so that it discovers molecules that outperform those in the training data.

February 12, 2026 · Open Access

Active learning enables generation of molecules that advance the known Pareto front

Key Points

The authors build a closed-loop molecule generation pipeline.
The pipeline is retrained iteratively on new quantum chemical simulation data.
Molecular generation is conditioned on thermodynamic stability data obtained during the iterative loop.
Generated molecules had properties extending up to 0.44 standard deviations beyond the original training range.
The closed-loop workflow achieved a 79% improvement in out-of-distribution molecule classification accuracy.
The proportion of stable, and hence potentially synthesizable, molecules was 3.5 times higher than that of the next-best model.
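
To make the "standard deviations beyond the original training range" figure concrete, here is a minimal sketch of one plausible way to compute such a number. The exact definition is not given in this summary, so the function name `sigma_beyond_range` and the choice of the population standard deviation of the training values are assumptions for illustration, not the authors' stated metric.

```python
import statistics


def sigma_beyond_range(train_values, new_value):
    """Distance of new_value outside [min, max] of the training values,
    in units of the training standard deviation (assumed definition)."""
    lo, hi = min(train_values), max(train_values)
    sd = statistics.pstdev(train_values)  # population standard deviation
    if new_value > hi:
        return (new_value - hi) / sd
    if new_value < lo:
        return (lo - new_value) / sd
    return 0.0  # inside the training range: no extrapolation


# Toy numbers only: training values 1 and 3 have standard deviation 1,
# so a generated value of 3.44 lies 0.44 standard deviations past the maximum.
train = [1.0, 3.0]
excess = sigma_beyond_range(train, 3.44)
```

Under this reading, a value of 0 means the generated molecule stays within the property range already seen in training, and any positive value means it extends past that range.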

Abstract

Although generative models hold promise for discovering molecules with optimized desired properties, they often fail to suggest synthesizable molecules that improve upon the properties of the structures represented in the training distribution. We find that this limitation arises not only from the molecule generation process itself, but also from the poor generalization capabilities of molecular property predictors. We address this challenge by creating a closed-loop molecule generation pipeline with iterative retraining on new quantum chemical simulation data. Compared against static, single-pass generative modeling approaches, only our closed-loop iterative workflow generates molecules with properties extending beyond the training distribution (up to 0.44 standard deviations beyond the original range) and achieves a 79% improvement in out-of-distribution molecule classification accuracy. Furthermore, by conditioning molecular generation on thermodynamic stability data obtained during the iterative loop, the proportion of stable and hence potentially synthesizable molecules generated is 3.5x higher than the next-best model.
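
The closed-loop idea in the abstract (generate, predict, simulate, retrain, repeat) can be illustrated with a toy sketch. Everything here is a hypothetical stand-in rather than the authors' implementation: a "molecule" is a single number, `oracle` plays the role of the quantum chemical simulation, a 1-nearest-neighbour lookup plays the role of the property predictor, and `generate` perturbs the best known candidates in place of a real generative model.

```python
import random


def oracle(x):
    # Hypothetical stand-in for an expensive quantum chemical simulation.
    return -(x - 3.0) ** 2


def fit_predictor(data):
    # Toy surrogate: predict the label of the nearest labelled point.
    def predict(x):
        nearest = min(data, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return predict


def generate(data, rng, n_candidates, n_parents=5, step=0.3):
    # Toy generator: perturb the best labelled candidates seen so far.
    parents = sorted(data, key=lambda pair: pair[1], reverse=True)[:n_parents]
    return [rng.choice(parents)[0] + rng.gauss(0.0, step)
            for _ in range(n_candidates)]


def closed_loop(initial_xs, rounds=10, n_candidates=50, k=5, seed=0):
    rng = random.Random(seed)
    data = [(x, oracle(x)) for x in initial_xs]  # initial training set
    for _ in range(rounds):
        predict = fit_predictor(data)            # retrain on all labels so far
        candidates = generate(data, rng, n_candidates)
        ranked = sorted(candidates, key=predict, reverse=True)
        for x in ranked[:k]:                     # simulate only the top-k
            data.append((x, oracle(x)))          # new labels feed the next round
    return data


initial = [0.0, 0.25, 0.5, 0.75, 1.0]
data = closed_loop(initial)
best_initial = max(oracle(x) for x in initial)
best_found = max(y for _, y in data)
```

The point of the sketch is the feedback: each round's simulated labels are added to the training set, so the predictor is corrected in the region the generator is moving into. A static, single-pass approach would correspond to running one round only, with the predictor never updated on out-of-distribution candidates.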

Authors

Evan R. Antoniuk

Peggy Li

Nathan Keilbart

Journals

npj Computational Materials

Institutions

Lawrence Livermore National Laboratory
