What question did this study set out to answer?

The study asks whether Bayesian optimization driven by in-context learning with frozen large language models can accelerate materials discovery in catalysis.

April 17, 2026 · Open Access

Bayesian Optimization of Catalysis with In-Context Learning

Key Points

Used frozen LLMs (e.g., GPT-4o, Gemini) for regression with uncertainty estimation.
Implemented Bayesian optimization without explicit model training or feature engineering.
Represented each material by its synthesis and testing procedure, written into a natural language prompt.
Conducted live experiments on the reverse water-gas shift reaction.
Matched or outperformed Gaussian processes on aqueous solubility and oxidative coupling of methane (OCM) benchmarks.
Identified multimetallic catalysts that approach equilibrium CO yield within 6 and 10 iterations from pools of 3,700 and 360,000 candidates, respectively.
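
To make the idea of representing materials as prompts concrete, here is a minimal Python sketch of how labelled experiments and one unlabelled query might be formatted as a few-shot prompt. The `Experiment` class, the `build_prompt` function, the prompt template, and the procedure texts and values are all hypothetical placeholders for illustration; they are not taken from the study or its code.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    procedure: str  # free-text synthesis and testing description
    value: float    # measured property, e.g. CO yield in percent


def build_prompt(examples, query_procedure, property_name="CO yield (%)"):
    """Format labelled experiments plus one unlabelled query as a few-shot prompt."""
    blocks = [
        f"Procedure: {ex.procedure}\n{property_name}: {ex.value:.2f}"
        for ex in examples
    ]
    # The query follows the same pattern but leaves the value blank
    # so that a language model can complete it.
    blocks.append(f"Procedure: {query_procedure}\n{property_name}:")
    return "\n\n".join(blocks)


# Placeholder procedures and values, invented purely for illustration.
examples = [
    Experiment("Placeholder procedure A (catalyst preparation and test conditions).", 12.5),
    Experiment("Placeholder procedure B (catalyst preparation and test conditions).", 20.0),
]
prompt = build_prompt(examples, "Placeholder procedure C (untested candidate).")
print(prompt)
```

The prompt ends with an empty property field, so whatever the model writes next can be parsed as its prediction for the untested candidate.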

Abstract

Large language models (LLMs) can perform accurate classification with zero or few examples through in-context learning (ICL), allowing the model to observe query-relevant examples at inference time and eliminating the need for additional weight updates to generalize beyond its original training data. We extend this capability to regression with uncertainty estimation using frozen LLMs (e.g., GPT-4o, Gemini), enabling Bayesian optimization (BO) in natural language without explicit model training or feature engineering. We apply this to materials discovery by representing materials as synthesis and testing procedures for use in natural language prompts. This Bayesian, design-first approach prioritizes optimization toward target material properties before detailed characterization, in contrast to conventional experimental workflows that often emphasize characterization of suboptimal materials. On benchmarks like aqueous solubility and oxidative coupling of methane (OCM), BO-ICL matches or outperforms Gaussian processes. In live experiments on the reverse water–gas shift (RWGS) reaction, BO-ICL identifies multimetallic catalysts that approach equilibrium CO yield within 6 and 10 iterations from pools of 3,700 and 360,000 candidates, respectively. Our method redefines materials representation and accelerates discovery, with broad applications across catalysis, materials science, and AI. Code: https://github.com/ur-whitelab/BO-ICL.
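
The loop described in the abstract (predict with uncertainty, pick the most promising candidate, measure it, add the result to the context) can be sketched in a few lines of Python. This is an illustrative sketch only, not the BO-ICL implementation: `toy_llm_samples` is a hypothetical offline stand-in for sampling several completions from a frozen LLM, `measure` is a toy objective replacing a lab experiment, candidates are plain numbers rather than text procedures, and the upper confidence bound is just one common acquisition function chosen here as an assumption.

```python
import random
import statistics


def toy_llm_samples(context, candidate, n_samples, rng):
    """Hypothetical stand-in for sampling n completions from a frozen LLM.

    Predicts the value of the nearest labelled example plus noise that grows
    with distance, so unexplored candidates carry more uncertainty.
    """
    nearest_x, nearest_y = min(context, key=lambda xy: abs(xy[0] - candidate))
    spread = 0.05 + abs(nearest_x - candidate)
    return [rng.gauss(nearest_y, spread) for _ in range(n_samples)]


def upper_confidence_bound(samples, beta=2.0):
    """UCB acquisition: sample mean plus beta times sample standard deviation."""
    return statistics.mean(samples) + beta * statistics.stdev(samples)


def measure(x):
    """Toy objective with its optimum at x = 0.7 (replaces a real experiment)."""
    return 1.0 - (x - 0.7) ** 2


def run_bo(pool, n_init=2, n_iter=8, seed=0):
    rng = random.Random(seed)
    remaining = list(pool)
    rng.shuffle(remaining)
    # Seed the context with a few measured candidates.
    context = [(x, measure(x)) for x in remaining[:n_init]]
    remaining = remaining[n_init:]
    for _ in range(n_iter):
        # Score every untested candidate from its sampled predictions.
        scores = {
            x: upper_confidence_bound(toy_llm_samples(context, x, 5, rng))
            for x in remaining
        }
        best = max(scores, key=scores.get)
        remaining.remove(best)
        # The new measurement joins the context; no model weights change.
        context.append((best, measure(best)))
    return context


pool = [i / 20 for i in range(21)]
history = run_bo(pool)
best_x, best_y = max(history, key=lambda xy: xy[1])
print(f"best candidate so far: x={best_x:.2f}, value={best_y:.3f}")
```

The point of the sketch is the structure: the surrogate is never retrained, and each new measurement simply extends the set of in-context examples used for the next round of predictions.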
