Earth system models, or simulators, are foundational for projecting climate change impacts, but their computational expense limits the number and diversity of simulations available. Machine learning-based emulators, statistical surrogates trained on simulator outputs, can replicate components of climate models at orders-of-magnitude lower cost, enabling larger ensembles and interpolation across scenarios. We argue that the next phase of climate modeling hinges on closer collaboration between the simulator and emulator communities. We outline three priorities: (1) co-design of simulators and emulators so that experimental design, diagnostics, and data products support training, evaluation, and targeted simulation; (2) shared, machine learning-ready benchmarks with data partitions and metrics that emphasize physical fidelity; and (3) treating emulators as reliable software components with interfaces, documentation, and deployment pathways for sensitivity analyses, scenario exploration, and uncertainty decomposition. We envision emulators not as statistical shortcuts but as core tools that accelerate the pace of climate science. This Perspective argues that machine learning emulators could transform climate modeling if they are co-designed with simulators, aligned with them in goals, data, and diagnostics, and supported by shared infrastructure and robust software.
Katwyk et al. (Fri,) studied this question.