• A quantitative analysis of operating conditions and performance in CO₂ capture.
• Large-scale analysis examines 4,123 articles across major CO₂ capture technologies.
• Automated framework extracts operating and energy performance data from full texts.
• Technology-specific trends reveal dominant regimes and underexplored conditions.

The rapid growth of carbon dioxide (CO₂) capture research has generated a large and heterogeneous body of scientific literature, making it increasingly difficult to systematically identify trends and guide the development of new technologies. In this work, we perform a quantitative analysis of operating conditions and performance trends in large-scale CO₂ capture literature, enabling a data-driven comparison across absorption-, adsorption-, and membrane-based capture technologies. To achieve this, we develop a fully automated framework that integrates topic modeling with domain-specific named entity recognition for large-scale extraction of operating conditions and energy-related performance information from scientific texts. A literature corpus published between 2005 and 2025 was compiled and curated, resulting in 4,123 full-text articles related to absorption-, adsorption-, and membrane-based CO₂ capture technologies. A MatBERT-CRF ensemble model trained on an expert-annotated dataset achieved an average entity-level F1 score of 81.9%, enabling reliable extraction of operating conditions and energy-related performance metrics. Analysis of the extracted dataset reveals distinct temporal trends in research activity, technology-specific differences in energy performance and operating conditions, and a strong concentration of studies near ambient operating regimes. Overall, this work demonstrates how large-scale literature can be transformed into structured, quantitative datasets, providing a scalable approach for data-driven evaluation of CO₂ capture technologies.
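For readers unfamiliar with the entity-level F1 metric reported above, the following minimal sketch shows one common way such a score can be computed from token-level tags. It assumes BIO-style tagging and exact-match span scoring, which the text does not confirm. The label names (`TEMP`, `PRES`) and the helper functions `bio_to_spans` and `entity_f1` are hypothetical and introduced here only for illustration.

```python
from typing import List, Set, Tuple

# (entity type, start token index, end token index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((label, start, i))
        start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        # A stray "I-" tag with no open entity of that type is ignored.
    return spans


def entity_f1(gold_tags: List[str], pred_tags: List[str]) -> float:
    """Exact-match entity-level F1: a span counts only if type and bounds agree."""
    gold, pred = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the temperature span matches, the pressure span is misplaced.
gold = ["B-TEMP", "I-TEMP", "O", "B-PRES", "O"]
pred = ["B-TEMP", "I-TEMP", "O", "O", "B-PRES"]
score = entity_f1(gold, pred)  # one of two spans correct on each side
```

Under this scoring, a prediction that finds the right entity type with wrong boundaries earns no credit, so entity-level F1 is stricter than token-level accuracy. An average across entity types, as reported in the text, would apply the same calculation per type before averaging.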
Jeong et al. (Sun,) studied this question.