As atmospheric CO₂ concentrations continue to rise (global average of ~420 ppm in 2023), an operationally feasible capability for monitoring column-averaged CO₂ at high spatiotemporal resolution has become critical for carbon cycle science and emissions assessment. Current CO₂ retrieval faces several challenges: traditional physical inversion methods are computationally expensive and sensitive to clouds, aerosols, and geometric or instrumental errors, which can greatly increase uncertainty under complex conditions; cross-platform (satellite–ground) consistency is insufficient; temporal generalization is limited; and inference costs in practical deployments are high. To address these issues, we propose a data-driven CO₂ retrieval framework with a “satellite–ground strong constraint”: Orbiting Carbon Observatory-2 (OCO-2) spectra and auxiliary information (e.g., geometry and aerosols) serve as inputs, and strictly co-located Total Carbon Column Observing Network (TCCON) observations serve as supervision. The framework combines an enhanced Transformer regression model, a “prior main component – nonlinear residual” decomposition strategy, and a two-stage fine-tuning plus calibration procedure. We train on 2015–2018 data and evaluate on a fully held-out 2019 test set, which mimics operational deployment and avoids data leakage. The results show that our method substantially improves cross-platform consistency (coefficient of determination R² increases from ~0.27 to ~0.96) while providing lightweight, fast inference: 6,728 soundings are processed end-to-end in ~2.44 s on CPU and ~3.46 s on GPU, i.e., roughly 0.36 ms and 0.51 ms per sounding, respectively. This work provides a verifiable pathway and a practical basis for near-real-time, high-reliability CO₂ retrieval and the timely monitoring of abnormal emission events.
Yu et al. studied this question.
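The per-sounding latency quoted in the abstract follows directly from the reported end-to-end times and the sounding count. The short sketch below reproduces that arithmetic; the helper name `per_sample_ms` is hypothetical and introduced here only for illustration, and the inputs are the figures stated above (6,728 soundings, ~2.44 s on CPU, ~3.46 s on GPU).

```python
def per_sample_ms(total_seconds: float, n_samples: int) -> float:
    """Average end-to-end latency per sample, in milliseconds."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return total_seconds / n_samples * 1000.0


# Figures taken from the abstract; the times are approximate (~) values.
N_SOUNDINGS = 6728
CPU_SECONDS = 2.44
GPU_SECONDS = 3.46

cpu_ms = per_sample_ms(CPU_SECONDS, N_SOUNDINGS)
gpu_ms = per_sample_ms(GPU_SECONDS, N_SOUNDINGS)

print(f"CPU: {cpu_ms:.3f} ms per sounding")
print(f"GPU: {gpu_ms:.3f} ms per sounding")
```

Both values are averages over the whole batch, so they say nothing about per-sounding variance or about fixed start-up overhead, which may matter when comparing the CPU and GPU timings.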