What question did this study set out to answer?

May 20, 2026

C105-17 Algorithmic Estimation of Arterial Partial Pressure of Oxygen (PaO2) From Oxygen Saturation (SpO2) Values in Critical Care Patients at the Department of Veterans Affairs

Key Points

To estimate arterial partial pressure of oxygen (PaO2) from oxygen saturation (SpO2) values using machine learning.
Retrospective collection of 7,656 PaO2 measurements from 1,897 veterans in VA ICUs.
Data were divided into training, validation, and test subsets at the patient level, and Extreme Gradient Boosting (XGBoost) regressors were trained to predict PaO2.
XGBoost performance was compared with the previous Non-Linear Imputation (NLI) method using RMSE, Spearman's rho, and paired t-tests.
Across the full range of clinical data, the XGBoost model achieved an RMSE of 75.2403, outperforming NLI with an RMSE of 237.3875 (t-statistic = 19.542, p < 0.001).
In a restricted SpO2 range (80%-96%), XGBoost had RMSE of 77.3092, while NLI had RMSE of 80.7686 (t-statistic = 3.926, p < 0.001).
The model was robust against transient fluctuations in SpO2, improving PaO2 prediction accuracy.
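
The RMSE and paired-test comparison in the points above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function names are hypothetical, and pairing the two methods on per-measurement absolute errors is an assumption, since the abstract does not state which quantity was paired.

```python
import math
from statistics import mean, stdev


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted PaO2 (mmHg)."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def paired_t_statistic(observed, pred_a, pred_b):
    """Paired t-statistic on per-measurement absolute errors (method B minus method A).

    Assumption: errors are paired measurement by measurement. A positive
    value means method B has the larger absolute error on average.
    """
    diffs = [abs(o - b) - abs(o - a) for o, a, b in zip(observed, pred_a, pred_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


if __name__ == "__main__":
    # Toy values for illustration only; these are not study data.
    observed = [80, 100, 120, 60]
    method_a = [82, 97, 121, 64]
    method_b = [90, 85, 135, 75]
    print(rmse(observed, method_a), rmse(observed, method_b))
    print(paired_t_statistic(observed, method_a, method_b))
```

A p-value would additionally require the t distribution with n - 1 degrees of freedom, which a statistics library would normally supply.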

Abstract

Rationale: Acute Respiratory Distress Syndrome (ARDS) affects 218,000 to 268,000 people in the United States annually. ARDS requires critical care, is underrecognized, and is defined by a PaO2/FiO2 ratio ≤ 300 mmHg. The Global ARDS criteria include oxygen saturation measurements (SpO2) and SpO2/FiO2 in addition to the invasive but more precise PaO2 obtained from arterial blood gas (ABG). Previous works attempted to estimate PaO2 using SpO2 data but were limited to saturations ≤ 96% and assumed coincidental timing between PaO2 and SpO2 measurements. We applied a machine learning approach to estimate PaO2 while accounting for temporal variance in SpO2 data in a national dataset.

Methods: We retrospectively collected 7,656 PaO2 measurements from 1,897 veterans admitted across 83 Veterans Affairs (VA) intensive care units affiliated with the VA National TeleCC Program between March 1 and April 15, 2025. Only ABG measurements with 25 minutes of preceding SpO2 data were included. We divided data into training, validation, and test subsamples (80/20/20) at the patient level, stratified based on PaO2 values. We trained Extreme Gradient Boosting (XGBoost) regressors using the training data and tuned hyperparameters by minimizing the root mean squared error (RMSE) on the validation subset. The XGBoost model was used to predict PaO2 on the test set and compared to the previous Non-Linear Imputation (NLI) method using RMSE performance. Finally, we calculated Spearman's rho correlation and paired t-test values to compare results.

Results: Hyperparameter tuning indicated the optimal XGBoost model operates on the preceding 15 minutes of SpO2 data sampled in 5-minute intervals to predict a PaO2 value. For the full range of clinical data, the XGBoost model (RMSE: 75.2403, rho = 0.419, p < 0.001) significantly outperformed the previous NLI method (RMSE: 237.3875, rho = 0.411, p < 0.001) (t-statistic = 19.542, p < 0.001). When restricting SpO2 values to the 80%-96% range that is indicated for the NLI derivation and optimal for pulse oximetry, performance between both methods was more comparable; however, XGBoost (RMSE: 77.3092, rho = 0.196, p < 0.001) still significantly outperformed NLI (RMSE: 80.7686, rho = 0.218, p < 0.001) (t-statistic = 3.926, p < 0.001).

Conclusions: A machine learning model seamlessly predicted PaO2 across a clinically relevant range of SpO2 values, expanding to saturations above 96%. The model is robust to transient fluctuations in SpO2 and more consistently matched observed PaO2 values. Future refinements will include self-reported race or melanin index adjustments for pulse oximetry, or elevation adjustments for arterial gas equilibria.

This abstract is funded by: Department of Veterans Affairs Cooperative Studies Program, CSP #2040
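
The input described in the abstract (25 minutes of preceding SpO2 data required, with the model reading the last 15 minutes at 5-minute intervals) might be assembled roughly as follows. This is a sketch under stated assumptions rather than the authors' pipeline: the function name is hypothetical, times are whole minutes, each 5-minute bin is summarized by its mean, and an ABG is excluded whenever any bin in the 25-minute history is empty.

```python
from statistics import mean


def spo2_window_features(abg_time, spo2, window=15, step=5, history=25):
    """Build SpO2 features for one ABG draw, or return None if it is excluded.

    abg_time: ABG draw time in whole minutes.
    spo2: list of (time_in_minutes, spo2_percent) readings.
    Assumptions (not from the abstract): bins are summarized by their mean,
    and every `step`-minute bin in the `history` period must hold a reading.
    """
    bins = []
    for start in range(abg_time - history, abg_time, step):
        values = [v for t, v in spo2 if start <= t < start + step]
        if not values:
            return None  # gap in the preceding history: exclude this ABG
        bins.append(mean(values))
    # Keep only the bins covering the final `window` minutes, oldest first.
    return bins[-(window // step):]


if __name__ == "__main__":
    # Toy readings, one per minute; these are not study data.
    readings = [(t, 90 + t // 5) for t in range(30)]
    print(spo2_window_features(30, readings))
```

With the defaults, each included ABG yields three features, which could then be passed to a regressor alongside the measured PaO2 as the target.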


Cite This Study

Fricks et al. studied this question.

synapsesocial.com/papers/6a0d4efcf03e14405aa9a3ed https://doi.org/10.1093/ajrccm/aamag162.219
