Abstract

Rationale
Acute Respiratory Distress Syndrome (ARDS) affects 218,000 to 268,000 people in the United States annually. ARDS requires critical care, is underrecognized, and is defined by a PaO2/FiO2 ratio ≤ 300 mmHg. The Global ARDS criteria include oxygen saturation measurements (SpO2) and the SpO2/FiO2 ratio in addition to the invasive but more precise PaO2 obtained from arterial blood gas (ABG). Previous work attempted to estimate PaO2 from SpO2 data but was limited to saturations ≤ 96% and assumed that PaO2 and SpO2 measurements were taken at the same time. We applied a machine learning approach to estimate PaO2 while accounting for temporal variance in SpO2 data in a national dataset.

Methods
We retrospectively collected 7,656 PaO2 measurements from 1,897 veterans admitted across 83 Veterans Affairs (VA) intensive care units affiliated with the VA National TeleCC Program between March 1 and April 15, 2025. Only ABG measurements with 25 minutes of preceding SpO2 data were included. We divided the data into training, validation, and test subsamples (80/20/20) at the patient level, stratified by PaO2 values. We trained Extreme Gradient Boosting (XGBoost) regressors on the training data and tuned hyperparameters by minimizing the root mean squared error (RMSE) on the validation subset. The XGBoost model was then used to predict PaO2 on the test set and compared with the previous Non-Linear Imputation (NLI) method by RMSE. Finally, we calculated Spearman's rho correlations and paired t-test values to compare results.

Results
Hyperparameter tuning indicated that the optimal XGBoost model uses the preceding 15 minutes of SpO2 data, sampled at 5-minute intervals, to predict a PaO2 value. Across the full range of clinical data, the XGBoost model (RMSE: 75.2403, rho = 0.419, p < 0.001) significantly outperformed the previous NLI method (RMSE: 237.3875, rho = 0.411, p < 0.001) (t-statistic = 19.542, p < 0.001).
When SpO2 values were restricted to the 80%-96% range for which NLI was derived and in which pulse oximetry performs best, the two methods performed more comparably; however, XGBoost (RMSE: 77.3092, rho = 0.196, p < 0.001) still significantly outperformed NLI (RMSE: 80.7686, rho = 0.218, p < 0.001) (t-statistic = 3.926, p < 0.001).

Conclusions
A machine learning model seamlessly predicted PaO2 across a clinically relevant range of SpO2 values, extending to saturations above 96%. The model is robust to transient fluctuations in SpO2 and more consistently matched observed PaO2 values. Future refinements will include adjustments for self-reported race or melanin index in pulse oximetry, or for elevation in arterial gas equilibria.

This abstract is funded by: Department of Veterans Affairs Cooperative Studies Program, CSP #2040
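
The SpO2 windowing described in the Methods and Results can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `spo2_window_features`, the sample offsets (15, 10, 5, and 0 minutes before the ABG draw), and the use of the most recent reading at or before each sample time are all assumptions made for illustration. Only the 25-minute eligibility rule, the 15-minute window, and the 5-minute spacing come from the abstract.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


def spo2_window_features(
    times_min: Sequence[float],
    spo2: Sequence[float],
    abg_time_min: float,
    offsets_min: Sequence[float] = (15, 10, 5, 0),  # assumed sampling grid
    required_history_min: float = 25,               # eligibility rule from the abstract
) -> Optional[List[float]]:
    """Sample SpO2 at fixed offsets before an ABG draw.

    times_min must be sorted ascending and aligned with spo2.
    Returns None when the SpO2 record does not reach back far enough.
    Each sample is the most recent reading at or before the sample time
    (last observation carried forward, an assumption of this sketch).
    """
    if len(times_min) != len(spo2):
        raise ValueError("times_min and spo2 must have the same length")
    if max(offsets_min) > required_history_min:
        raise ValueError("offsets must not exceed the required history")

    # Exclude ABGs without enough preceding SpO2 data.
    if not times_min or times_min[0] > abg_time_min - required_history_min:
        return None

    features = []
    for offset in offsets_min:
        # Index of the last reading taken at or before the sample time.
        idx = bisect_right(times_min, abg_time_min - offset) - 1
        features.append(spo2[idx])
    return features
```

Each returned list would form one feature row for a regressor, with the paired PaO2 value as the target. Carrying the last observation forward keeps the sketch simple; interpolation or window averages would be equally plausible choices.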
Fricks et al. (Fri,) studied this question.
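
The comparison reported in the Results relies on RMSE and a paired t-test. The sketch below shows one way these quantities could be computed using only the Python standard library. The abstract does not say which quantities were paired, so pairing the per-measurement absolute errors of the two methods is an assumption, and the function names are hypothetical.

```python
import math
import statistics
from typing import List, Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def absolute_errors(observed: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Per-measurement absolute error, used here as the paired quantity."""
    if len(observed) != len(predicted):
        raise ValueError("inputs must be of equal length")
    return [abs(o - p) for o, p in zip(observed, predicted)]


def paired_t_statistic(errors_a: Sequence[float], errors_b: Sequence[float]) -> float:
    """Paired t-statistic for the mean difference errors_a - errors_b.

    A positive value means method A has the larger errors on average.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("inputs must be of equal length")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("differences have zero variance")
    return statistics.mean(diffs) / (sd / math.sqrt(n))
```

Under this pairing, `paired_t_statistic(absolute_errors(y, nli_pred), absolute_errors(y, xgb_pred))` would be positive when the NLI errors are larger, where `y`, `nli_pred`, and `xgb_pred` are placeholder names for observed and predicted PaO2 values. A p-value and Spearman's rho would normally come from a statistics library rather than hand-written code.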