September 5, 2025 · Open Access

Benchmarking imputation methods on real-world clinical time series with simulated spatio-temporal missingness

Key Points

Linear interpolation outperformed the deep spatio-temporal autoencoder (STAE) on data with spatio-temporal missingness.
The simulation model used in this study samples from Markov chains to create realistic missingness patterns.
Simple linear imputation attained robust performance on real-world clinical datasets with spatio-temporal missingness.
An open benchmark is provided for evaluating the effect of missingness in clinical studies and prediction tasks.
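
To make the Markov-chain idea in the key points concrete, the following is a minimal sketch and not the study's actual simulation model. It assumes a two-state chain (observed, missing) per feature for temporal structure, plus one shared chain that blanks all features at once as a stand-in for spatial structure. The function names and the parameters `p_enter`, `p_stay`, `p_enter_shared`, and `p_stay_shared` are hypothetical.

```python
import numpy as np


def simulate_markov_mask(n_steps, n_features, p_enter=0.05, p_stay=0.8, rng=None):
    """Return a boolean mask (True = missing) of shape (n_steps, n_features).

    Each column follows an independent two-state Markov chain (an assumption):
      observed -> missing with probability p_enter
      missing  -> missing with probability p_stay
    The chain is taken to be in the observed state before the first step.
    """
    rng = np.random.default_rng(rng)
    mask = np.zeros((n_steps, n_features), dtype=bool)
    state = np.zeros(n_features, dtype=bool)
    for t in range(n_steps):
        # The probability of being missing now depends only on the previous state.
        p_missing = np.where(state, p_stay, p_enter)
        state = rng.random(n_features) < p_missing
        mask[t] = state
    return mask


def simulate_spatiotemporal_mask(n_steps, n_features, p_enter=0.05, p_stay=0.8,
                                 p_enter_shared=0.02, p_stay_shared=0.9, rng=None):
    """Combine per-feature chains with one shared chain (hypothetical coupling).

    A value is missing if its own chain or the shared chain is in the
    missing state, so gaps persist over time and co-occur across features.
    """
    rng = np.random.default_rng(rng)
    per_feature = simulate_markov_mask(n_steps, n_features, p_enter, p_stay, rng)
    shared = simulate_markov_mask(n_steps, 1, p_enter_shared, p_stay_shared, rng)
    return per_feature | shared  # the shared column broadcasts across features


mask = simulate_spatiotemporal_mask(n_steps=200, n_features=4, rng=0)
missing_rate = mask.mean()
```

A high `p_stay` yields long runs of consecutive missing values, while the shared chain makes several features drop out at the same time steps. Setting both entry probabilities to zero returns a fully observed mask.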

Abstract

Existing studies evaluating imputation methods on clinical time series do not jointly account for spatial (across features) and temporal (across time) structure in missingness. We present a novel simulation model that induces realistic spatio-temporal missingness in three real-world clinical datasets by sampling from Markov chains. On the resulting data, we apply a variety of imputation methods, including Last Observation Carried Forward (LOCF), linear interpolation, and a spatio-temporal autoencoder (STAE). Finally, we evaluate the influence of time series imputation on a downstream prediction task. The deep STAE outperformed simpler baseline methods only when the missingness patterns lacked spatio-temporal structure. In contrast, linear interpolation performed best when the missingness was spatio-temporally structured, and it supported sufficient downstream performance. This study is the first to demonstrate that simple linear imputation attains robust performance on real-world clinical data with spatio-temporal missingness. We provide an open benchmark to evaluate the effect of missingness in clinical studies and prediction tasks.
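
The two simple baselines named in the abstract can be sketched as follows. This is an illustrative example on a toy series, not the benchmark's implementation: the helper names `locf`, `linear_interpolate`, and `masked_rmse` are hypothetical, the series and gap are made up, and the code assumes a 1-D series with at least one observed value.

```python
import numpy as np


def locf(values, mask):
    """Last Observation Carried Forward; mask is True where a value is missing.

    Positions before the first observed value stay NaN, since there is
    nothing to carry forward.
    """
    values = np.asarray(values, dtype=float)
    # Index of the most recent observed position at each step (-1 if none yet).
    idx = np.where(~mask, np.arange(len(values)), -1)
    idx = np.maximum.accumulate(idx)
    return np.where(idx >= 0, values[np.maximum(idx, 0)], np.nan)


def linear_interpolate(values, mask):
    """Fill missing values by linear interpolation between observed neighbours.

    np.interp holds the nearest observed value constant outside the
    observed range (leading or trailing gaps).
    """
    values = np.asarray(values, dtype=float)
    t = np.arange(len(values))
    out = values.copy()
    out[mask] = np.interp(t[mask], t[~mask], values[~mask])
    return out


def masked_rmse(truth, imputed, mask):
    """Root mean squared error computed only on the artificially masked entries."""
    err = np.asarray(truth, dtype=float)[mask] - imputed[mask]
    return float(np.sqrt(np.mean(err ** 2)))


# Toy series with a linear trend and one contiguous gap (illustrative data only).
truth = np.linspace(0.0, 9.0, 10)
mask = np.zeros(10, dtype=bool)
mask[3:6] = True

rmse_locf = masked_rmse(truth, locf(truth, mask), mask)
rmse_linear = masked_rmse(truth, linear_interpolate(truth, mask), mask)
```

On this toy trend, linear interpolation recovers the gap exactly while LOCF lags behind it. That only illustrates how the error is scored on masked entries and says nothing about the study's results on clinical data.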
