Abstract: Existing studies evaluating imputation methods on clinical time series do not jointly account for spatial (across features) and temporal (across time) structure in missingness. We present a novel simulation model that induces realistic spatio-temporal missingness in three real-world clinical datasets by sampling from Markov chains. On the resulting data, we apply a variety of imputation methods, including Last Observation Carried Forward (LOCF), linear interpolation, and a spatio-temporal autoencoder (STAE). Finally, we evaluate the influence of time series imputation on a downstream prediction task. The deep STAE outperformed the simpler baseline methods only when the missingness patterns lacked spatio-temporal structure. In contrast, linear interpolation performed best when missingness was spatio-temporally structured, and it yielded sufficient downstream performance. This study is the first to demonstrate that simple linear imputation attains robust performance on real-world clinical data with spatio-temporal missingness. We provide an open benchmark to evaluate the effect of missingness in clinical studies and prediction tasks.
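To make the setup in the abstract concrete, the following is a minimal sketch, not the authors' simulation model or benchmark. It assumes an independent two-state (observed/missing) Markov chain per feature, which produces temporally clustered gaps but no coupling across features; the paper's model is described as jointly spatio-temporal. The function names, transition probabilities, and column names (`hr`, `sbp`, `spo2`) are hypothetical and chosen only for illustration. The two simple baselines named in the abstract, LOCF and linear interpolation, are then applied with pandas and scored on the masked entries.

```python
import numpy as np
import pandas as pd


def simulate_missing_mask(n_steps, n_features, p_stay_observed=0.9,
                          p_stay_missing=0.7, seed=0):
    """Boolean mask (True = missing) from one two-state Markov chain per feature.

    Assumption: features are independent here, so only temporal structure
    is simulated. The first time step is always observed.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((n_steps, n_features), dtype=bool)
    for j in range(n_features):
        missing = False  # every chain starts in the observed state
        for t in range(1, n_steps):
            stay = p_stay_missing if missing else p_stay_observed
            if rng.random() >= stay:
                missing = not missing  # switch state
            mask[t, j] = missing
    return mask


def rmse_on_missing(imputed, truth, mask):
    """Root mean squared error restricted to the entries that were masked."""
    diff = (imputed.to_numpy() - truth.to_numpy())[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic stand-in for a clinical time series (random walks, not real data).
rng = np.random.default_rng(1)
data = pd.DataFrame(rng.normal(size=(50, 3)).cumsum(axis=0),
                    columns=["hr", "sbp", "spo2"])

mask = simulate_missing_mask(n_steps=50, n_features=3)
observed = data.mask(mask)  # masked entries become NaN

# Baseline 1: Last Observation Carried Forward.
locf = observed.ffill()

# Baseline 2: linear interpolation between neighbouring observed values;
# trailing gaps are filled with the last observed value.
linear = observed.interpolate(method="linear", limit_direction="both")

print("LOCF RMSE:  ", rmse_on_missing(locf, data, mask))
print("Linear RMSE:", rmse_on_missing(linear, data, mask))
```

Raising `p_stay_missing` lengthens the simulated gaps, which is one simple way to probe how each baseline degrades as missingness becomes more temporally clustered. Coupling the chains across features would be needed to approximate spatial structure.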
Giesa et al. studied this question.