The completeness and integrity of multimodal medical data are critical determinants of surgical success and postoperative recovery. However, poor sensor contact, small vibrations, and device discrepancies during signal acquisition frequently leave missing values in patients’ medical data. The problem is especially pronounced in rare or complex cases, where the inherent complexity and sparsity of multimodal data limit dataset diversity and degrade predictive model performance. As a result, clinicians’ understanding of patient conditions is restricted, and the development of robust algorithms for predicting preoperative, intraoperative, and postoperative disease progression is hindered. To address these challenges, we propose Med-Diffusion, a diffusion-based generative framework that enhances sensor data by imputing missing multimodal clinical data, including both categorical and numerical variables. The framework combines one-hot encoding, simulated bit encoding, and feature tokenization to handle heterogeneous data types, and it uses conditional diffusion modeling for accurate data completion. By learning the underlying distributions of multimodal datasets, Med-Diffusion synthesizes plausible values for incomplete records and thereby mitigates the data sparsity introduced during acquisition. Extensive experiments demonstrate that Med-Diffusion accurately reconstructs missing multimodal clinical information and significantly enhances the performance of downstream predictive models.
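To make the encoding step concrete, the following is a minimal sketch of how a mixed categorical and numerical record might be turned into a feature vector plus an observation mask that a conditional diffusion model could condition on. It is not the Med-Diffusion implementation. The function names, the schema layout, the example variables (`heart_rate`, `asa_class`, `sex`), and the choice to map bits to -1/+1 and missing entries to zeros are all assumptions made for illustration.

```python
def one_hot(value, categories):
    """One-hot vector over `categories`; a missing value (None) gives all zeros."""
    return [1.0 if value == c else 0.0 for c in categories]


def bit_encode(value, categories):
    """Binary code of the category index, mapped to -1.0/+1.0 (assumed convention).

    A missing value (None) gives all zeros.
    """
    n_bits = max(1, (len(categories) - 1).bit_length())
    if value is None:
        return [0.0] * n_bits
    idx = categories.index(value)
    # Most significant bit first.
    return [1.0 if (idx >> b) & 1 else -1.0 for b in reversed(range(n_bits))]


def encode_record(record, schema):
    """Encode one record; returns (features, mask), where mask is 1 for observed entries."""
    features, mask = [], []
    for name, spec in schema.items():
        value = record.get(name)
        if spec["kind"] == "numeric":
            # Standardize observed values; use 0.0 as a placeholder when missing.
            vec = [0.0 if value is None else (value - spec["mean"]) / spec["std"]]
        elif spec["kind"] == "onehot":
            vec = one_hot(value, spec["categories"])
        else:  # "bits"
            vec = bit_encode(value, spec["categories"])
        features.extend(vec)
        mask.extend([0 if value is None else 1] * len(vec))
    return features, mask


# Hypothetical schema and record, for illustration only.
schema = {
    "heart_rate": {"kind": "numeric", "mean": 70.0, "std": 10.0},
    "asa_class": {"kind": "bits", "categories": ["I", "II", "III", "IV"]},
    "sex": {"kind": "onehot", "categories": ["F", "M"]},
}
record = {"heart_rate": 80.0, "asa_class": "III", "sex": None}
features, mask = encode_record(record, schema)
```

In this sketch the mask separates observed entries, which would serve as the conditioning signal, from missing entries, which the generative model would be asked to fill in. Bit encoding needs only about log2(K) dimensions for K categories, whereas one-hot encoding needs K, which is one plausible reason to offer both for heterogeneous clinical variables.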
Cheng et al. (Sun,) studied this question.