Digital phantoms are virtual representations of the human body used in medical research to test equipment, train medical professionals and develop or validate algorithms. These models can be created from ‘real-world’ clinical data or from ‘synthetic data’. Phantoms derived from clinical data often serve as ‘ground truth’ reference values anchored in empirical observations. However, there is growing demand for synthetic digital phantoms and datasets that do not originate from real patients, raising critical questions about how reliable knowledge is produced from data detached from reality. This article investigates these issues through a document analysis of peer-reviewed publications on the development and use of digital phantoms in medical physics. We examine how researchers construct ‘ground truth’ and the challenges they encounter when advancing truth claims through technical work. By attending to the bodies fabricated in phantom creation and to the data made to represent human form, we show how synthetic data – detached from real human subjects – are valued for enabling researchers to sidestep the complexities or ‘messiness’ of real-world patients and clinical data. Moreover, we show how synthetic phantoms and data are framed as tools that enhance control and flexibility, functioning as ‘known truths’: workable approximations that enable the construction of what are claimed to be more representative datasets and models. This article contributes to Science and Technology Studies and critical data studies by examining the nature and implications of digital representations and synthetic data in the development of machine-learning models in medicine, and the truth claims they support.
Högberg et al. (Mon,) studied this question.