Objectives/Goals: Synthetic data holds potential for inclusion in medical product development pipelines. We therefore explored the current regulatory and practice landscape to identify best practices for ensuring synthetic data quality, relevance, and reliability in regulatory settings, and for communicating these properties to key stakeholders.
Methods/Study Population: We identified areas in which synthetic data created using generative AI holds the most value for health researchers. We reviewed regulatory documents, published literature, and expert insights to examine how regulators currently use, define, apply, and govern synthetic data created using generative AI. Next, we identified the data management tools, best practices, ethical considerations, and regulatory developments necessary for generating fit-for-use synthetic datasets. Using this information, we developed a risk-based credibility assessment framework that aligns with current governmental standards and is intended for users of synthetic data derived from generative AI applications in regulatory settings.
Results/Anticipated Results: Synthetic data created using generative AI holds value for health researchers in four key areas, acting as 1) a privacy-enhancing technology, 2) a data science ‘sandbox’ for training and exploration, 3) a mechanism to navigate legalities around data sharing and/or use, and 4) a method to augment underrepresented subgroups in datasets. However, synthetic data raises ethical and legal concerns, particularly regarding privacy, consent, stakeholder engagement, and ownership. Regulators and health technology assessment bodies, including FDA, EMA, MHRA, and Canada’s Drug Agency, are exploring synthetic data to supplement medical datasets, validate external control arms, enhance model performance, and inform regulatory decision-making.
Discussion/Significance of Impact: Our work underscores the need to continue cultivating transparent, ethical, and fit-for-use approaches to synthetic data generation using generative AI. Moving forward, effective use and development of synthetic data require a culture of learning and transparency among regulators, end users, and those involved in data generation and exchange.
Hendricks-Sturrup et al. (Wed,) studied this question.