Biomedical data is scattered across diverse databases, each with a different scope and design, making integration and analysis challenging. This project creates an open, interoperable framework to connect these databases, enabling efficient data access and use. By applying schema harmonization, aligning database designs, and using open science approaches, it promotes data transparency, reusability, and interoperability. Our approach implements the FAIR (Findable, Accessible, Interoperable, and Reusable) principles so that analyses across databases are accurate and transparent, for example when combining gene expression data with gene variant knowledge. The resulting access to the integrated databases simplifies data exploration and maximizes scientific insight.
Willighagen et al. studied this question.