The CoreTrustSeal-certified institutional data repository RDR, based on Dataverse, is essential to KU Leuven’s efforts to guarantee the quality of published datasets and to support researchers in making their data FAIR. To ensure that datasets are well-documented, properly licensed, and accompanied by accurate metadata, KU Leuven places a strong emphasis on curation before publication. Since the repository's launch in 2022, the number of datasets submitted for publication has increased consistently. This growth brought new challenges, such as tracking review assignments, avoiding duplicated effort, and managing the administrative workload. To address these issues, the RDR team developed an open-source review dashboard that integrates with Dataverse and automates several aspects of the curation process. The initial version of the dashboard was designed to make the workflow more consistent and transparent. Reviewers can see at a glance which datasets are currently under review, leave notes, and access review histories. A built-in checklist helps generate standardized feedback while still allowing for personalized comments. The tool has significantly streamlined the process and freed up time for more personalized support, training, and consultation. Building on this foundation, a second version of the dashboard introduced automated checks that flag issues such as incomplete metadata, missing or empty README files, and other common errors. Many of these checks were straightforward to implement and proved effective at catching errors at an early stage. Crucially, the dashboard was built to complement human judgment rather than to replace it. The results of the automated checks are clearly visualized, and reviewers can override them when needed. This ensures that automation enhances the review process without sacrificing the nuanced interpretation provided by human curators.
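To make the idea of overridable automated checks concrete, here is a minimal sketch in Python. It is not the dashboard's actual code: the simplified dataset dictionary, the list of required fields, and all function and class names are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical list of fields a curator expects to be filled in.
REQUIRED_FIELDS = ("title", "description", "license")


@dataclass
class CheckResult:
    name: str
    passed: bool
    message: str
    overridden: bool = False  # set when a reviewer overrules the check

    @property
    def ok(self) -> bool:
        # A check counts as resolved if it passed or a reviewer overrode it.
        return self.passed or self.overridden


def check_metadata(dataset: dict) -> CheckResult:
    """Flag required metadata fields that are absent or blank."""
    metadata = dataset.get("metadata", {})
    missing = [
        field for field in REQUIRED_FIELDS
        if not (metadata.get(field) and str(metadata[field]).strip())
    ]
    if missing:
        return CheckResult("metadata", False, "Missing: " + ", ".join(missing))
    return CheckResult("metadata", True, "All required fields present")


def check_readme(dataset: dict) -> CheckResult:
    """Flag a missing README, or one that has zero bytes."""
    readmes = [
        f for f in dataset.get("files", [])
        if f["name"].lower().startswith("readme")
    ]
    if not readmes:
        return CheckResult("readme", False, "No README file found")
    if all(f["size"] == 0 for f in readmes):
        return CheckResult("readme", False, "README file is empty")
    return CheckResult("readme", True, "README present")


def run_checks(dataset: dict, overrides: frozenset = frozenset()) -> list:
    """Run every check, then apply the reviewer's overrides by check name."""
    results = [check_metadata(dataset), check_readme(dataset)]
    for result in results:
        if result.name in overrides:
            result.overridden = True
    return results


# Example draft in the simplified, assumed structure used by this sketch.
draft = {
    "metadata": {"title": "Soil samples", "description": "pH readings", "license": ""},
    "files": [{"name": "README.txt", "size": 0}, {"name": "data.csv", "size": 2048}],
}
```

Calling `run_checks(draft)` reports both checks as failed, while `run_checks(draft, overrides=frozenset({"readme"}))` keeps the README finding visible but marks it as resolved by the reviewer, which mirrors the principle that the automated result informs the curator without binding them.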
The RDR team is currently working on extending the benefits of this tool beyond the curation team. The team is investigating how to integrate the automated checks as an external tool, potentially within Dataverse, so that researchers will soon be able to run a pre-submission validation of their draft datasets. This will help them identify and resolve common issues early on, thereby making curation more transparent, reducing the time required for review, and improving the quality of submissions.
In this presentation, we'll share the development journey of the review dashboard and explore the logic behind the automated checks. Our objective is to provide valuable insights for other institutions that are interested in optimizing their FAIR-aligned data infrastructures and to stimulate discussion on how automation can empower both curators and researchers.
Marynissen et al. (Wed,) studied this question.