What question did this study set out to answer?

The aim is to improve the population of open repositories with high-quality metadata and corresponding full texts using automation.

June 22, 2026

Open Access

Developer Track: DSpace 2 (Automated Metadata and Full Text Population)

Key Points

Developed a DOI-driven workflow to enrich metadata records with full text links.
Implemented API-based real-time metadata extraction from uploaded documents using AI.
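Addressed interoperability through a flexible API framework that can request services from various external providers, and privacy through a privacy-by-design approach to handling file data during AI analysis.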
Automatically identified and attached thousands of full text PDFs with minimal manual intervention.
Reduced manual metadata entry burden, improving submission processes and efficiency.
Kept the integration adaptable to future technological shifts by not tying the repository to a single external provider.
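
The DOI-driven workflow in the key points above (gather full text candidates from several open services, pick the most credible one, record provenance and outcome) can be illustrated with a minimal sketch. This is not the presenters' implementation, which the abstract describes as a Google Apps Script and Google Sheets tool. The `Candidate` fields, the `SOURCE_PRIORITY` ranking, and the review rule are all assumptions made for illustration, and no network calls are made: candidates are passed in as if already fetched.

```python
from dataclasses import dataclass
from typing import List, Optional

# ASSUMPTION: an illustrative trust ranking of the four sources named in the
# abstract. The real tool's ranking rules are not described there.
SOURCE_PRIORITY = {"Unpaywall": 0, "OpenAIRE": 1, "CORE": 2, "OpenAlex": 3}


@dataclass
class Candidate:
    source: str                    # open service that returned the link
    url: str                       # candidate full text location
    license: Optional[str] = None  # licence signal, if the source gave one
    is_pdf: bool = False           # True if the link points directly at a PDF


def enrich_record(doi: str, candidates: List[Candidate]) -> dict:
    """Pick one candidate for a DOI and return a provenance record."""
    record = {
        "doi": doi,
        "sources_checked": sorted({c.source for c in candidates}),
    }
    if not candidates:
        record.update(outcome="not_found", chosen_url=None,
                      chosen_source=None, needs_review=False)
        return record

    # Lower tuples sort first: direct PDFs, then links with a licence signal,
    # then the (hypothetical) source ranking as a tie-breaker.
    def score(c: Candidate):
        return (not c.is_pdf,
                c.license is None,
                SOURCE_PRIORITY.get(c.source, len(SOURCE_PRIORITY)))

    best = min(candidates, key=score)
    record.update(
        outcome="attached",
        chosen_url=best.url,
        chosen_source=best.source,
        # ASSUMPTION: flag landing pages and unclear licensing for follow-up.
        needs_review=(best.license is None or not best.is_pdf),
    )
    return record
```

Keeping the list of sources checked alongside the chosen link is one way to support the review and reporting step the abstract mentions; a real deployment would also need validation of the link itself and rate limiting per source.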

Abstract

2 presentations:

Fill The Gap – Automated retrieval of full text from emerging open APIs

Cillian Joy (1), Bram Luyten (2)
(1) University of Galway, Ireland; (2) Atmire

Populating an open repository with high-quality, consistent metadata is a substantial task. The challenge becomes even harder when records also need to be enriched with the corresponding full text at scale, particularly when authors are not involved in deposit workflows. In 2025, the University of Galway and Atmire developed and deployed a DOI-driven workflow to enrich metadata-only repository records with open full text links. The tool queries multiple open services using the DOI, selects the most credible full text candidate, and records both provenance and outcomes to support review and reporting. In production, this approach identified and attached thousands of full text PDFs with minimal manual intervention, while surfacing cases that require follow-up due to redirects, inconsistent landing pages, or unclear licensing signals. The implementation is designed to be extensible, with additional sources and local policy rules added as needed. The session will demonstrate the Google Apps Script and Google Sheets version, describe key design trade-offs (accuracy, coverage, validation, and rate limiting), and share an approach that other repository teams can adapt to their own infrastructure. Currently supported sources include OpenAIRE, Unpaywall, CORE, and OpenAlex.

Reducing Barriers: Automating Metadata Extraction in Submission Forms for DSpace Repositories

José Carvalho, Carlos Silva, Paulo Lima, Henrique Malheiro, Pedro Pinto, Tiago Oliveira
KEEP Solutions, Portugal

As digital repositories evolve at the intersection of people, practice, and emerging technologies, the burden of manual metadata entry remains a significant barrier to the timely dissemination of open research. This paper presents a novel integration for the DSpace platform designed to streamline the submission process through automated metadata extraction. The proposed functionality leverages an external API powered by Artificial Intelligence (AI) to analyze uploaded documents in real-time. By identifying and mapping key bibliographic data directly from the file content, the system automatically populates submission forms, reducing human error and cognitive load for depositors.

Central to this development are two critical considerations: interoperability and privacy. The architecture utilizes a flexible API framework that allows the repository to request services from various external providers, ensuring the system remains adaptable to future technological shifts. Furthermore, the integration is built with a "privacy-by-design" approach, ensuring that sensitive file data is handled securely during the AI analysis phase. By automating the "practice" of data entry, this feature moves us closer to an "Open to All" ecosystem where researchers can focus on dissemination rather than administration, ultimately fostering a more efficient and inclusive repository environment.
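
The mapping step in the second abstract (bibliographic data identified in the uploaded file is used to populate the submission form) might look roughly like the sketch below. The shape of the extraction response (`title`, `authors`, `issued`, `abstract`) is hypothetical, since the abstract does not describe the external API, and the rule that values already typed by the depositor are never overwritten is an assumption of this sketch, not a documented behaviour of the integration. The target names are standard Dublin Core fields as commonly used in DSpace.

```python
# ASSUMPTION: keys on the left are a made-up extraction response shape;
# values on the right are common DSpace Dublin Core field names.
FIELD_MAP = {
    "title": "dc.title",
    "authors": "dc.contributor.author",
    "issued": "dc.date.issued",
    "abstract": "dc.description.abstract",
}


def prefill_form(extracted: dict, form: dict) -> dict:
    """Return a copy of the form with empty fields filled from extracted data."""
    filled = dict(form)  # leave the caller's form untouched
    for key, field in FIELD_MAP.items():
        value = extracted.get(key)
        # Fill only when the service returned something and the depositor
        # has not already entered a value for that field.
        if value and not filled.get(field):
            filled[field] = value
    return filled
```

Because the mapping is a plain table, pointing the repository at a different external provider would mostly mean adjusting the left-hand keys, which is in the spirit of the provider-independent framework the abstract describes.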
