What does this research mean for the field?

Large-scale AI data harvesting places significant strain on the technical infrastructure of digital libraries, forcing institutions to restrict access and challenging long-standing commitments to open public service.

What question did this study set out to answer?

This panel examines the challenges posed by AI-driven data harvesting on openness and repository stewardship.

June 23, 2026 | Open Access

When Openness Meets a Breaking Point: Perspectives on capacity, responsibility and stewardship under the threat of AI-driven harvesting.

Key Points

Cross-institutional panel discussion
Engagement with diverse perspectives from repository stakeholders
Focus on ethical and operational implications of AI
AI-driven harvesting strains technical infrastructure and human resources
Some institutions limit access, impacting commitments to openness
Panel promotes dialogue on responsibility and public service in the AI context

Abstract

The growing prevalence of artificial intelligence has renewed attention on the role of data in training large language models (LLMs). For decades, digital libraries and repositories have focused on providing well-structured, searchable, and openly accessible information to the public. As a result, these systems have become major targets for large-scale AI data harvesting. The volume and intensity of automated access now place significant strain on technical infrastructure and on the people who maintain it, often exceeding the capacity intended to serve human users. In response, some institutions have limited access or taken systems offline, raising challenges to long-standing commitments to openness and public service. This panel addresses the operational, ethical, and strategic questions emerging from this reality. Drawing on the work of a cross-institutional working group, the session brings together diverse perspectives from roles involved in repository stewardship. Panelists will discuss how AI-driven harvesting affects daily operations, planning, and decision-making, and how responsibilities and constraints vary across roles, institutions, and legal contexts. By creating space for cross-role dialogue, the panel aims to advance discussion around mitigation, responsibility, and sustaining public mandates in an evolving, AI-driven internet.


Cite This Study

Griffith et al. studied this question.

synapsesocial.com/papers/6a3a2217111626ef22ab6b66 https://doi.org/10.5281/zenodo.20789001
