Because true anonymization is now nearly impossible, researchers who need to share de-identified, person-level data that could be re-identified through quasi-identifiers need tangible recommendations. Data stewards must therefore support researchers in depositing sensitive data in public repositories while still following institutional, ethical, and legal requirements. Repository aggregators such as re3data and DataCite Repository Finder list data repositories, but navigating these lists to locate options for depositing restricted data can be cumbersome. The listings rarely include certain necessary details, which makes recommending third-party repositories to researchers time-consuming and can limit the options we offer, so we often end up relying on a short list of well-known repositories. An additional challenge is identifying repositories that mediate access via data usage agreements, where the repository handles access requests to ensure that potential users meet established security and privacy requirements, have taken the necessary steps to protect confidentiality, and commit to appropriate data use. The need to help researchers deposit data in public repositories while still protecting individual privacy motivated this project: identifying restricted data repositories with mediated access processes and compiling them in a spreadsheet for researchers. This practical solution supports data sharing while upholding essential ethical and institutional privacy requirements. The resource is currently limited to US-based social sciences repositories; in sharing it, we hope others will contribute to and expand this work.
Oberlies et al. studied this question.