What question did this study set out to answer?

The study investigates how sensitive information can leak when databases are linked across organisations, and recommends ways to identify and prevent these vulnerabilities.

March 28, 2026 · Open Access

Information leakage in the practical linking of sensitive data: Parties, protocols, and adversaries

Key points

The paper has four objectives:
Describe the end-to-end data linkage process and formalise two fundamental types of linkage protocols.
Identify the types of parties involved in data linkage.
Analyse the sensitive information each party can learn from the data it legitimately obtains.
Examine the motivations and objectives of adversaries who aim to learn sensitive information from the databases being linked.
Current privacy-preserving record linkage protocols still lead to unintentional information leakage.
Recommendations are provided to enhance security in data linkage projects.
Organizational and human factors impacting data linkage techniques have been insufficiently addressed.

Abstract

The process of linking databases that contain sensitive information about individuals across organisations is an increasingly common requirement in the health and social science research domains, as well as with governments and businesses. The lack of unique entity identifiers means that linking often has to rely on personal details such as names and addresses. Data linkage protocols have been proposed to limit the leakage of sensitive personal information, while privacy-preserving record linkage (PPRL) techniques have been developed to conduct linkage on encoded data. While PPRL techniques are now being employed in real-world applications, the focus of PPRL research has been on the technical aspects of linking sensitive data, such as encoding methods and cryptanalysis attacks. Organisational and human challenges when employing such techniques in practice, however, have not been studied adequately. In this paper, we describe the end-to-end data linkage process and formalise two fundamental types of linkage protocols. We describe the types of parties that participate in such a protocol, and analyse what sensitive information each party can learn from the data it obtains legitimately within the protocol. We also discuss the possible motivations and objectives of an adversary who aims to learn sensitive information from the databases being linked, and show that current PPRL protocols still result in the unintentional leakage of sensitive information. We provide recommendations to help data custodians and other parties involved in data linkage projects to identify and prevent vulnerabilities and make their projects more secure.
