Sharing sequencing data sets without identifiers has become a common practice in genomics. Here, we report that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases. We show that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target. A key feature of this technique is that it entirely relies on free, publicly accessible Internet resources. We quantitatively analyze the probability of identification for U.S. males. We further demonstrate the feasibility of this technique by tracing back with high probability the identities of multiple participants in public sequencing projects.
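The two-step logic described above (match a Y-STR profile to candidate surnames, then narrow by age and state) can be illustrated with a minimal toy sketch. This is not the authors' pipeline. Every surname, repeat count, and record below is synthetic, the in-memory lists stand in for the genealogy databases and public records, and the function names, the exact-match rule, and the `min_shared` threshold are assumptions made only for illustration.

```python
from typing import Dict, List, NamedTuple

Haplotype = Dict[str, int]  # Y-STR marker name -> repeat count


class GenealogyEntry(NamedTuple):
    surname: str
    haplotype: Haplotype


class PersonRecord(NamedTuple):
    surname: str
    birth_year: int
    state: str


def candidate_surnames(query: Haplotype,
                       database: List[GenealogyEntry],
                       max_mismatches: int = 0,
                       min_shared: int = 3) -> List[str]:
    """Return surnames whose haplotype differs from the query at no more
    than max_mismatches of the markers both profiles report."""
    hits = set()
    for entry in database:
        shared = query.keys() & entry.haplotype.keys()
        if len(shared) < min_shared:
            continue  # too few markers in common to compare meaningfully
        mismatches = sum(1 for m in shared if query[m] != entry.haplotype[m])
        if mismatches <= max_mismatches:
            hits.add(entry.surname)
    return sorted(hits)


def triangulate(surnames: List[str],
                records: List[PersonRecord],
                birth_year: int,
                state: str) -> List[PersonRecord]:
    """Keep only records that agree on surname, birth year, and state."""
    wanted = set(surnames)
    return [r for r in records
            if r.surname in wanted
            and r.birth_year == birth_year
            and r.state == state]


# Synthetic stand-ins; none of these values come from the paper.
DATABASE = [
    GenealogyEntry("Alder", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}),
    GenealogyEntry("Birch", {"DYS19": 15, "DYS390": 23, "DYS391": 10, "DYS393": 13}),
    GenealogyEntry("Alder", {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13}),
]
RECORDS = [
    PersonRecord("Alder", 1960, "UT"),
    PersonRecord("Alder", 1985, "UT"),
    PersonRecord("Alder", 1960, "CA"),
    PersonRecord("Birch", 1960, "UT"),
]
QUERY = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}

if __name__ == "__main__":
    surnames = candidate_surnames(QUERY, DATABASE)
    print(surnames)
    print(triangulate(surnames, RECORDS, birth_year=1960, state="UT"))
```

The sketch shows why each extra piece of metadata matters: the surname alone leaves three synthetic records, while adding birth year and state leaves one. A realistic analysis would also have to handle mutations between relatives, genotyping error, and surnames shared by unrelated lineages, none of which this toy models.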
Melissa Gymrek
University of California, San Diego
Amy L. McGuire
Broad Institute
David E. Golan
A&G Pharmaceutical (United States)
Science
Massachusetts General Hospital
Baylor College of Medicine
Broad Institute
Gymrek et al. studied this question.
synapsesocial.com/papers/6a07fd30f1d046f829735f63 — DOI: https://doi.org/10.1126/science.1229566