Israeli Scientists Say Public Genealogy Databases Can Be Used to Identify Everybody

“With You, I can rush a barrier; with my God I can scale a wall;” Psalms 18:30 (The Israel Bible™)

Yaniv Erlich, Tal Shor, Itsik Pe’er, and Shai Carmi, who are affiliated with the online genealogy platform MyHeritage, as well as Columbia University, The Hebrew University of Jerusalem, and the New York Genome Center, published a report last Thursday in the journal Science (“Identity inference of genomic data using long-range familial searches”) suggesting that as DNA databases continue to grow, investigators will be able to identify anyone in the US given a sample of their DNA.

Consumer genomics databases have reached the scale of millions of individuals, say the authors of the report, noting that recently, law enforcement authorities have exploited some of these databases to identify suspects via distant familial relatives.

“Using genomic data of 1.28 million individuals tested with consumer genomics, we investigated the power of this technique. We project that about 60% of the searches for individuals of European-descent will result in a third cousin or closer match, which can allow their identification using demographic identifiers,” the report states.

“Moreover, the technique could implicate nearly any US-individual of European-descent in the near future. We demonstrate that the technique can also identify research participants of a public sequencing project. Based on these results, we propose a potential mitigation strategy and policy implications to human subject research,” the report says.
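The link between database size and match rates can be illustrated with a rough back-of-the-envelope sketch. It is not the model used in the report. It assumes that each relative is in the database independently with the same probability, and that every relative in the database is detected. The function name and the figure of 200 relatives are hypothetical, chosen only for illustration.

```python
def match_probability(coverage: float, relatives: int) -> float:
    """Chance that at least one of `relatives` is in a database that
    covers a fraction `coverage` of the population.

    Simplifying assumption (not from the report): each relative is in the
    database independently, and any relative in the database is detected.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    if relatives < 0:
        raise ValueError("relatives must be non-negative")
    # P(at least one match) = 1 - P(no relative is in the database)
    return 1.0 - (1.0 - coverage) ** relatives


# Hypothetical number of third-cousin-or-closer relatives per person.
HYPOTHETICAL_RELATIVES = 200

for coverage in (0.001, 0.005, 0.01, 0.02):
    p = match_probability(coverage, HYPOTHETICAL_RELATIVES)
    print(f"coverage {coverage:.1%}: chance of a match {p:.0%}")
```

Under these assumptions, the chance of a match climbs quickly even when only a small share of the population has been tested, because each person has many distant relatives.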

In one notable case, according to the report, law enforcement used a long-range familial search to trace the Golden State Killer. Investigators generated a genome-wide profile of the perpetrator from a crime scene sample and uploaded the profile to GEDmatch, a database of about 1 million DNA profiles. The GEDmatch search identified a 3rd-degree cousin. Extensive genealogical data then led investigators to the identity of the perpetrator, which was confirmed by a standard DNA test.

Between April and August 2018, at least 13 cases were reportedly solved by long-range familial searches. Most of these investigations focused on cold cases, for which decades of investigation had failed to identify the offender. Nonetheless, one case involved a crime from April 2018, suggesting that some law enforcement agencies have incorporated long-range familial DNA searches into active investigations.

MyHeritage is an online genealogy platform with web, mobile, and software products and services, first developed and popularized in 2003 by the Israeli company of the same name. The company is headquartered in Or Yehuda, Israel, with additional offices in Tel Aviv; Lehi, Utah; Kiev, Ukraine; and Burbank, California.

Users of the platform can create family trees, upload and browse through photos, and search billions of global historical records, among other features. As of 2015, the service supported 42 languages and had around 80 million users worldwide. In January 2017 it was reported that MyHeritage had 35 million family trees on its website.

The report authors warn that “while policymakers and the general public may be in favor of such enhanced forensic capabilities for solving crimes, it relies on databases and services that are open to everyone. Thus, the same technique could also be exploited for harmful purposes, such as re-identification of research subjects from their genetic data.”

Overall, the report suggests that clear policies for law enforcement in using long-range familial searches, and respecting the autonomy of participants in genetic studies, are necessary components for long-term sustainability of the genomics ecosystem.
