
A pervasive Family Tree

Genome mapping created by the Compact Chromosome Browser. Image credit: GEDMatch3

Some 15 years have gone by since the complete sequencing of a human genome (April 14th, 2003), and we now have around 1 million human genomes sequenced. The figure is approximate: many companies are active in the sequencing space, not all genomes are completely sequenced, and there are even different opinions on what is meant by “complete”! Last year, on April 25th, 2018 (DNA Day, which marks the anniversary of the discovery of the double helix in 1953), the Broad Institute announced the sequencing of its 100,000th human genome and the availability of some 70 PB of genomic data.

More and more people are getting their genome (partially) sequenced through companies like TellMeGen, Ancestry, MyHeritageDNA, 24 Genetics, 23andMe … and the results are often shared on open data platforms like GEDMatch.

Now, this gets interesting. These open platforms provide raw data that can be used by a variety of applications to derive meaning.

A recent article on Wired provides an interesting example of this meaning-mining. By looking at the data, an application can create a sort of (genetic) blueprint of a family tree. If your mother, father, brother or sister has shared their genome on that platform, they have also, implicitly, shared (a part of) your genome. This common genetic blueprint is what makes the family tree. Now imagine that the police are investigating a crime scene and they harvest some DNA that might have been left by the perpetrator. That DNA belongs to a family tree, and if anyone in that family has shared their genome, the platform will hold a family tree that matches the perpetrator's. Genome sequencing is quite new, so a family tree today spans only one or two generations, but in the second half of this century we may reasonably expect that most family trees in the world will become available. That means that everybody could be traced to her/his family tree.
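To make the matching idea concrete, here is a minimal Python sketch. It is an illustration only, not how GEDMatch or any investigator actually works: the profile names, the shared fractions and the 2% cut-off are all hypothetical, and real services compare shared DNA segments rather than a single fraction. The only facts it relies on are the textbook average shares of autosomal DNA between relatives (about 50% for a parent, child or full sibling, 25% for a grandparent, 12.5% for a first cousin).

```python
# Average fraction of autosomal DNA shared with a relative (textbook values).
# Real shares vary around these averages, so this is only a rough guide.
EXPECTED_SHARED = {
    "parent, child or full sibling": 0.50,
    "grandparent, aunt/uncle or half sibling": 0.25,
    "first cousin": 0.125,
    "second cousin": 0.03125,
}


def likely_relationship(shared_fraction):
    """Return the relationship whose average share is closest to the observed one."""
    return min(
        EXPECTED_SHARED,
        key=lambda label: abs(EXPECTED_SHARED[label] - shared_fraction),
    )


def rank_matches(shared_by_profile, min_shared=0.02):
    """Rank database profiles by how much DNA they share with an unknown sample.

    shared_by_profile maps a (hypothetical) profile id to the fraction of DNA
    it shares with the sample. Profiles below min_shared (an assumed cut-off)
    are treated as unrelated and dropped.
    """
    hits = [
        (profile_id, fraction, likely_relationship(fraction))
        for profile_id, fraction in shared_by_profile.items()
        if fraction >= min_shared
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical comparison of a crime-scene sample against three shared profiles.
database_hits = {"profile_A": 0.27, "profile_B": 0.11, "profile_C": 0.004}

for profile_id, fraction, relationship in rank_matches(database_hits):
    print(f"{profile_id}: shares {fraction:.1%}, possibly a {relationship}")
```

Even in this toy version, nobody in the database needs to be the perpetrator: two relatives who shared their data are enough to narrow the search to one family.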

As pointed out in the Wired article, this is also generating some privacy concerns that go beyond being associated with a crime. Your family tree may show a higher risk of cardiovascular diseases: would that increase your medical insurance premium (or make you ineligible for medical insurance)?

All kinds of biases may surface as a result of this cross-checking. This is even more of a concern considering that the genotype does not necessarily imply a specific phenotype (i.e. if your father had cardiovascular problems, it does not follow that you will suffer the same problems!).

The Digital Transformation will involve most areas of business and impact our society in ways we are yet to discover. The mass sequencing of the genome will be an important component of that.

About Roberto Saracco

Roberto Saracco fell in love with technology and its implications a long time ago. His background is in math and computer science. Until April 2017 he led the EIT Digital Italian Node and then was head of the Industrial Doctoral School of EIT Digital up to September 2018. Previously, up to December 2011, he was the Director of the Telecom Italia Future Centre in Venice, looking at the interplay of technology evolution, economics and society. At the turn of the century he led a World Bank-Infodev project to stimulate entrepreneurship in Latin America. He is a senior member of IEEE where he leads the New Initiative Committee and co-chairs the Digital Reality Initiative. He is a member of the IEEE in 2050 Ad Hoc Committee. He teaches a Master's course on Technology Forecasting and Market Impact at the University of Trento. He has published over 100 papers in journals and magazines and 14 books.