Personal Digital Twins: Data Sharing

A possible classification of the level of sharing of personal data. Image credit: Juan Luis Herrera et al., University of Extremadura; Hsiao-Yuan Chen et al., University of Texas at Austin; Niko Mäkitalo et al., University of Helsinki

The PDT is a model of a person. Part of the model consists of data, and part of the "interpretation" of those data that manages interactions mirroring the person's behaviour. Over time the set of data will broaden to cover many aspects of the person, and will include the person's "history": the EHR when modelling health aspects, or the evolution of the knowledge space when modelling education and working experience.

These data are clearly crucial to the operation of the PDT (a PDT without data does not exist). At the same time, they may be useful to third parties, including other PDTs interacting with it. Sharing these data can also benefit the person, and in some cases can generate revenue. On the other hand, privacy and ownership issues may require restricting or limiting that sharing.

The EU has in place a number of directives and regulations on the use of data, from the protection of data (GDPR) to the EU Data Act, which aims at making more data available for societal and business benefit (in a way, you can see the GDPR and the Data Act as addressing opposite needs: to protect/hide data on one side and to leverage/share them on the other).

The experts' group on PDTs discussed the need for a specific regulation that could provide a framework steering the development of PDTs and their interoperability, both among themselves and with third-party applications.

The classification shown in the opening image was proposed:

  • Gentrification: no data sharing at all. The owner of the data manages them and uses them only for the purpose for which they were collected in the first place. The owner of the PDT (the person) will, as a matter of fact, delegate the management of data to a service provider with the agreement that those data cannot be used for anything other than the personal benefit of the owner (this includes their use to provide the required services).
  • Accessible: the data harvested by the data manager (the service provider) have to be notified to the person to whom those data refer. As an example, my PDT can be "enriched" by data resulting from the tracking of my activity on the web (search, e-commerce, entertainment), and I should be made aware of the existence of those data.
  • Traceable: (some of) my data are shared with third parties for specific reasons, and I am notified of what data are shared, for what reason, when, and to whom.
  • Tradable: I, as owner, have the possibility to negotiate which data can be shared, for what purpose, and at what "price".
  • Linkable: data can be analysed in context and linked together to generate further data (metadata). This is the result of applying data analytics to various sets/streams of data. The resulting metadata provide a higher level of information about myself: as an example, linking my physiological data (fever, cough, heartbeat, shallow rapid breathing) with my whereabouts, showing potential contacts with contagious people, can lead to an assessment that I am at high risk of having been infected by Covid (see the sketch after this list).
  • Enfranchised: my data (or, more likely, a part of them) are freely available for societal benefit (as an example, I might decide to provide free access to my genome sequence to researchers).
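
To make the "Linkable" level concrete, here is a minimal sketch in Python of how two personal data streams could be joined to derive a higher-level piece of metadata (a coarse Covid risk flag). Everything in it is an illustrative assumption: the record types, field names, and thresholds are hypothetical and do not come from any PDT specification.

```python
# A minimal sketch of the "Linkable" level: combining two personal data
# streams to produce higher-level metadata. All names and thresholds
# below are illustrative assumptions, not an existing API.
from dataclasses import dataclass

@dataclass
class PhysiologySample:
    temperature_c: float     # body temperature
    breaths_per_min: int     # respiratory rate

@dataclass
class ContactEvent:
    place: str
    contact_is_contagious: bool

def covid_risk(samples: list[PhysiologySample],
               contacts: list[ContactEvent]) -> str:
    """Derive a coarse risk flag by linking the two streams."""
    feverish = any(s.temperature_c >= 38.0 for s in samples)
    rapid_breathing = any(s.breaths_per_min > 20 for s in samples)
    exposed = any(c.contact_is_contagious for c in contacts)
    if (feverish or rapid_breathing) and exposed:
        return "high"
    if feverish or rapid_breathing or exposed:
        return "moderate"
    return "low"

print(covid_risk([PhysiologySample(38.4, 24)],
                 [ContactEvent("cafe", True)]))   # -> "high"
```

The point of the sketch is that neither stream is particularly sensitive on its own, while the derived metadata (the risk flag) clearly is; this is why linking deserves its own level in the classification.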

What is presented here is just one possible way of structuring data sharing. It is obvious that some data will be shared more freely while others will be kept private, depending on their meaning and their value to the person. However, a European (more generally, an institutional) framework may impose a certain degree of sharing (access) of specific data by specific parties. In any case, the person who owns those data (in the sense that those data are about her) needs to be notified of the sharing/access and of the intended use.
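
One way such a framework could become operational is by encoding the levels, and the corresponding duty to notify the person, explicitly. The sketch below is my own minimal model under stated assumptions: the enum values mirror the list above, while `may_share_with_third_party` and `must_notify_person` are hypothetical names, not part of any existing regulation or library.

```python
# A minimal sketch encoding the proposed sharing levels and the duty to
# notify the person. Names and rules are hypothetical assumptions.
from enum import IntEnum, auto

class SharingLevel(IntEnum):
    GENTRIFICATION = auto()  # no sharing beyond the original purpose
    ACCESSIBLE = auto()      # person must be told the data exist
    TRACEABLE = auto()       # sharing notified: what, why, when, to whom
    TRADABLE = auto()        # person can negotiate terms and "price"
    LINKABLE = auto()        # data may be combined into metadata
    ENFRANCHISED = auto()    # freely available for societal benefit

def may_share_with_third_party(level: SharingLevel) -> bool:
    """In this sketch, third-party sharing starts at Traceable."""
    return level >= SharingLevel.TRACEABLE

def must_notify_person(level: SharingLevel) -> bool:
    """Notification applies from Accessible up to Linkable:
    Gentrification involves no sharing at all, and Enfranchised is a
    blanket release the person has already agreed to."""
    return SharingLevel.ACCESSIBLE <= level <= SharingLevel.LINKABLE

assert not may_share_with_third_party(SharingLevel.GENTRIFICATION)
assert may_share_with_third_party(SharingLevel.TRADABLE)
assert must_notify_person(SharingLevel.TRACEABLE)
```

Ordering the levels (an IntEnum rather than a flat set of labels) captures the idea that each level subsumes the openness of the previous one; whether the real taxonomy is strictly ordered in this way is itself a design question for the regulation.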

The issue is not easy at all. As an example, my image is picked up by security cameras many times a day. It is also picked up serendipitously by other people who are taking a picture of a monument when I happen to be in the frame… How do we protect these data? How can we be notified that these data "exist"? Notice that the issue is becoming more and more important as image recognition (face recognition) software becomes a commodity: you can search on Google using an image, and it is more and more likely that you will get results showing that same face in other photos. From there it might be quite easy to identify a person…

If I appear in a photo and that photo becomes part of the data of another person's PDT, what kind of rights do I have over it? It might be close to impossible to be aware of my presence (the presence of my data) in other PDTs. Of course this is a much more general problem; the emergence of PDTs and embedded AI will just make it more sensitive and impactful.

About Roberto Saracco

Roberto Saracco fell in love with technology and its implications a long time ago. His background is in math and computer science. Until April 2017 he led the EIT Digital Italian Node, and then was head of the Industrial Doctoral School of EIT Digital up to September 2018. Previously, up to December 2011, he was the Director of the Telecom Italia Future Centre in Venice, looking at the interplay of technology evolution, economics and society. At the turn of the century he led a World Bank-Infodev project to stimulate entrepreneurship in Latin America. He is a senior member of IEEE, where he leads the New Initiative Committee and co-chairs the Digital Reality Initiative. He is a member of the IEEE in 2050 Ad Hoc Committee. He teaches a Master course on Technology Forecasting and Market Impact at the University of Trento. He has published over 100 papers in journals and magazines and 14 books.