AI is great but who is training AI?

The three-body problem has engaged physicists and astronomers since Newton. Calculating the motion of three mutually attracting bodies is a challenging and lengthy process. The graphic shows 13 possible solutions to a three-body problem. Image credit: Milovan Šuvakov and Veljko Dmitrasinović / University of Belgrade

One way to develop artificial intelligence solutions is to train deep neural networks on an existing batch of data (and the related solutions) so that the algorithm can “learn”. Once the algorithm has learnt, it can be applied to new sets of data (in the same domain).

A recent example of this approach comes from researchers at Cambridge University, and is reported in a paper aptly titled: Newton vs the machine: solving the chaotic three-body problem using deep neural networks.

The researchers trained the AI software using the data and results from 9900 calculations performed by Brutus, a number-crunching application that can approximate the solution to three-body problems (determining the motion of three point-mass bodies under only their gravitational interactions) given a specific starting condition. The number crunching requires quite a bit of time, and the more precise you want the approximation to be, the more computation time is required. The problem is chaotic in nature and, at least so far, it has not been possible to find a general “formula” to arrive at a solution. What one does instead is compute the movements of the three bodies over a very short time and distance. (The shorter the time step, the more accurate their positions.) You then iterate, step after step, until you reach the target time.
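To make the step-by-step idea concrete, here is a minimal sketch in Python. It is not Brutus, and it makes several assumptions of my own: units where the gravitational constant is 1, three equal masses, a simple leapfrog (velocity Verlet) scheme, and the widely quoted starting values for the figure-eight orbit. The names `accelerations`, `total_energy` and `integrate` are hypothetical, chosen only for this illustration. The point is just to show that a shorter time step means more iterations and a smaller error.

```python
import numpy as np

G = 1.0  # assumption: units chosen so that the gravitational constant is 1

# Assumption: three equal masses started on the well-known figure-eight orbit.
MASS = np.array([1.0, 1.0, 1.0])
POS0 = np.array([[0.97000436, -0.24308753],
                 [-0.97000436, 0.24308753],
                 [0.0, 0.0]])
VEL0 = np.array([[0.466203685, 0.43236573],
                 [0.466203685, 0.43236573],
                 [-0.93240737, -0.86473146]])

def accelerations(pos, mass):
    """Newtonian gravitational acceleration on each body from the other two."""
    acc = np.zeros_like(pos)
    for i in range(len(mass)):
        for j in range(len(mass)):
            if i != j:
                r = pos[j] - pos[i]
                acc[i] += G * mass[j] * r / np.linalg.norm(r) ** 3
    return acc

def total_energy(pos, vel, mass):
    """Kinetic plus potential energy; it should stay constant along an exact solution."""
    kinetic = 0.5 * np.sum(mass * np.sum(vel ** 2, axis=1))
    potential = 0.0
    for i in range(len(mass)):
        for j in range(i + 1, len(mass)):
            potential -= G * mass[i] * mass[j] / np.linalg.norm(pos[j] - pos[i])
    return kinetic + potential

def integrate(pos, vel, mass, dt, t_end):
    """Advance the system to t_end in many small leapfrog steps of size dt."""
    pos, vel = pos.copy(), vel.copy()
    acc = accelerations(pos, mass)
    for _ in range(int(round(t_end / dt))):
        vel += 0.5 * dt * acc          # half kick
        pos += dt * vel                # drift
        acc = accelerations(pos, mass)
        vel += 0.5 * dt * acc          # second half kick
    return pos, vel

e0 = total_energy(POS0, VEL0, MASS)
for dt in (0.01, 0.001):
    pos, vel = integrate(POS0, VEL0, MASS, dt, t_end=1.0)
    drift = abs(total_energy(pos, vel, MASS) - e0)
    print(f"dt={dt}: {int(round(1.0 / dt))} steps, energy drift {drift:.2e}")
```

Energy drift is used here only as a convenient yardstick for accuracy: the exact motion conserves energy, so the drift shows how far the stepped approximation has wandered. A tenfold smaller step costs ten times as many iterations, which is exactly the trade-off described above.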

The neural network approach is tremendously faster (1 million times faster, according to the Cambridge researchers) because the neural networks recognize patterns rather than calculating small increments step by step. So far, the results show that neural networks can be as accurate as the Brutus number-crunching approach. (The researchers tested the neural networks’ results against those produced by Brutus using new data sets.)
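The workflow (run the slow solver many times, fit a model to its answers, then check the model on new cases) can be sketched on a toy problem. Everything here is a stand-in of my own choosing, not the researchers' setup: a single pendulum replaces the three bodies, a polynomial fit via `np.polyfit` replaces the deep neural network, and `slow_solver` is a hypothetical name for the step-by-step integrator.

```python
import numpy as np

def slow_solver(theta0, t_end=2.0, dt=1e-3):
    """Toy stand-in for the number cruncher: integrate a pendulum
    (theta'' = -sin(theta)), released at rest from angle theta0,
    step by step up to t_end. Accepts an array of starting angles."""
    theta = np.array(theta0, dtype=float)
    omega = np.zeros_like(theta)
    for _ in range(int(round(t_end / dt))):
        omega -= 0.5 * dt * np.sin(theta)
        theta += dt * omega
        omega -= 0.5 * dt * np.sin(theta)
    return theta

# "Training": run the slow solver on known starting angles, then fit a model
# that maps a starting angle directly to the final angle.
train_x = np.linspace(0.1, 2.0, 40)
train_y = slow_solver(train_x)
coeffs = np.polyfit(train_x, train_y, deg=6)  # polynomial in place of a neural network

# "Testing": compare the fitted model with the slow solver on unseen starting angles.
rng = np.random.default_rng(0)
test_x = rng.uniform(0.1, 2.0, size=20)
test_y = slow_solver(test_x)
pred_y = np.polyval(coeffs, test_x)
max_error = np.max(np.abs(pred_y - test_y))
print(f"largest disagreement on unseen cases: {max_error:.2e}")
```

Once fitted, the model answers with a handful of multiplications where the solver needs thousands of steps, and that is where the speed-up comes from. The price is that the model can only be trusted inside the range of starting conditions it was trained on.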

A few do not consider this result outstanding, noting that a real solution to the three-body problem has not been found, and that the researchers simply found a way to reach an approximation in a shorter time (much, much shorter, I have to say). Machiavelli, I suppose, would beg to differ (what matters is the result, not the means).

I would support this latter view, and I appreciate the feat achieved by the Cambridge researchers. However, this is not the reason why I decided to report this news here. I find that the current approach to AI, based on leveraging huge data sets and creating software that can learn from those data sets, is shifting the focus from AI as such to the ownership of (or access to) data sets. This has been a source of concern at several levels, most notably in Europe, since the current landscape sees an aggregation of data in the US (and in China, though those data are mostly confined geographically).

We have also seen concerns about the bias that can be embedded in data sets, resulting in an AI that is itself biased, and so far no convincing solution has been found.

Clearly these two aspects are more important (at least in my view) than the discussion on the validity of an approximation approach versus an algorithmic solution, as in the news on the three-body problem.

In any case, as long as we are limited to using large sets of data to fuel machine learning, we should try to do the best with the data sets we have. This brings me to the observation that IEEE probably has the largest technology data sets in the world and, most importantly, these data sets are peer-reviewed. Notice that when we say “peer-reviewed” we are saying that the overall data already embed a level of human intelligence (the reviewers), and feeding this into a machine learning process means harvesting the scattered intelligence into a more comprehensive meta-intelligence. In other words, IEEE might be in the ideal position to train AI in the area of technology, technology evolution and technology application by feeding the data (articles, conference recordings, courses …) to a machine learning program. This is something that is now being considered (which is a far cry from saying that it is being exploited) in the Digital Reality Initiative, looking at creating Knowledge as a Service out of the IEEE data. We’ll see in the coming years if this idea can be turned into reality. I certainly find it very exciting and worth pursuing, and so I am looking forward to your reactions and comments.

First comments received from Stuart Dambrot are included in the text, and gratefully acknowledged.

About Roberto Saracco

Roberto Saracco fell in love with technology and its implications a long time ago. His background is in math and computer science. Until April 2017 he led the EIT Digital Italian Node and then was head of the Industrial Doctoral School of EIT Digital up to September 2018. Previously, up to December 2011, he was the Director of the Telecom Italia Future Centre in Venice, looking at the interplay of technology evolution, economics and society. At the turn of the century he led a World Bank-Infodev project to stimulate entrepreneurship in Latin America. He is a senior member of IEEE, where he leads the Industry Advisory Board within the Future Directions Committee and co-chairs the Digital Reality Initiative. He teaches a Master’s course on Technology Forecasting and Market impact at the University of Trento. He has published over 100 papers in journals and magazines and 14 books.