Creating images from text

This image was generated by AI from this text prompt: “A photograph of a bird wearing headphones and speaking into a high-end microphone in a recording studio.” Image credit: Google Parti

Natural language understanding through AI is now a reality. Both spoken and written languages can be understood by AI. Google Translate added a further 24 languages in spring 2022, reaching a total of 133. Among the newly added languages are Quechua, Guarani and Aymara, spoken by indigenous peoples in South America. We are pretty close to being able to understand, and be understood by, any person on the planet.

The understanding of images is also making significant progress, and the translation of images into text, or of text into images, is on the horizon. The latter, the conversion of a text description into an image, is now being tackled by several software tools that leverage artificial intelligence. I have already discussed some of them in this blog.

Here is a new one: Parti (Pathways Autoregressive Text-to-Image), by Google, which uses artificial intelligence to create photorealistic images like the one shown above. You can get many more at the link.

Parti uses a model with a huge number of parameters. Results are shown for versions of the model with 350 million, 750 million, 3 billion and 20 billion parameters. As you might expect, the more parameters the model has, the better the result. Look at the different results here.

There are basically two conclusions that can be drawn from this news:

  1. artificial intelligence has reached a level, in text-to-image translation, that is as good as what a human can achieve (in fact, the quality of the photorealistic images produced is well beyond the capability of most humans!)
  2. to achieve this level of results you need a huge model trained on a huge data set, something that only very few companies can have (and afford to have).

The creation of images from text can also be used to generate “art works”. DALL-E 2 is one such example and there are several others, all leveraging GAI, Generative Artificial Intelligence. These “creations” are rooted in the analysis of millions of images created by real artists, and the AI takes hints from those to create “your” masterpiece. It actually is yours, since you have the right to sell it (this is the case with DALL-E 2).

Now the question is whether or not this is fair to the artists who provided the AI with “inspiration”. An interesting discussion, interesting because it shows that AI raises new issues you would never have expected, can be found here.

About Roberto Saracco

Roberto Saracco fell in love with technology and its implications a long time ago. His background is in math and computer science. Until April 2017 he led the EIT Digital Italian Node and then was head of the Industrial Doctoral School of EIT Digital up to September 2018. Previously, up to December 2011, he was the Director of the Telecom Italia Future Centre in Venice, looking at the interplay of technology evolution, economics and society. At the turn of the century he led a World Bank-Infodev project to stimulate entrepreneurship in Latin America. He is a senior member of IEEE, where he leads the New Initiative Committee and co-chairs the Digital Reality Initiative. He is a member of the IEEE in 2050 Ad Hoc Committee. He teaches a Master's course on Technology Forecasting and Market Impact at the University of Trento. He has published over 100 papers in journals and magazines and 14 books.