
Designing proteins using AI

Using AI to generate synthetic proteins. The image renders, very schematically, the molecular shape of a protein being modified by artificial hands provided by AI. Image credit: Pixabay/Yen Strandqvist, Chalmers University of Technology

Proteins are very complex molecules that are at the basis of life. DNA is the instruction set the cell uses to manufacture them; an error in those instructions usually leads to a non-functioning protein, with consequences for our health. Proteins play a role in many infections and diseases, and drugs are often based on proteins. Hence, being able to identify the right protein for a given task is crucial.

So far proteins have been “discovered”. They were already out there, somewhere, and the goal of researchers was to find the right one.

A computer rendering of the Covid-19 spike protein. It is the specific shape of this protein that allows the virus to penetrate the cell membrane and infect us. The protein takes on two different shapes, called conformations: one before it infects a host cell, and another after infection. This structure represents the protein before it infects a cell, called the prefusion conformation. Image credit: University of Texas at Austin

Before going further, it is important to remember that proteins do their job by interacting with other molecules (including other proteins), and that this interaction depends not so much on what composes the protein, i.e. the atoms that form the molecule, as on the “shape” of the molecule. It is a bit like playing with Lego bricks: what matters for connecting two Lego bricks is their shape. You could connect one made of plastic with one made of steel as long as their shapes complement each other and fit together. The same goes for proteins.
Of course, the shape of a protein depends in turn on the atoms it contains and the way they are arranged. This shape is tremendously complex. The figure shows the shape of the Covid-19 spike protein, which has become part of our everyday conversations.

Since it is the shape that matters, and the shape is the result of the way the thousands (even hundreds of thousands) of atoms interact with one another, researchers have to find a protein with a shape that can lock onto another protein, for example, in the case of Covid-19, to block it. In the past this required researchers to search for and test proteins that might have the right shape. In the last few years they have been using software that helps them see the shape of a protein and, most recently, design proteins from scratch to obtain a given shape.

One approach to creating new proteins is to start from an existing one, introduce some essentially random changes, and see what happens to the shape of the protein, hoping that, by chance, one gets the shape one is looking for. This is a very tedious and inefficient approach: it leads to the creation of millions of variants, each of which has to be tested, resulting in several months of effort (and high cost). But that is what we had available.
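The mutate-and-test loop described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function names are made up for this post, and `toy_score` is a hypothetical placeholder for the real, expensive laboratory test of a variant's shape (here it simply counts matches against an invented target sequence).

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def mutate(sequence, rng, n_changes=1):
    """Return a copy of `sequence` with `n_changes` random positions substituted."""
    residues = list(sequence)
    for position in rng.sample(range(len(residues)), n_changes):
        # Pick a replacement that differs from the current residue.
        choices = [aa for aa in AMINO_ACIDS if aa != residues[position]]
        residues[position] = rng.choice(choices)
    return "".join(residues)


def toy_score(sequence, target):
    """Hypothetical stand-in for a lab assay: count positions matching `target`."""
    return sum(a == b for a, b in zip(sequence, target))


def random_screen(start, target, n_variants, seed=0):
    """Generate random variants of `start` and keep the best-scoring one."""
    rng = random.Random(seed)
    best, best_score = start, toy_score(start, target)
    for _ in range(n_variants):
        variant = mutate(start, rng)
        score = toy_score(variant, target)  # in reality: weeks of lab work
        if score > best_score:
            best, best_score = variant, score
    return best, best_score
```

Calling `random_screen("MKTAYIAKQR", "MKTWYIAKQR", n_variants=1000)` tries a thousand single-residue variants of an invented ten-letter sequence. Even in this tiny example most variants make things worse, which hints at why screening millions of real variants is so costly.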

Researchers at Chalmers University of Technology in Sweden have developed an AI-assisted system that leverages past experience to move from a shape designed on a computer to an actual protein in a matter of a few weeks. The AI system creates the protein as a digital copy and then tests it in the digital space, using the same AI technology that is used to create artificial objects in photo editing. One side of the system creates the protein and the other side tries to spot whether that protein is a fake or not. Once the test is passed (meaning that the fake is similar enough to a real protein to fool the checking system), the protein is manufactured. Since its digital copy managed to fool the AI system, it is most likely that the real protein will also be able to fool the cell it is targeting; in other words, that it works.
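The two-sided setup described above, one side producing fakes and the other trying to spot them, is the idea behind generative adversarial networks. The sketch below is not the Chalmers system and does not handle proteins at all: it is a deliberately tiny, assumed example in which the "real" data are plain numbers drawn around 4.0, the generator is a straight line, and the discriminator is a single logistic unit. All names and parameter values are invented for illustration.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, real_mean=4.0, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: fake = a * noise + b
    w, c = 0.0, 0.0  # discriminator: P(real) = sigmoid(w * x + c)
    for _ in range(steps):
        real = rng.gauss(real_mean, 1.0)
        noise = rng.gauss(0.0, 1.0)
        fake = a * noise + b

        # Discriminator step: raise P(real) on the real sample, lower it on the fake.
        # Loss: -log D(real) - log(1 - D(fake)).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        grad_w = -(1.0 - d_real) * real + d_fake * fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: adjust a and b so the discriminator rates the fake as real.
        # Loss: -log D(fake), differentiated with respect to the fake sample.
        d_fake = sigmoid(w * fake + c)
        grad_fake = -(1.0 - d_fake) * w
        a -= lr * grad_fake * noise
        b -= lr * grad_fake
    return {"a": a, "b": b, "w": w, "c": c}
```

Running `train_toy_gan()` returns the four learned parameters. The point of the sketch is the alternation inside the loop: each side is updated against the other, so the fakes are pushed toward whatever the checker currently accepts as real. A protein-generating system would replace the numbers with sequences and the two small formulas with neural networks.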

This result is expected to be applied in a variety of sectors, from healthcare to industrial processes, and may eventually lead to proteins that are better than the ones created through evolution.

About Roberto Saracco

Roberto Saracco fell in love with technology and its implications a long time ago. His background is in math and computer science. Until April 2017 he led the EIT Digital Italian Node and then was head of the Industrial Doctoral School of EIT Digital up to September 2018. Previously, up to December 2011, he was the Director of the Telecom Italia Future Centre in Venice, looking at the interplay of technology evolution, economics and society. At the turn of the century he led a World Bank-Infodev project to stimulate entrepreneurship in Latin America. He is a senior member of IEEE, where he leads the Industry Advisory Board within the Future Directions Committee and co-chairs the Digital Reality Initiative. He teaches a Master's course on Technology Forecasting and Market Impact at the University of Trento. He has published over 100 papers in journals and magazines and 14 books.