
Most important discovery in 50 years: DeepMind's algorithm has learned to determine the structure of a protein

Proteins are an important part of every person's life, yet even in the 21st century, when neural networks paint pictures and 3D printers print full-fledged organs, scientists have not been able to study them fully. In particular, biologists have spent the past 50 years trying to determine the three-dimensional structure of a protein: once you know it, you can find out how the protein interacts with other substances, including drugs. The problem of how a protein folds remained unsolved until the team at DeepMind, the Google division that builds neural networks, decided to apply artificial intelligence to it.

This protein structure was created by an algorithm based on a neural network


  • 1 How to determine the structure of a protein?
  • 2 What is AlphaFold?
  • 3 Why do you need to determine the structure of a protein?
  • 4 How else can AlphaFold 2 be used?

How to determine the structure of a protein?

What makes determining the three-dimensional structure of a protein so hard? Proteins tend to take shape unaided, guided only by the laws of physics. Biologists have long had an idea of how to work that shape out, but everything came down to time. To solve the problem, you need to determine the protein's amino acid sequence and analyze the bonds between the members of that sequence. But a sequence can consist of, say, 101 amino acids, with 100 bonds between them, and each of those bonds can be in one of three possible states.

As a result, the final protein has an incredible number of possible structures: 3 to the hundredth power. Going through them all one by one would take a person many thousands of years at the very least.
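The scale of that search can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the three states per bond come from the text above, while the function names and the rate of one trillion structures checked per second are assumptions chosen purely for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def count_conformations(num_amino_acids: int, states_per_bond: int = 3) -> int:
    """Number of structures if every bond independently takes one of the given states."""
    num_bonds = num_amino_acids - 1  # 101 amino acids -> 100 bonds
    return states_per_bond ** num_bonds


def years_to_enumerate(num_conformations: int, checks_per_second: float) -> float:
    """Time needed to try every structure at a fixed (hypothetical) checking rate."""
    return num_conformations / checks_per_second / SECONDS_PER_YEAR


total = count_conformations(101)
print(f"{total:.3e} possible structures")
# 1e12 checks per second is an assumed rate, not a measured one.
print(f"{years_to_enumerate(total, 1e12):.3e} years at 1e12 checks per second")
```

For 101 amino acids this gives roughly 5 × 10^47 structures, so even at the assumed rate a brute-force search is hopeless.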

Of course, no one has that much time, so for decades scientists tried to solve the problem another way. Nothing worked until AlphaFold, an algorithm that the DeepMind team developed specifically for this purpose.

What is AlphaFold?

DeepMind showed the first version of this algorithm two years ago. AlphaFold proved more accurate than its competitors at predicting the three-dimensional structure of a protein from its “list of ingredients”, the amino acid sequence. It is enough to “feed” the neural network a sequence of amino acids, and its output gives the distances and bond angles between them, from which the protein structure can be reconstructed.
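To get a feel for what a table of distances between amino acids looks like, here is a small sketch that works in the opposite direction: it starts from known positions and computes the distances. The coordinates are made up and the function name is hypothetical; this is not DeepMind's code, and the network itself has to predict such a table from the sequence alone.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def distance_matrix(coords: List[Point]) -> List[List[float]]:
    """Pairwise Euclidean distances between residue positions."""
    return [[math.dist(a, b) for b in coords] for a in coords]


# Made-up 3D positions (in angstroms) for four residues; not real protein data.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

for row in distance_matrix(toy_coords):
    print(" ".join(f"{d:5.2f}" for d in row))
```

The table is symmetric with zeros on the diagonal, since each residue is at zero distance from itself.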

The developers continued to work on the algorithm, and on November 30, 2020, they showed AlphaFold 2, which is even more accurate. The idea is to treat the sequence of amino acids as a graph: its vertices are amino acid residues, and its edges are the connections between them. A neural network with an attention block is then given the task of exploring this graph, taking into account already known similar and evolutionarily related proteins. After that, the algorithm builds the final three-dimensional structure of the protein from the resulting connections.
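The graph view described above can be sketched in plain Python. This only illustrates the data structure, not AlphaFold 2 itself: the function name, the toy sequence and the single long-range contact are all hypothetical, and the attention-based network that actually works out the connections is left out entirely.

```python
from typing import Dict, List, Set, Tuple


def build_residue_graph(sequence: str, contacts: List[Tuple[int, int]]) -> Dict[int, Set[int]]:
    """Adjacency sets: vertex i is residue sequence[i], edges are connections."""
    graph: Dict[int, Set[int]] = {i: set() for i in range(len(sequence))}
    # Neighbours along the chain are always connected.
    for i in range(len(sequence) - 1):
        graph[i].add(i + 1)
        graph[i + 1].add(i)
    # Extra edges for residues that end up close together in the folded protein.
    for i, j in contacts:
        graph[i].add(j)
        graph[j].add(i)
    return graph


# Hypothetical six-residue fragment with one made-up long-range contact.
toy_sequence = "MKTAYI"
toy_graph = build_residue_graph(toy_sequence, contacts=[(0, 5)])

for index, neighbours in toy_graph.items():
    print(index, toy_sequence[index], sorted(neighbours))
```

In this toy example the first and last residues are linked even though they sit at opposite ends of the chain, which is the kind of connection a folding algorithm has to discover.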

Protein structures generated by the DeepMind algorithm

But any neural network needs data to learn from, so the scientists fed it information on the structures of approximately 170,000 proteins. The entire training process took several weeks; compared to the thousands of years discussed at the beginning of this article, this is a real breakthrough. The algorithm was presented at the recent CASP competition, where AlphaFold 2 took first place with 92.4 points out of a possible 100 (the score reflects how many amino acid residues in the protein chain are positioned correctly). The previous version of the algorithm scored a maximum of 60 points.
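CASP reports accuracy with a score called GDT_TS, which averages the percentage of residues that land within 1, 2, 4 and 8 angstroms of their experimentally determined positions. The sketch below is a simplified version of that idea: it assumes the predicted and experimental structures have already been superimposed, whereas the real procedure also searches for the best superposition. The function name and the sample errors are made up.

```python
from typing import List

CUTOFFS_ANGSTROM = (1.0, 2.0, 4.0, 8.0)


def simplified_gdt_ts(deviations: List[float]) -> float:
    """Average, over four cutoffs, of the percentage of residues within each cutoff."""
    if not deviations:
        raise ValueError("need at least one residue")
    percentages = [
        100.0 * sum(d <= cutoff for d in deviations) / len(deviations)
        for cutoff in CUTOFFS_ANGSTROM
    ]
    return sum(percentages) / len(percentages)


# Made-up per-residue errors (in angstroms) between predicted and experimental positions.
toy_deviations = [0.4, 0.9, 1.5, 3.0, 9.0]
print(simplified_gdt_ts(toy_deviations))  # 65.0
```

A perfect prediction scores 100 on this scale, and a prediction in which every residue is more than 8 angstroms off scores 0.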

Accuracy of algorithms for determining the structure of a protein (higher is better)

Why do you need to determine the structure of a protein?

This discovery will make it possible to create new drugs against diseases: knowing the structure, scientists can work out how a protein functions, how it folds and how it interacts with other elements, so that this knowledge can be applied in medicines without difficulty. The structure of a protein also helps explain how diseases spread and affect the human body.

For example, Parkinson's disease develops due to the accumulation of the protein alpha-synuclein in the body: it curls up and forms toxic tangles inside neurons, known as Lewy bodies. These then damage neurons in the brain. However, scientists still do not know exactly where this protein comes from. Understanding the three-dimensional structure of the protein will help answer this question.

The same goes for Alzheimer's disease, which spreads by disrupting communication between neurons, the specialized cells that process and transmit electrical and chemical signals between areas of the brain. This leads to the death of brain cells and the accumulation of two types of protein, amyloid and tau.

The exact interaction between these two proteins is largely unknown. One of the difficulties in diagnosing Alzheimer's disease is that we do not have a reliable and accurate way to measure these protein accumulations in the early stages of the disease.

AlphaFold 2 could help diagnose Alzheimer's disease at an earlier stage and open the way to creating the right medicine.

“This is the most important discovery in the last 50 years,” says John Moult, a biologist at the University of Maryland who co-founded CASP in 1994 with the goal of developing computational methods for accurately predicting protein structures. “In a sense, the problem is solved.”

The ability to accurately predict the structure of proteins from their amino acid sequence would be a huge boon to medicine. It would greatly speed up research into the building blocks of cells and allow new drugs to be discovered faster and more efficiently.


How else can AlphaFold 2 be used?

AlphaFold 2 is unlikely to make laboratories that use experimental methods to determine protein structures unnecessary. But the algorithm has shown that lower-quality, easier-to-collect experimental data may be all that is needed to obtain a good protein structure.

“I thought this problem would not be solved in my lifetime,” says Janet Thornton, a biologist at the European Molecular Biology Laboratory.

She hopes this approach will help shed light on the function of thousands of unknown proteins in the human genome and understand the variations in disease-causing genes that occur in different people.

The creation of AlphaFold 2 also marks a turning point for DeepMind. The company is best known for using AI to master games like Go, but its long-term goal is to develop software that surpasses the capabilities of human intelligence. Solving daunting scientific problems, such as predicting the structure of proteins, is one of the most important things artificial intelligence can do. Just think what will happen next: amazing discoveries await us!