AlphaFold

How AlphaFold Revolutionized Protein Structure Prediction

Quick Summary

  • Proteins are vital to all processes in life.
  • AlphaFold is a deep-learning artificial intelligence system developed by Google DeepMind that predicts a protein’s 3D structure from its amino acid sequence.
  • AlphaFold has not replaced experimental biology, but it can fundamentally change how efficiently that work is done. It points to an exciting future at the intersection of artificial intelligence and biology.

Proteins are vital to all processes in life. Each one is a chain of smaller units called amino acids. The sequence of these amino acids is encoded in DNA, but a protein’s function depends heavily on its 3D structure. The chain folds upon itself with many complex twists, turns and tangles, and even small changes to the resulting structure can dramatically alter the protein’s behaviour. In fact, many human diseases are caused by proteins that fold incorrectly.

Over decades of experimental effort, scientists had determined the structures of around 100,000 unique proteins by 2021, using X-ray crystallography, nuclear magnetic resonance (NMR) and cryo-electron microscopy. While powerful, these methods are expensive and time-consuming: solving a single protein structure can take months to years. As a result, 100,000 is only a small fraction of the hundreds of millions of proteins known to exist.

In nature, proteins can fold and refold in milliseconds. However, a given amino acid sequence could in principle adopt an astronomical number of possible conformations, far more than could ever be tested one by one. This is where computational methods come in.

AlphaFold is a deep-learning artificial intelligence system developed by Google DeepMind that predicts a protein’s 3D structure from its amino acid sequence. Rather than simulating every physical interaction, AlphaFold learns patterns from vast datasets of known protein structures, drawing on evolutionary relationships between related proteins and on geometric and physical constraints. It looks for amino acids that often end up close together in folded structures. It then learns to estimate the distances between pairs of amino acids and the angles between the chemical bonds that connect them. Using this information, it predicts a final structure for the protein sequence. Another module estimates how reliable this proposed structure is. The system then refines its prediction iteratively to reduce errors before settling on the structure it is most confident in.
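To make "the distance between pairs of amino acids" concrete, the sketch below builds a pairwise distance matrix and a contact map from a set of 3D coordinates. It only illustrates the kind of geometric quantity involved and is not AlphaFold's code: the coordinates are made up, each amino acid is reduced to a single point, and the 8 Å contact cutoff and the minimum chain separation of 3 are conventions assumed here.

```python
import math

# Made-up 3D positions (in angstroms), one point per amino acid.
# Real structures have many atoms per residue; this is a toy chain
# that bends back on itself like a hairpin.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]

CONTACT_CUTOFF = 8.0  # angstroms; a common convention, assumed here


def distance_matrix(coords):
    """Return the Euclidean distance between every pair of points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(distances, cutoff=CONTACT_CUTOFF, min_separation=3):
    """Mark pairs that are close in space but far apart along the chain."""
    n = len(distances)
    return [
        [abs(i - j) >= min_separation and distances[i][j] < cutoff for j in range(n)]
        for i in range(n)
    ]


distances = distance_matrix(toy_coords)
contacts = contact_map(distances)

for i in range(len(toy_coords)):
    for j in range(i + 1, len(toy_coords)):
        if contacts[i][j]:
            print(f"residues {i} and {j} are in contact ({distances[i][j]:.2f} A)")
```

In this toy chain, positions 0 and 4 and positions 1 and 4 come out as contacts: they are several steps apart in the sequence but end up next to each other once the chain folds back. Patterns of this kind are what a prediction method has to infer from sequence alone.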

AlphaFold2 made its breakthrough in 2020 at the 14th Critical Assessment of Structure Prediction (CASP14), a biennial double-blind challenge considered the gold standard for assessing the accuracy of protein structure prediction methods. CASP was co-founded by Dr Krzysztof Fidelis of the UC Davis Genome Center and Professor John Moult of the University of Maryland. At CASP14, AlphaFold2 outperformed other methods by a large margin, achieving a median accuracy score of 92.4 out of 100, a level comparable to experimental techniques.
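The accuracy score used at CASP is the Global Distance Test (GDT_TS), which runs from 0 to 100 and averages the percentage of amino acids whose predicted position lies within 1, 2, 4 and 8 Å of the experimental one. The sketch below shows the core idea under a simplifying assumption: the predicted and experimental coordinates are treated as already superimposed, whereas the real measure searches for the best superposition. The function name and the coordinates are invented for illustration.

```python
import math

GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # angstroms


def gdt_ts(predicted, experimental, cutoffs=GDT_TS_CUTOFFS):
    """Simplified GDT_TS for coordinates assumed to be already superimposed."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    deviations = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    # Fraction of residues within each cutoff, then the average of those fractions.
    fractions = [
        sum(d <= cutoff for d in deviations) / len(deviations) for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(fractions)


# Invented example: four residues whose predicted positions are off by
# 0.5, 1.5, 3.0 and 10.0 angstroms respectively.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (21.4, 0.0, 0.0)]

score = gdt_ts(predicted, experimental)
print(f"Simplified GDT_TS: {score:.2f}")
```

Here one residue in four passes the 1 Å test, two pass at 2 Å, and three pass at both 4 Å and 8 Å, giving (25 + 50 + 75 + 75) / 4 = 56.25. A prediction that places nearly every residue within 1 Å of the experimental structure scores close to 100.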

Following CASP14, DeepMind partnered with the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) to launch the AlphaFold Protein Structure Database. Initially covering the human proteome, the database has since expanded to more than 200 million predicted structures, covering nearly all catalogued proteins known to science.

AlphaFold, like other AI models, is not without its limitations. While it achieved unprecedented accuracy for about two-thirds of its CASP14 predictions, we cannot know for certain which of its predictions are accurate until they are compared with experimentally determined structures. AlphaFold2 also struggled more with protein complexes (collections of proteins that work together) and intrinsically disordered proteins (proteins with regions that are dynamic and flexible but still fully functional). AlphaFold 3, released in 2024, extends its predictive power to a broad range of other biomolecules, including DNA, RNA and small-molecule ligands, and to the complexes formed between them. However, it has not yet matched the accuracy it achieves with single proteins.
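On the question of which predictions to trust, AlphaFold does attach its own per-residue confidence estimate, called pLDDT (0 to 100), to every prediction, and the PDB-format files in the AlphaFold Protein Structure Database store it in the B-factor column. The sketch below reads that column and sorts residues into the bands used in the database's colour key. The short PDB fragment and the function names are made up for illustration, and a high pLDDT is a useful hint rather than a substitute for experimental validation.

```python
# A made-up PDB fragment. In AlphaFold database files, the B-factor field
# (columns 61-66 of each ATOM record) holds the pLDDT confidence score.
SAMPLE_PDB = """\
ATOM      1  N   MET A   1      10.000   5.000  -7.000  1.00 35.20
ATOM      2  CA  MET A   1      11.104   6.134  -6.504  1.00 35.20
ATOM      3  CA  LYS A   2      12.560   7.421  -3.205  1.00 62.75
ATOM      4  CA  VAL A   3      15.001   9.876  -1.112  1.00 88.10
ATOM      5  CA  LEU A   4      17.330  11.045   1.664  1.00 96.40
"""


def plddt_per_residue(pdb_text):
    """Return {residue_number: pLDDT}, reading one C-alpha atom per residue."""
    scores = {}
    for line in pdb_text.splitlines():
        # PDB is a fixed-width format: atom name in columns 13-16,
        # residue number in columns 23-26, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Map a pLDDT score to the bands shown in the AlphaFold database."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


scores = plddt_per_residue(SAMPLE_PDB)
for residue, plddt in scores.items():
    print(f"residue {residue}: pLDDT {plddt:.1f} ({confidence_band(plddt)})")
```

Long stretches in the "very low" band often coincide with flexible or disordered regions, which is one practical way to spot the parts of a prediction that deserve the most caution.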

It is also important to note that knowing a protein’s structure does not tell the full story of its function. Proteins with similar folds can have different chemical activity and, conversely, different folds can sometimes carry out similar functions. Additionally, a static picture of a protein’s structure is of limited use when its dynamics are vital to its function. Enzyme-substrate interactions, for example, rely on the protein flexing and adapting its shape.

Even with these caveats, AlphaFold has already established itself as an invaluable tool. Its predictions are useful starting points for refinement with experimental data and have helped accelerate research projects across the world. AlphaFold helped Matthew Higgins’ lab uncover the structure of a critical surface protein on the malaria parasite, pushing forward their work on a malaria vaccine. Using AlphaFold, Berkley Walker’s lab was able to understand a plant enzyme and how it reacts to heat. The lab is now testing plants grown with hybrid enzymes that are more resistant to heat, which could help plants adapt to a warming world. For Zachary Berndsen and Keith Cassidy, AlphaFold made it possible to map apoB100, a large, complex protein that forms the molecular scaffold of “bad” cholesterol, a key risk factor for cardiovascular disease. This knowledge should help researchers better understand how cholesterol becomes harmful and develop ways to prevent and treat heart disease.

AlphaFold has not replaced experimental biology, but it can fundamentally change how efficiently that work is done. It points to an exciting future at the intersection of artificial intelligence and biology. The next generation of scientists cannot study the two in isolation: understanding life is now easier when we teach machines to understand it too.


Sources:

https://alphafold.ebi.ac.uk/ 

https://www.nature.com/articles/s41586-021-03819-2 

https://deepmind.google/science/alphafold/ 

https://www.technologyreview.com/2020/11/30/1012712/deepmind-protein-folding-ai-solved-biology-science-drugs-disease/ 

https://deepmind.google/blog/alphafold-using-ai-for-scientific-discovery-2020/ 

https://www.science.org/content/article/game-has-changed-ai-triumphs-solving-protein-structures   

https://occamstypewriter.org/scurry/2020/12/02/no-deepmind-has-not-solved-protein-folding/  

https://deepmind.google/blog/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology/ 

https://laskerfoundation.org/winners/alphafold-a-technology-for-predicting-protein-structures/  

https://genomecenter.ucdavis.edu/blog/uc-davis-genome-center-spotlight-casp-and-nobel-prize-winning-breakthrough-alphafold  

https://www.bbc.com/news/science-environment-55133972  

https://moalquraishi.wordpress.com/2020/12/08/alphafold2-casp14-it-feels-like-ones-child-has-left-home

https://www.chemistryworld.com/opinion/behind-the-screens-of-alphafold/4012867.article 

https://www.science.org/content/blog-post/alphafold-3-debuts   

https://www.nature.com/articles/s41586-024-07487-w  

https://deepmind.google/blog/engineering-more-resilient-crops-for-a-warming-climate/  

https://deepmind.google/blog/stopping-malaria-in-its-tracks/ 

https://deepmind.google/blog/revealing-a-key-protein-behind-heart-disease/ 
