Three-dimensional (3D) human protein structures have been predicted using artificial intelligence (AI).
A new AI tool called AlphaFold has predicted the 3D structure of more than 98 percent of all human proteins. By predicting nearly the entire human proteome (the complete set of proteins expressed by an organism), along with the proteomes of 20 additional model organisms, researchers from DeepMind in London have provided more than 350,000 protein structures that are publicly available in a database maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), in Cambridge.
'This will be one of the most important datasets since the mapping of the human genome,' said EMBL-EBI's director Dr Ewan Birney.
Proteins are formed when long chains of amino acids fold, origami-like, into irregularly shaped structures. Understanding 3D protein structures can shed light on a range of biological processes and support drug development, for example by allowing therapeutics or diagnostics to target binding sites more accurately.
Previous approaches to understanding protein folding required precise sample preparation and were often expensive and time-consuming, relying on trial and error that could mean years of work. The AlphaFold tool uses algorithms that have been developed over years of fine-tuning to accurately predict the structure of proteins in minutes.
'We want to give experimentalists and biologists a really clear signal of which parts of the predictions they should rely on,' said Dr Kathryn Tunyasuvunakool from DeepMind, who was first author of the paper published in Nature.
Notably, AlphaFold's predictions of protein structures vary in accuracy. This means that further work will be required to verify the structures, although the software is able to rank how confident it is in each prediction. Of the 98.5 percent of human proteins mapped, 58 percent were categorised as 'confident' predictions, of which a subset of 36 percent had 'high confidence'.
Such predictions may accelerate discovery by pointing scientists in the right direction, such as towards a specific binding site on a protein. This may enable exploration into personalised medicine, making foods more nutritious or even developing enzymes that can digest plastic.
'The applications are actually limited only by our imagination – the database will increase our understanding of how proteins function... we can be better equipped to unravel the molecular mechanisms of life and accelerate our pursuits to protect and treat human health, as well as the health of our planet,' said Professor Edith Heard, the director-general of the EMBL.