A novel artificial intelligence (AI) tool has been developed to predict the molecular and regulatory effects of genetic variants relevant to health and disease.
The human genome consists of around three billion DNA base pairs, with about two percent of the genome encoding proteins. The remaining 98 percent of the genome is noncoding DNA, which includes sections vital for regulating when, where, and how different genes are expressed. Now, scientists at Google's DeepMind have developed AlphaGenome, an AI tool that has been trained to predict the effects of specific genetic mutations, including those within noncoding DNA, by analysing DNA sequences up to a million base pairs long.
'Ever since the human genome was sequenced, people have been trying to understand the semantics of it – this has been a longstanding goal for DeepMind,' Dr Pushmeet Kohli, head of AI for science at DeepMind and a coauthor of the new study published in Nature, told Scientific American. 'It's like you have a huge book of three billion characters and something wrong happened in this book... AlphaGenome can be used to say, "If you change these words, what would be the effect?"'
AlphaGenome was trained using genetic databases containing human and mouse genomes. It compares the effects of mutated DNA with those of unedited DNA: by altering a single base pair in a DNA sequence, AlphaGenome can predict how the molecular properties of the genome involved in gene expression are affected by that mutation.
Previous AI models were limited by a compromise between the maximum length of genetic sequence they could analyse and the resolution, down to a single base pair, at which they could analyse it. The authors explain the importance of analysing long genetic sequences: a change to a single DNA base in one location could affect a far-removed section of the genome, even up to around a million bases away. This is because DNA has a three-dimensional structure; as a result, some points in a genetic sequence that are millions of bases apart may lie close together in space.
'DeepMind's AlphaGenome represents a major milestone in the field of genomic AI. This level of resolution, particularly for noncoding DNA, is a breakthrough that moves the technology from theoretical interest to practical utility, allowing scientists to programmatically study and simulate the genetic roots of complex disease,' said Dr Robert Goldstone, head of genomics at the Francis Crick Institute, London, who was not involved in the study.
It is thought that certain diseases, such as heart disease, mental health disorders and many cancers, are partly caused by genetic mutations in noncoding DNA. The team hope that predicting the effects of specific mutations and their role in the development of certain diseases could lead to more targeted and effective treatments.
The authors conclude: 'AlphaGenome provides a powerful and unified model for analysing the regulatory genome. It advances our ability to predict molecular functions and variant effects from DNA, offering valuable tools for biological discovery and enabling applications in biotechnology'.
DeepMind previously developed AlphaFold, a tool that predicts a protein's shape from its amino acid sequence (see BioNews 1105), which later led to DeepMind cofounder Sir Dr Demis Hassabis being jointly awarded the 2024 Nobel Prize in Chemistry (see BioNews 1260). AlphaFold has been used to show how vertebrate sperm and egg join (see BioNews 1261).
Sources and References
- AlphaGenome: AI for better understanding the genome
- Advancing regulatory variant effect prediction with AlphaGenome
- AI model from Google's DeepMind reads recipe for life in DNA
- Google DeepMind unleashes new AI to investigate DNA's 'dark matter'
- Google DeepMind launches AI tool to help identify genetic drivers of disease
- Designer DNA could revolutionise cures for diseases


