An artificial intelligence (AI) model has been developed that is capable of decoding genomes and designing synthetic DNA, RNA and proteins.
This deep learning model, named Evo, was trained on approximately 2.7 million microbial genomes, including those of prokaryotes and the phages that infect them. As a safety measure, no viruses or bacteria that could pose a threat to human health were included in the training data, because of concerns that Evo could be misused to create a dangerous pathogen.
'Evo deciphers the patterns written into DNA over billions of years of evolution, breaking new ground in our ability to understand and engineer biology. Just as generative AI has revolutionised how we work with text, audio, and video, these same creative capabilities can now be applied to life's fundamental codes,' said Dr Patrick Hsu from the Arc Institute, California, and corresponding author of the paper published in Science.
Previous biological models relied on large language models that treat DNA bases as words in a language's grammar. Evo was instead trained on larger datasets consisting of multiple genomes, which allowed it to compare the genomes of related microbes and learn the differences arising from evolutionary change. Notably, Evo works at the resolution of single nucleotides, whereas earlier models concentrated on three-nucleotide sections known as codons. Higher resolution typically demands longer computational processing times, but Evo's architecture compressed the data efficiently.
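To make the difference in resolution concrete, here is a minimal sketch of the two ways a sequence can be split before a model reads it. It is an illustration only, not Evo's actual preprocessing; the function names and the example sequences are hypothetical.

```python
def nucleotide_tokens(seq: str) -> list[str]:
    """Split a DNA sequence into single-base tokens."""
    return list(seq.upper())


def codon_tokens(seq: str) -> list[str]:
    """Split a DNA sequence into non-overlapping three-base codons.

    Trailing bases that do not fill a whole codon are dropped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def differing_positions(a: list[str], b: list[str]) -> list[int]:
    """Return the token indices at which two token lists disagree."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Hypothetical example: one base changes (C -> T) at 0-based position 5.
reference = "ATGGCCAAA"
variant = "ATGGCTAAA"

base_level = differing_positions(nucleotide_tokens(reference),
                                 nucleotide_tokens(variant))
codon_level = differing_positions(codon_tokens(reference),
                                  codon_tokens(variant))

print(base_level)   # index of the changed base
print(codon_level)  # index of the codon that contains the change
```

At single-base resolution the change is located exactly, while the codon view only reports which three-base block was affected. The cost is that the same sequence becomes three times as many tokens.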
To evaluate Evo's accuracy, researchers conducted a 'zero-shot evaluation', asking Evo to analyse and predict the effects of modifications to previously unseen DNA sequences, without any task-specific training. In just minutes, Evo accurately predicted changes to the proteins encoded by the DNA, a task that took years to complete through separate experimental analyses.
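One common way to score a mutation with a sequence model, without training it on that task, is to compare how likely the model finds the mutated sequence relative to the original. The sketch below assumes that approach and uses a toy first-order Markov model as a stand-in. It is not Evo, and the training sequences, function names and scores are purely illustrative.

```python
import math
from collections import Counter

BASES = "ACGT"


def fit_markov(sequences, pseudocount=1.0):
    """Estimate P(next base | previous base) by counting, with smoothing."""
    counts = Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[(prev, nxt)] += 1
    probs = {}
    for prev in BASES:
        total = sum(counts[(prev, b)] for b in BASES) + pseudocount * len(BASES)
        for nxt in BASES:
            probs[(prev, nxt)] = (counts[(prev, nxt)] + pseudocount) / total
    return probs


def log_likelihood(seq, probs):
    """Sum of log transition probabilities along the sequence."""
    return sum(math.log(probs[(p, n)]) for p, n in zip(seq, seq[1:]))


def zero_shot_score(reference, variant, probs):
    """Log-likelihood ratio: negative means the variant looks less natural."""
    return log_likelihood(variant, probs) - log_likelihood(reference, probs)


# Toy "training genomes" (hypothetical) dominated by a repeating ATG pattern.
model = fit_markov(["ATGATGATGATG", "ATGATGATG"])

reference = "ATGATGATG"
variant = "ATGCTGATG"  # one base changed (A -> C) at 0-based position 3

score = zero_shot_score(reference, variant, model)
print(f"score = {score:.3f}")
```

A negative score means the stand-in model considers the variant less typical of what it has seen than the reference. In a real evaluation, scores like this would be compared against experimental measurements.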
Additionally, Evo was used to design a synthetic CRISPR genome editing system that can pinpoint and cut DNA at specific locations. This involved generating both the DNA sequences encoding the proteins that make the cuts and the RNA that guides these proteins to the target DNA. The synthetic CRISPR system was found to be as active as a naturally occurring counterpart, even though it shared only 73 percent of its sequence with it. The authors suggest that AI could discover CRISPR genome editing systems that are more effective still.
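A similarity figure like the one above is usually a percent identity: the share of aligned positions at which two sequences carry the same letter. The article does not say how the 73 percent figure was calculated, so the following is only a minimal sketch of the general idea. It assumes two sequences that are already aligned and of equal length, and the function name and example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


# Hypothetical 10-letter sequences that differ at two positions.
natural = "ATGGCCAAAT"
designed = "ATGGCTAAGT"

print(percent_identity(natural, designed))
```

Real comparisons first align the sequences with a dedicated tool, and the result depends on how gaps are counted, so this simple ratio should be read as an approximation of the concept rather than the authors' method.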
Evo can generate synthetic DNA strands of up to one million bases in length, and the researchers now aim to scale Evo up to study more complex biological systems.
Dr Christina Theodoris, from the Gladstone Institutes and the University of California, San Francisco, who was not involved in the research, suggests in a perspective article in Science that Evo could have major implications for humans.
'The ability to predict the effects of mutations across all layers of regulation in the cell and to design DNA sequences to manipulate cell function would have tremendous diagnostic and therapeutic implications for disease,' commented Dr Theodoris.
Sources and References
- Evo: Creating generative AI for genomes
- Sequence modeling and design from molecular to genome scale with Evo
- Meet Evo, the DNA-trained AI that creates genomes from scratch
- Learning the language of DNA
- Evo AI model decodes and engineers genetic sequences, acting as biological 'Rosetta Stone'
- Evo - an AI-based model for deciphering and designing genetic sequences