Google has joined forces with the Californian start-up DNAnexus to maintain a public DNA database online. The move follows an announcement by the US National Institutes of Health (NIH) that it may have to withdraw support for the current public database, the Sequence Read Archive (SRA), because of budget cuts.
The SRA contains an archive of short sections of genetic data decoded by sequencing machines. But despite being an invaluable resource for scientists to better understand how genetics affect health, some researchers have deemed the database difficult to use and badly organised.
'As a public repository of unique DNA sequencing data, the SRA has been an invaluable resource to the research community. However, the ever increasing size of datasets being submitted and the need to easily integrate them into downstream analyses has tested the limits of its utility', said Dr Richard Myers, Director and Investigator of the HudsonAlpha Institute for Biotechnology, Alabama.
Although the NIH has recently changed its approach and indicated it may continue funding the SRA database, DNAnexus chief executive Andreas Sundquist said the company wanted to make sure 'we had a Plan B'.
The new site, which will be hosted on Google Cloud Storage, will mirror the SRA but aims to offer a longer-term solution to DNA storage, while also making the facility simpler to use. Unlike the SRA, which is free to use, the DNAnexus database will, after 30 days of usage, charge $10 per gigabyte of data downloaded. Improved browsing options and search facilities are also features of the new site.
Sundquist hopes DNAnexus' more accessible system will nevertheless 'help to ensure that scientists can easily access an archive of genomics information in a hosted environment that allows them to focus on science, not software'.
'DNA sequencing becomes 10 times cheaper every 18 months thanks to hardware improvements', he said, adding he believes that genetic analysis will become as commonplace as routine laboratory tests. Sundquist founded the company two years ago whilst working on a PhD in computer science.
Krishna Yeshwant, a partner at Google Ventures and board member of DNAnexus, further explained that 'the decreased cost of gene sequencing is making it possible for genomics to move out of the research lab and into clinical settings'.
Just last year it cost around $30,000 to sequence one person's DNA. This year that figure has decreased considerably to around $4,000. Managing the vast amounts of data produced, however, is still proving expensive. DNAnexus believes it will be well equipped to handle data on this scale.
The information that DNAnexus has deposited in Google Cloud Storage is, according to company officials, already one of the largest data sets in Google's history at several hundred terabytes.