Launched by the National Cancer Institute (NCI), the Genomic Data Commons (GDC) is a unified data repository, freely accessible to all researchers. Users will be able to download data for use in research, and will also be encouraged to upload their own findings to share with others.
Announcing the launch at the American Society of Clinical Oncology meeting in Chicago, US Vice President Joe Biden said: 'It is our hope that the Genomic Data Commons will prove pivotal in advancing precision medicine.'
'More than any other specialty, oncologists have to explore the unknown with their patients. No single oncologist or cancer researcher can find the answers on their own,' he said. 'It requires open data, open collaboration, and above all open minds.'
The sheer size of genomic datasets has been a major hurdle in implementing such an initiative to date, but the GDC attempts to overcome this issue. Using software algorithms, the GDC aims to both centralise and standardise data from large-scale NCI programs, such as The Cancer Genome Atlas and TARGET. These datasets alone comprise more than two petabytes of data – one petabyte is equivalent to 223,000 DVDs filled to capacity with data.
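The DVD comparison can be checked with some quick arithmetic. The sketch below is illustrative only and rests on two assumptions that the article does not state: that a DVD is a single-layer disc holding 4.7 GB, and that a petabyte is counted as 1024 × 1024 GB. Under those assumptions the result comes out at roughly 223,000 discs per petabyte.

```python
# Assumptions (not stated in the article):
#   - a single-layer DVD holds 4.7 GB
#   - one petabyte is counted as 1024 * 1024 GB
DVD_CAPACITY_GB = 4.7
GB_PER_PETABYTE = 1024 * 1024


def dvds_needed(petabytes: float) -> float:
    """Return how many full DVDs are needed to hold the given number of petabytes."""
    return petabytes * GB_PER_PETABYTE / DVD_CAPACITY_GB


one_pb = dvds_needed(1)
two_pb = dvds_needed(2)

print(f"1 PB is about {one_pb:,.0f} DVDs")
print(f"2 PB is about {two_pb:,.0f} DVDs")
```

On the same assumptions, the two petabytes cited for The Cancer Genome Atlas and TARGET would fill roughly twice that number of discs.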
'These datasets will lead to a much deeper understanding of which therapies are most effective for individual cancer patients. With each new addition, the GDC will evolve into a smarter, more comprehensive knowledge system that will foster important discoveries in cancer research and increase the success of cancer treatment for patients,' said Dr Louis Staudt, Director of the NCI Center for Cancer Genomics.
Currently, the data is stored and managed on a private cloud network at the University of Chicago in collaboration with the Ontario Institute for Cancer Research, all under an NCI contract with Leidos Biomedical Research.
The data will be available for further analysis and there will be safeguards in place to ensure data protection. Users' authorisation to access the data will be checked and, while some data will be publicly accessible, other types of data will be under controlled access.
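The split between public and controlled data can be pictured with a short sketch. This is a hypothetical model, not GDC software: the `DataFile` class, the `can_download` function, the file names and the user name are all invented for illustration, and the real authorisation process is not described in this article.

```python
from dataclasses import dataclass
from typing import Optional, Set


# Hypothetical illustration only: none of these names come from the GDC.
@dataclass(frozen=True)
class DataFile:
    name: str
    access: str  # "open" (publicly accessible) or "controlled" (authorisation required)


def can_download(data_file: DataFile, user: Optional[str], authorised_users: Set[str]) -> bool:
    """Return True if the given user (or an anonymous visitor, user=None) may download the file."""
    if data_file.access == "open":
        # Open data is available to anyone, signed in or not.
        return True
    if data_file.access == "controlled":
        # Controlled data requires a known user whose authorisation has been checked.
        return user is not None and user in authorised_users
    raise ValueError(f"Unknown access level: {data_file.access!r}")


summary = DataFile("mutation_summary.tsv", "open")
reads = DataFile("aligned_reads.bam", "controlled")
authorised = {"approved_researcher"}

print(can_download(summary, None, authorised))
print(can_download(reads, None, authorised))
print(can_download(reads, "approved_researcher", authorised))
```

In this toy model an anonymous visitor can fetch the open file but not the controlled one, while the approved researcher can fetch both.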
The GDC also operates under the NIH's genomics data sharing policy – for researchers, this means that data from any study conducted under the auspices of the NIH must be shared within six months of the study's completion.
The initiative is a central part of President Barack Obama's $1 billion National Cancer Moonshot Initiative, with funding for the GDC coming from the $70 million allocated to the NCI for the Precision Medicine Initiative for Oncology.
Dr Allison Heath, GDC director of research, said: 'The initial phase is to get the data harmonised and get that released, and then over the next year or two, we'll have multiple phases where we'll have other things that it has been built to do. But we need to make sure we test and work to make sure we support the community.'