Researchers Seek Funds for Genomic Data Sets in Cloud

Prominent researchers are calling on government agencies to fund a global genomic data “commons” in the cloud to serve as a worldwide research resource driving development of next-generation treatments.


Without support for this kind of cloud infrastructure, they say thousands of researchers around the world will continue wasting time and money independently transferring data from repositories to the cloud of their choice. However, by taking “full advantage of the possibilities that cloud computing offers” with such a data commons, researchers from Canada, Europe and the U.S. writing in the latest issue of the journal Nature argue that “authorized scientists would be able to tap easily and cheaply into a global commons as and when they need to.”

A typical university connection can take months to download data sets from major international projects such as the International Cancer Genome Consortium, and the hardware needed to store and process those data can also prove quite expensive, the article states. With cloud computing, however, a data set from a big genome project can be analyzed in days, at a fraction of the price.

“Moving data and analysis tools to the cloud will democratize access to data and to the computational resources required to analyze that data,” said Gad Getz, director of the Cancer Genome Computational Analysis Group at the Broad Institute of MIT and Harvard, and a co-author of the article. “The expanded access will accelerate tool development, grow the population of researchers analyzing these rich data sets and ultimately increase the pace of scientific discovery. These cloud-based analysis platforms will also enable the testing of new distributed computing paradigms which expand both the scale of the analyses and the sophistication of the computational algorithms.”

Late last month, the Broad Institute, a biomedical and genomic research center in Cambridge, Mass., announced a partnership with Google Genomics to use Google's cloud-computing platform to store, analyze and share data, which Getz says will serve as a pilot for these cloud-based capabilities.

To take full advantage of the benefits of cloud computing, the authors are urging the National Institutes of Health and other major funding agencies to pay for the storage of major genomic data sets in the most popular cloud services, such as those of Amazon, Google and Microsoft. They point out that opposition to storing vast scientific data sets on cloud-computing platforms has traditionally been based on security concerns. Yet the authors assert that storing data in the cloud has been shown to be as secure as, if not more secure than, storing it locally.

For its part, NIH earlier this year lifted its 2007 ban on using cloud computing to store and analyze the tens of thousands of genomes and other genetic information held in its own databases. And, the article points out that NIH’s National Cancer Institute has several pilot projects “exploring the practicalities of sharing and analyzing genomic data on clouds” while “NIH and other funding agencies are already discussing a variety of ‘biomedical commons’ concepts, which incorporate several of the ideas proposed here.”
