FDA Taps into Big Data through Cloud

The U.S. Food and Drug Administration is embracing cloud computing to handle the enormous amounts of data the regulatory agency receives from providers, vendors, researchers and others.


In 2014 alone, FDA expects to receive between 1.5 million and 2 million submissions through its eSubmission Gateway, the central transmission point for sending information electronically to the agency, with some reports as large as a terabyte.

“These data sets are not only larger than ever before, they are also arriving more frequently than ever and varying enormously in format and quality,” says Taha A. Kass-Hout, M.D., chief health informatics officer and director of FDA’s Office of Informatics and Technology Innovation.

However, Kass-Hout views this data onslaught as both a challenge and an opportunity. “To meet both, we are building an innovative technology environment that can handle vast amounts of data and provide powerful tools to identify and extract the information we need to collect, store and analyze,” he says.

Cloud computing is helping FDA, through partnerships with state and local health organizations, to identify thousands of foodborne pathogen contaminants each year. The agency sequences, stores and analyzes this data to understand, locate, and contain life-threatening outbreaks around the country.

Earlier this month, FDA launched openFDA, a new initiative to make it easier for web developers, researchers, and the public to access public health datasets collected by the agency. In keeping with a presidential executive order to make government data more accessible, openFDA offers data in a structured, computer-readable format so that users can quickly search, query, or pull massive amounts of public information from FDA datasets on an as-needed basis.

“OpenFDA is beginning with an initial pilot program involving the millions of reports of drug adverse events and medication errors that have been submitted to the FDA from 2004 to 2013 and will later be expanded to include the agency’s databases on product recalls and product labeling,” says Kass-Hout.
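
For developers curious what such a query might look like, the following is a minimal sketch that only builds a request URL and does not contact the service. The endpoint address, the field names (`patient.drug.medicinalproduct`, `receivedate`) and the helper `build_event_query` are assumptions for illustration and are not taken from this article; the date range mirrors the 2004 to 2013 span of the pilot.

```python
from urllib.parse import urlencode

# Assumed address of the drug adverse event endpoint (not stated in the article).
BASE_URL = "https://api.fda.gov/drug/event.json"


def build_event_query(drug_name, start="20040101", end="20131231", limit=5):
    """Build a query URL for adverse event reports mentioning a drug.

    Hypothetical helper: the search field names are assumptions.
    """
    # Match reports that name the drug and were received within the date range.
    search = (
        f'patient.drug.medicinalproduct:"{drug_name}" '
        f"AND receivedate:[{start} TO {end}]"
    )
    # urlencode percent-escapes quotes, colons and brackets, and turns spaces into '+'.
    return f"{BASE_URL}?{urlencode({'search': search, 'limit': limit})}"


url = build_event_query("aspirin")
print(url)
```

The resulting URL could then be fetched with any HTTP client, which would return the matching reports in a structured format.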