Project Data Sphere enables wide use of clinical trial data
Despite the billions of private and federal dollars spent on cancer research each year, it is estimated that nearly one in every three people will hear a cancer diagnosis in their lifetime.
Project Data Sphere, a digital library-laboratory platform, collects data from clinical trials and makes it available to the public for secondary analysis in the hopes of finding a cure.
However, the data comes from an unlikely partner—big pharma.
“There are some companies out there that see this as they’re giving away their intellectual property and, to an extent, they are,” says Matt Gross, director of the SAS Health Care and Life Sciences Global Practice. “If nothing else, they’re giving away literally billions of dollars of time, money and effort in the collection of this information and making it available to the public.”
Gross adds that this willingness amongst competitors to share proprietary information for the greater good reflects an evolution in healthcare. But it took a lot of work to even get representatives from competing companies in the same room.
Martin J. Murphy, founding chief information officer of the CEO Roundtable on Cancer, which funds Project Data Sphere, says pharmaceutical companies were initially worried about counterintelligence if they agreed to meet.
“There was a lot of fear, but not a lot of basis for it,” he says. “You could see people were getting uncomfortable. ‘We never shared that data. Why should we start?’ ”
It took a Department of Justice representative sitting in on a meeting with the CEO Roundtable on Cancer for the pharmaceutical companies to realize “certain things can and should be done together,” Murphy says.
The nonprofit launched Project Data Sphere in April 2014 and partnered with SAS to create an analytics toolkit. Third-party researchers, along with the general public, can log in to the platform and look through completed clinical trials for research that goes beyond the pharmaceutical companies’ original purpose in conducting the trials.
There are currently more than 900 authorized users on the platform with access to 49 data sets representing 27,000 patient lives across a broad array of cancer tumor areas. The data has been downloaded for research purposes more than 2,000 times since the platform’s launch.
“The goal here was really to find a way by which the data from these individual patient data sets could be put into a single place that people could easily find them and then use them as they saw fit,” Gross says.
Project Data Sphere does face a common issue in the data-sharing realm: standardizing data sets. Data standards, meaning the uniform input of data, allow clinical trial data to move between researchers, doctors and insurers without time or resources being wasted on deciphering it.
Jyotishman Pathak, division chief of health informatics at Weill Cornell Medical College, says he sees a shift toward adherence to data standards among clinical trial researchers, who historically had not used them.
“It has been a painful process for some researchers,” Pathak says, “but I think the Affordable Care Act has done huge benefit in terms of mandating the use of standards broadly across the system.”
Data from clinical trials conducted before 2010, when the Affordable Care Act was signed into law, might not have employed the same data standards as data from trials conducted after. Gross says standardizing the data from a given clinical trial is up to the individual company that funded it.
Project Data Sphere acts as a curator, Murphy says: the data is de-identified, reformatted and transformed into “pure data compared to big, raw data.” “Everything in those data sets is transparent,” he adds.
Both SAS and the CEO Roundtable on Cancer are working to build out the database to include more research. Last year, the CEO Roundtable on Cancer also launched the Prostate Cancer DREAM Challenge, which focused on creating new models for the diagnosis and treatment of prostate cancer, a disease that will affect about 14 percent of men, according to the National Cancer Institute.
“These initiatives and these efforts are coming from a variety of different sectors of the healthcare and clinical research world,” Gross says. “It’s not just a standard or a subgroup, but it really is how to help inform data sharing by creating environments where these groups can come together and say, ‘These are the standards we’d like to promote.’ ”