Big data is quickly becoming an essential part of modern medical research. While claiming that big data can “cure” cancer or other serious diseases may be a bit of an overstep, big data is a vital part of the research and discovery process, and recent advances will make it more important in the years ahead.

In fact, McKinsey estimates that big data will create $100 billion worth of value in the U.S. healthcare system alone.

Researchers and scientists have long desired the ability to crunch and analyze huge data sets. Indeed, this desire was one of the driving forces behind the development of computer technology. Until recently, however, even the most powerful computers have been hamstrung when it comes to dealing with truly massive amounts of data.

Consider that the first effort to map the human genome took thirteen years. Now, with performance clusters, mapping a genome can take mere days. Better yet, with powerful analytical solutions, a genome can be mapped in just hours.

In the medical field, big data is everywhere. In a sense, every living organism is a collection of “big data.” DNA by itself is a collection of data, a blueprint for life. Complex systems, such as the nervous system or respiratory system, can generate huge amounts of data. The way diseases spread through a population is yet another example of big data.

Being able to analyze, understand, and present this data could yield huge breakthroughs in medical research. Up until now, researchers have been forced to home in on small data sets, potentially missing important correlations and causal relationships present in bigger ones. With advances in big data analytics, researchers can identify patterns that would otherwise be lost in the “noise.”

In 2014, the world was in the grip of fear as the Ebola virus ravaged West Africa and made its way to the U.S., the United Kingdom and elsewhere. Before that, bird flu, SARS and various other diseases had caused huge scares across the globe. Every year seems to bring a new epidemic threatening to spread across the world.

Unsurprisingly, epidemics have been the subject of numerous movies and books. Epidemics receive a lot of attention for good reason: in the 14th century, the “Black Plague” wiped out as much as half of Europe's population. The Spanish Flu, which ravaged the world in the final year of World War I and its aftermath, killed an estimated 20 million to 40 million people, more than the Great War itself.

Epidemics represent a prime example of big data. How fast is the disease spreading, as captured by measures such as the basic reproduction number R0 (“R naught”), the average number of people each infected person goes on to infect? How might weather influence factors like social interaction? What role do public transportation systems play? There are so many factors to consider that many health experts build entire careers in epidemiology, the study of how diseases spread.
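
To make the role of R naught concrete, here is a minimal sketch of a textbook SIR (susceptible, infected, recovered) model. It is an illustration only, not a model used by any project mentioned in this article, and the function names, population size and parameter values are all assumptions chosen for the example.

```python
def simulate_sir(population, initial_infected, r0, infectious_days, days):
    """Run a simple day-by-day SIR model and return (S, I, R) for each day.

    r0 is the basic reproduction number; infectious_days is the assumed
    average time a person stays infectious. Both are illustrative inputs.
    """
    gamma = 1.0 / infectious_days  # daily recovery rate
    beta = r0 * gamma              # daily transmission rate implied by R0
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # New infections depend on contact between susceptible and infected people.
        new_infections = min(beta * s * i / population, s)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def peak_infected(history):
    """Largest number of people infected at the same time."""
    return max(i for _, i, _ in history)


if __name__ == "__main__":
    # Hypothetical scenario: one million people, ten initial cases.
    for r0 in (0.8, 2.0, 3.0):
        run = simulate_sir(1_000_000, 10, r0, infectious_days=10, days=365)
        print(f"R0={r0}: peak infected = {peak_infected(run):,.0f}")
```

In this simplified model an outbreak only grows when R0 is above 1, which is one reason the number gets so much attention. Real epidemiological models add many of the factors listed above, such as travel and weather.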

Now, big data is being used to tackle this truly big problem. During the Ebola outbreak, for example, researchers launched a project called #HackEbola that used mobile phone data routed through cellphone towers to track and approximate the movement of individuals. This helped researchers track and predict the spread of the disease, which is spread by human contact.
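
As a rough illustration of how tower data can approximate movement, the sketch below counts how often devices move from one tower's coverage area to another. The article does not describe the project's actual method, so the record layout (device ID, timestamp, tower ID) and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def tower_flows(pings):
    """Count device movements between cellphone towers.

    pings: iterable of (device_id, timestamp, tower_id) records.
    Returns a Counter mapping (origin_tower, destination_tower) to the
    number of observed moves. The record format is an assumption.
    """
    by_device = defaultdict(list)
    for device_id, timestamp, tower_id in pings:
        by_device[device_id].append((timestamp, tower_id))

    flows = Counter()
    for events in by_device.values():
        events.sort()  # order each device's pings by time
        for (_, origin), (_, destination) in zip(events, events[1:]):
            if origin != destination:  # ignore pings that stay on one tower
                flows[(origin, destination)] += 1
    return flows


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        ("device-a", 1, "tower-1"),
        ("device-a", 2, "tower-1"),
        ("device-a", 3, "tower-2"),
        ("device-b", 1, "tower-1"),
        ("device-b", 5, "tower-2"),
    ]
    for (origin, destination), count in tower_flows(sample).items():
        print(f"{origin} -> {destination}: {count} moves")
```

Aggregated counts like these could then feed a predictive model of where an outbreak is likely to travel next. A real deployment would also need to anonymize the device identifiers.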

In the future, big data can help researchers examine more factors and build better predictive models that project how diseases will spread, allowing them to stay ahead of outbreaks. Armed with data, they may be able to stop epidemics before they become widespread.

It's amazing to think that modern drugs have only been around for about a century. Penicillin, the first true antibiotic, was discovered only in 1928. While there are now hundreds of useful drugs, drug discovery is an expensive process, and research in some areas is grinding to a halt.

One of today’s biggest health threats is the emergence of antibiotic-resistant bacteria, which are untreatable with most current antibiotics; over time, microbes develop a resistance to these drugs. Resistance would be manageable as long as researchers continued to develop new drugs.

Unfortunately, very few companies are researching the next generation of antibiotics. It's too expensive and time-consuming, and, for many companies, the endeavor simply isn't economically feasible.

Big data, however, could help. With sophisticated big data methods, researchers can build advanced predictive models, search for drug candidates and even run simulated tests to examine drug viability.

Big data methods can also be used to find patients for clinical trials and monitor unfolding clinical trials. The quicker researchers can uncover developments and patterns, the more effective drug trials will be.

Saying that big data can cure cancer is a bit of an overstep. However, it can help researchers make important discoveries, uncover patterns and gain key insights.

Already, researchers at the German Cancer Research Center (DKFZ) have mapped the genomes of thousands of cancer patients in an effort to identify potential DNA problems at the root of cancer.

To be successful in this endeavor, the researchers needed to analyze the entire genome of each patient, which could generate 200 gigabytes of raw data per genome. Without big data analytics, this would take weeks per patient, at the risk of missing huge chunks of potentially valuable data.

Fortunately, the Fujitsu Prime Flex Integrated System for Hadoop, a massively powerful supercomputer cluster running Datameer, offered the DKFZ a solution. The system is able to analyze entire genomes, meaning no data is lost, in a mere five to 20 minutes. As a result, the DKFZ now has the data it needs, and hopefully its researchers will be able to use it to make serious breakthroughs in medical research.

What secrets might big data and powerful analysis reveal in the future? If anything, it seems that we’re just entering the “big data age,” with software solutions and computing technologies that finally enable researchers to take on even the largest sets of data.

With more data being analyzed and more information being uncovered, researchers will be able to make big breakthroughs in a wide range of fields.
