If business units are clamoring for intensive analytics that will stretch your organization’s storage capabilities to the maximum, it’s time to get familiar with Hadoop. As a flexible, open source data storage technology, Hadoop offers improved processing at just 5 percent of the cost of relational database technology.

I know what you are thinking: Even potentially cost-saving technologies have a cost, namely the learning curve and time that goes into figuring out how to use them. Luckily, Hadoop lends itself well to pilot projects. So if you’ve got a business unit pestering you for guidance on how to manage the costs of a text-, audio- or visually-rich analytics project, consider adding Hadoop to your arsenal.

Also See: Hadoop as a Service - 18 Cloud Options

Hadoop has seen explosive growth of 60 percent (per a TWDI survey) in the last 2 years with a projected 60 percent of projects expected to be in production by 2016. Not surprisingly, the top initiative expected to see benefit is advanced analytics, which requires moving and storing large amounts of data. 

Hadoop lets you to invest less in storage and more in turning that data into information that can be used strategically to grow revenue and profit. Here are three great reasons to adopt Hadoop:  

* Distributed computing models that can quickly process very large volumes of data. The more computing nodes you use, the more processing power you have.

* Scalability that allows you to grow your system simply by adding more nodes. Little administration is required.

* Flexibility that means you don’t have to pre-process data before storing it. And that includes unstructured data like text, images and videos. You can store as much data as you want and decide how to use it later.

Hadoop is designed to self-heal. Data and application processing are protected against hardware failure and high availability is built-in. If a node goes down, jobs are automatically redirected to other nodes to make sure the distributed computing does not fail. And it automatically stores multiple copies of all data.

All of this is incredibly important to the use of advanced analytics. By breaking free of traditional relational databases, your organization is better equipped to:

* Know the unknown: spot unexpected trends and correlations in vast amounts of data and use them to develop competitive advantage

* Predict and forecast: build scenarios to test the potential impact of decisions and actions, and know what steps are needed to reach financial and strategic goals

* Adapt faster: evolve as the impact of decisions or unexpected changes take effect, make every decision based on the most up-to-date information

* Demonstrate success: quickly and easily measure the outcomes of decisions.

Dancing with Elephants

It’s not all roses: some of the top barriers to adoption include linking Hadoop initiatives to business value, governance, security and a lack of skills required to manage data on Hadoop. But packaged software offerings can augment these gaps with simplified data access, profiling, data masking and self-service data preparation tools to improve productivity and governance for both business users and data scientists.

Hadoop Successes in the Real World  

One large travel company used Hadoop to help them get a more holistic view of their customer across the organization. By using social media and transaction information they created a view that allows them to predict the next best customer offer and gain competitive insight. Hadoop and advanced analytics helped them achieve a triple-digit percentage uplift.

A major online retailer uses Hadoop with its advanced analytics to drive increased customer insights; cross-sell and up-sell effectiveness; productivity, revenue and customer satisfaction. It has experienced a 20 percent reduction in churn rate and more than $500,000 savings in productivity annually with Hadoop and advanced analytics.

An electronics manufacturer has been able to apply text analytics to volumes of social media with great success by using Hadoop as a base for storage. It halted a significant keyboard overhaul after a positive review on an obscure blog garnered thousands of positive comments on the existing keyboard. It found this through a data-intensive text analytics program. The company has also been able to quickly mine customer warranty call comments and social media to tease out the root cause of customer problems months earlier than before.

These examples would not have been possible a decade ago because the storage and moving costs of mining texts, website clicks, customer transactions and more would have been cost-prohibitive. So instead of looking askance when a business unit asks for assistance on a data-intensive project, look at Hadoop as a way to make the previously impossible, possible.

Matt Magne is global principal product marketing manager for SAS Data Management

Register or login for access to this item and much more

All Health Data Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access