Big data plagued businesses in 2013. The IT community should expect no less in 2014, as businesses of all sizes face the challenges big data presents and seek out ways to manage data growth.
Many companies succeed in implementing a strong information management strategy that aids in processing all of this information, while others stumble with big data management. But how these businesses should be handling big data seems to change depending on the expert, creating confusion about what the best data management strategy may be. While the topic of big data has existed for many years, several myths persist and add to the complexity that already surrounds it. Here we take a look at five myths about big data that we hope cease to exist as we move into the New Year.
Myth: Simply querying as much big data as possible in an in-memory database will provide a suitable answer.
Reality: Contrary to popular belief, big data needs to be treated the same way as small data. Powerful in-memory technologies can crunch data sets tremendously quickly. But the underlying data being processed is central to these queries. Simply running data through an in-memory database will not necessarily provide you with correct results. In fact, it can lead to false positives with potentially disastrous consequences. Just because a company has a good tool doesn't mean that it has good data quality underneath. Companies must remember that what makes sense on small data will ultimately make sense on big data. Otherwise they will simply be creating "fast trash": information that is quickly produced, but with little substance.
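To make the data quality point concrete, here is a minimal sketch in Python. The order records, field names, and cleaning rules are all hypothetical and chosen only for illustration. The same average is computed twice, once on the raw records and once after basic quality checks.

```python
from statistics import mean

# Hypothetical order records; the field names are assumptions for illustration.
orders = [
    {"order_id": 1, "customer": "acme", "amount": 120.0},
    {"order_id": 2, "customer": "acme", "amount": 80.0},
    {"order_id": 2, "customer": "acme", "amount": 80.0},     # duplicate row
    {"order_id": 3, "customer": "globex", "amount": None},   # missing value
    {"order_id": 4, "customer": "globex", "amount": -50.0},  # invalid amount
]

def clean(records):
    """Drop repeated order IDs, missing amounts, and negative amounts."""
    seen_ids = set()
    kept = []
    for row in records:
        if row["order_id"] in seen_ids:
            continue  # skip rows whose order ID was already seen
        seen_ids.add(row["order_id"])
        if row["amount"] is None or row["amount"] < 0:
            continue  # skip rows that fail the basic validity rules
        kept.append(row)
    return kept

def average_amount(records):
    """Average the amounts that are present, as a naive query would."""
    return mean(r["amount"] for r in records if r["amount"] is not None)

raw_avg = average_amount(orders)           # computed on unchecked data
clean_avg = average_amount(clean(orders))  # computed after quality checks

print(f"raw average:   {raw_avg}")
print(f"clean average: {clean_avg}")
```

Both numbers come back equally fast, but only the second reflects the valid orders. A faster engine would not close that gap.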
Myth: The unstructured data that exists in big data can be easily overcome with technology such as Hadoop.
Reality: Unstructured data, when used appropriately, can provide businesses with a wealth of knowledge that can be translated into actionable items. Unfortunately, unstructured data is fickle by nature and requires a strong use case before it can yield the insight that businesses seek. For example, mining Twitter data with Hadoop can provide marketers with plenty of information, but they must ultimately find a way to cut through that noise and drill down to a specific customer set. Only if they go into the situation with a particular business goal in mind can users find the right data. If not, they're sitting on a pile of information without any context.
Myth: Big data and information management are too time-consuming and labor-intensive for our company to handle on a consistent basis.
Reality: Indeed, big data projects are no small task from the outset, and running queries on the data requires careful planning at every stage. But the trick is to make sure that once a query has been run, it can be repeated relatively quickly for future use. Businesses must avoid starting fresh every time a new project arises. Automating processes and including previously uncovered insights in future queries will help the business scale more effectively as it moves forward. Remember: managing data effectively from beginning to end can reduce the headaches and time generally associated with big data projects.
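One way to picture a repeatable query is a parameterized statement wrapped in a small function. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The sales table, its columns, and the sample rows are hypothetical stand-ins for whatever system a business actually runs.

```python
import sqlite3

def build_demo_db():
    """Create a small in-memory database with hypothetical sales rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("east", 2012, 100.0),
            ("east", 2013, 150.0),
            ("west", 2013, 200.0),
            ("west", 2013, 50.0),
        ],
    )
    return conn

# The query is written once; only the year parameter changes between runs.
REGION_TOTAL_SQL = (
    "SELECT region, SUM(amount) FROM sales "
    "WHERE year = ? GROUP BY region ORDER BY region"
)

def region_totals(conn, year):
    """Run the saved query for a given year and return (region, total) rows."""
    return conn.execute(REGION_TOTAL_SQL, (year,)).fetchall()

conn = build_demo_db()
totals_2013 = region_totals(conn, 2013)
totals_2012 = region_totals(conn, 2012)
print(totals_2013)
print(totals_2012)
```

Because the planning effort sits in the saved statement, the next project that needs regional totals calls the function with a new year instead of starting from scratch.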
Myth: The more big data a company houses, the greater its competitive advantage over other companies.
Reality: Just because a company has volumes and volumes of data does not necessarily mean that it will be able to glean beneficial insights from it. This myth returns to the issue of data quality: a company that does not have strong, organized data prior to running queries will not benefit from the big data at its fingertips. Enormous data sets can be extraordinarily complex, which makes the data less manageable and reduces its overall quality. Examining a carefully selected smaller data set can offer far greater returns to a company than crunching large data sets ad nauseam. Companies that approach big data carefully, and with consideration of which data is most critical to examine, will be able to get the most out of their available information. Housing all the data in the world won't do a company any good if there is no strategy for how the data is actually managed.
Myth: Only big companies face issues associated with big data.
Reality: Whether a company is in the Fortune 500 or is a start-up running out of a basement, its business decisions are driven by the data that it possesses. So while a large multinational is going to encounter more big data issues, a small company will face some type of big data issue relative to its size. Many small-to-midsize businesses (SMBs) recognize that big data is an area that must be carefully examined as their company grows. According to a recent online poll conducted by Harris Interactive on behalf of SAP, 76 percent of SMBs view big data as an opportunity for growth. This suggests that not only are SMBs already encountering big data, but they are actively looking for ways to manage it as they scale their business.
As 2013 was a year filled with big data discussions, we hope that these myths start to disappear from the minds of the IT community in 2014. The New Year will certainly bring a new set of myths, but we must do our best to stop perpetuating misconceptions about big data that confuse those less familiar with the topic.
Don Loden is a principal consultant with full lifecycle data warehouse and information governance experience in multiple verticals. He is an SAP-certified application associate on SAP Data Services, and he is very active in the SAP community, speaking globally at numerous conferences and events as well as publishing regularly in various industry books and magazines. He has more than 14 years of information technology experience in the following areas: ETL architecture, development, and tuning; logical and physical data modeling; and mentoring on data warehouse, data quality, information governance, and ETL concepts. You can contact Don by email at firstname.lastname@example.org and find Don on Twitter @donloden.
This story originally appeared on Information Management.