Why Big Data Needs Big Analytics
I have been involved with databases and analytics for the last 20 years or so. In fact, I remember when relational databases first started to displace IBM’s, Digital’s, and HP’s ‘proprietary databases’ like IMAGE and RMS. I also remember the heated arguments about whether QUEL or SQL would win the query language wars (IMHO the worst language won).
It was an exciting time. Data, and how to manage it, was the focal point of the entire technology industry. But as relational databases became ubiquitous, the focus rightly turned: now that we have stored all that data, what the #@$@$@ do we do with it all?
Cue trumpets please.
And thus, the Business Intelligence/Data Warehousing industry was born. As a result, tools like SAS and Excel became increasingly important in helping users analyze the transactional data now stored in Essbase, Hyperion, Business Objects, Red Brick, etc.
All of this “seemed” to work well enough, so the industry moved on to focus on other problems: better hardware (RISC, CISC, PowerPC), operating systems (UNIX, Linux, Windows NT), and new programming languages and paradigms (4GL, OO, Client Server, N-Tier, Java, .NET).
And then, the paradigm shift of paradigm shifts occurred, more commonly known as the World Wide Web.
Suddenly all business was worldwide, pure middlemen were out of business (“disintermediated” was probably the most overused word in venture pitches during the bubble), new business models were created, and organizations were forced to morph their IT infrastructures to address an entirely new paradigm. Data and databases faded into the background. The focus shifted, rightfully, to figuring out how to deal with broadly distributed applications, with thousands of users and huge transaction volumes, all built on top of a stateless protocol (HTTP) that was designed for document retrieval, not applications.
It wasn’t fun, but we really did change the world and, in the process, created so much data that our old database systems started to show their age. So here we are, where web-scale databases, aka Big Data, are the new technology “it girl.”
Big Data, with its advantages and challenges, is one of the hottest technology, business, and even social media topics. We’ve got a dedicated conference, lots of press, interest in the venture community, and great technology all focused on solving Big Data problems. Technologies like CouchDB, Cassandra, HBase, Netezza, Teradata, Bigtable, Redis, RavenDB, and our personal favorite, MongoDB, are all offering different ways to manage data in the multi-petabyte range and have brought innovation and excitement back to the data management industry.
And that’s all good and, considering how much data is created on a daily basis worldwide, extremely necessary. But data, in and of itself, is not enough. Colin Clark said it best: “I define big data as when you can’t turn your data into actionable intelligence fast enough to have an impact during the window of opportunity.”
Exactly! Big Data needs Big Analytics, and given that current BI and analytic systems were struggling to provide value before the Net caused an exponential explosion in digital data, it is clear that help won’t be forthcoming there. What is Big Analytics? What should a Big Analytics system do? What does the technology look like? Which approach did we use? Why? How does this compare to more batch-oriented systems like Hadoop? Beyond technology, what makes Big Analytics successful? We will hit these topics in depth in the new year.