Posts tagged ‘O’Reilly’
I had to miss Strata due to a family emergency. Mary picked up the slack for me at our privacy session and, by all reports, did her usual outstanding job, but I also had to cancel a Tuesday night Strata session sponsored by 10Gen on how PatternBuilders has used Mongo and Azure to create a next-generation big data analytics system. The good news is that I should have some time to catch up on my writing this week, so look for a version of what would have been my 10Gen talk shortly. In the meantime, to get me back in the groove, here is a very short post inspired by a Forbes post written by Dan Everett of SAP on “Hadoopla.”
As the CEO of a real-time big data analytics company that occasionally competes with parts of the Hadoop ecosystem, I may have some biases (you think?). But I certainly agree that there is too much Hadoopla (a great term). If our goal as an industry is to move Big Data out of the lab and into mainstream use by anyone other than the companies that thrive on, and have the staff to support, high-maintenance and very high-skill technologies, Hadoop is not the answer – it has too many moving parts and is simply too complex.
To quote from a blog post I wrote a year ago:
“Hadoop is a nifty technology that offers one of the best distributed batch processing frameworks available, although there are other very good ones that don’t get nearly as much press, including Condor and Globus. All of these systems fit broadly into the High Performance, Parallel, or Grid computing categories and all have been or are currently used to perform analytics on large data sets (as well as other types of problems that can benefit from bringing the power of multiple computers to bear on a problem). The SETI project is probably the most well known (and IMHO, the coolest) application of these technologies outside of that little company in Mountain View indexing the Internet. But just because a system can be used for analytics doesn’t make it an analytics system…”
Why is the industry so focused on Hadoop? Given the huge amount of venture capital that has been poured into various members of the Hadoop ecosystem and that ecosystem’s failure to find a breakout business model that isn’t hampered by Hadoop’s intrinsic complexity, there is ample incentive for a lot of very savvy folks to attempt to market around these limitations. But no amount of marketing can change the fact that Hadoop is a tool for companies with elite programmers and top-of-the-line computing infrastructures. And in that niche, it excels. But it was not designed for, and in my opinion will never see, broad adoption outside of that niche despite the seemingly endless growth of Hadoopla.
Mary and I had a great time – and a couple of good arguments – during our webcast “Privacy and Big Data: Is there room for privacy in the age of big data?” last week. The kind folks at O’Reilly have just made a recording available in case you missed it. You can find it here. O’Reilly is also offering 50% off all its ebooks (offer expires September 28), including ours, so go grab it. The discount code is B2SDEAL. We would love to hear your comments on the webcast and the book – either as comments on this post or hit us up on Twitter @terencecraig, @mludloff or firstname.lastname@example.org.
Our e-book is entering production next week, which is the final step before publication. Mary and I are very happy with the result and hope you will be as well.
We will let you know when we have the official release date. But as a teaser, here is the cover (which we love). However, I will admit to being pretty distressed when I found out that because this was a security book we weren’t allowed to get one of the famous O’Reilly animal covers. But then again, since O’Reilly authors are never told why a particular animal is picked for a book by the talented Ms. Friedman, and given that mysteries drive me nuts, it’s probably just as well! 😉
While all the keynotes at Strata were interesting, one stood out in particular: Scott Yara’s “Your Data Rules the World.” For those of you not in the big data space, Scott’s take on the amount of data being churned out by the government, internet, phone, and TV, as well as the financial, medical, and retail industries, and on what is, and should be, done with it is well worth its 15-minute runtime.
But what fascinated me was how Scott started his session. He talked about himself from the viewpoint of the Internet. He showed where he lived (via Google Maps), showed a small section of his college transcript, showed how you could find out what he paid for his house, showed what his house was currently worth, showed the property taxes pending, and finally, showed a snapshot (from a website) of him running a red light that resulted in a traffic ticket. Now, if you read a previous post of mine about how much information is publicly available about you, you’re not surprised by this. Neither was I.