Big Data: Either Embrace It or Be Among the Walking Dead (Thoughts on Strata West 2012)
I’ve been meaning to blog about Strata West for the last week or so but felt the need to take a step back and look at the conference objectively. Of course, we’ve also been very busy at PatternBuilders working on our latest release (where correlation is the king and financial services is the queen, or vice versa), engaging with potential partners and customers, and handling all the other activities that make up a startup’s life. In other words, during and after the conference we’ve barely been able to catch our collective breath (or get some much-needed rest)!
Before I talk about the conference as a whole, some of the sessions and folks that caught my eye, and, of course, our book signing event (yes, Terence and I signed many books for conference attendees), I want to give a final shout-out to our stellar Big Data and SCM panelists: Lora Cecere, Pervinder Johar, and Marilyn Craig. Thank you all for participating and for taking on this very broad topic! Much ground was covered, including the need for more rigorous cold chain management to ensure the efficacy of drugs, the amount of food that is spoiled and thrown away (one out of every three fruits and vegetables and two out of every five chickens) due to poor logistics management, and how big data can be used to transform the auto repair industry. What I loved about this panel (and yes, I am admittedly biased) was that it focused on real-world problems that companies, industries, and societies are dealing with today. By the way, our panel was part of Strata Jumpstart, billed as the missing MBA for Big Data, and it certainly lived up to its billing!
Later, Terence was interviewed, and the interviewer followed up on his remark during the SCM panel that “companies that do not embrace big data are among the walking dead” (yes, it’s a good quote). Here’s Terence’s response:
“If you are one of those companies not embracing big data, particularly second generation big data solutions around streaming, like PatternBuilders and S4, I think that you’re part of the walking dead because your competitors are. This means that you will not be able to compete on delivering a service, understanding a market, or logistics… you will essentially let your competitors be better than you are.”
I really could not have said it better myself—and the idea of big data as a strategic imperative certainly seemed to be the focus of this year’s Strata West.
This segues quite nicely into some thoughts on the Strata conference as a whole. Last year, Strata West was the inaugural big data conference where the possibilities were endless but the realities were a bit farther away. This year, Strata 2012 grappled with the realities in session tracks that covered business/industry, data science, domain data, visualization and interface, policy and privacy, and of course, all things related to programming with Hadoop (or, as we like to call it at PatternBuilders: Big Data 1.0). Lots of sessions, lots of topics, great speakers with interesting things to say, and not-so-great speakers with interesting things to say! If you’re curious about the attendees, The Guardian has provided a visual representation (with some interactive menus!) of where the data enthusiasts came from and what they hoped to talk about.
Now, in keeping with last year, Strata videoed the keynotes (all are available at Strata’s video page) and is working on a complete video compilation of all workshops, sessions, and keynotes (for a fee unless you have an All-Access pass). Considering the number of sessions I wanted to attend but could not due to conflicts, I am glad I’ve got my pass! But anyone can watch the keynotes and with that in mind, I give you my top 5 (with links—you’re welcome):
- Avinash Kaushik—A Big Data Imperative: Driving Big Action. In this energetic keynote, Kaushik asks how you find the unknown unknowns (full disclosure: he used a great Donald Rumsfeld quote to propel the conversation). He proposes that you must engage in smarter sorting (sorting by “interestingness”) and then predict, mine, and correlate. I would call the second part of this exploratory analysis.
- Luke Lonergan—5 Big Questions about Big Data. In this keynote, Lonergan likens big data to jazz (yes, jazz!). He goes on to say that “organizations will become more like jazz bands and less like orchestras.” That’s a great analogy—wish I’d thought of it! In the new normal (big data), Lonergan predicts that every interaction will be customized.
- Abhishek Mehta—Decoding the Great American Zip Myth. In this keynote, Mehta argues that we are “data rich but information poor.” He goes on to point out that “data has no 1% as we are all equally different” and makes the case for common platforms in other business segments outside of the web that are able to store, process, analyze, and visualize all of the data so that socio-economic problems can be addressed.
- Ben Goldacre—The Information Architecture of Medicine is Broken. In the “who knew?” category, Goldacre’s keynote focused on drug trials and how positive ones get overestimated while negative ones go underreported, which leads to biased samples. To address this issue, trial protocol registers have been set up where all trials are registered, but the registers are only one step towards identifying biased samples. Goldacre is trying to develop a more robust solution that uses analytics to easily identify trial bias, and he is seeking help from us, the big data community. If you have some ideas, contact him at firstname.lastname@example.org.
- Steve Schoettler—Learning Analytics: What Could You Do with Five Orders of Magnitude More Data About Learning? In the “nervous speaker but well worth listening to” category, Schoettler’s keynote focused on improving academic outcomes with analytics. He argues that the move towards digital learning (where much more data is collected via the many devices now used in classrooms) means that we are now able to apply analytics to make formative assessments and tailor instruction to each student’s specific needs. As an aside, if you have not seen the 60 Minutes segment on Khan Academy, which offers an extensive online video library to provide a world-class education to anyone, anywhere, for free, you should!
Wednesday evening, Terence and I walked over to the O’Reilly booth to sign some of our Privacy and Big Data books for attendees. After two jam-packed days at Strata, we were tired and thought that we would sign the books and then go sit down somewhere and take a nap (we are no longer as young as we’d like to think!). However, that was not meant to be. Instead, we ran out of books and found ourselves thoroughly engaged in privacy conversations that ranged from some very interesting data collection and usage policies to how privacy is viewed around the world, and that gave us some great subject matter for upcoming posts.
We then ended the evening at a Friends of O’Reilly event where I met Joseph Boyle, an analyst with the Personal Data Ecosystem Consortium, which is focused on ensuring that people (us) are in control of their personal data (we mentioned the consortium in our book). In the midst of our no-holds-barred conversation about whether or not we can maintain control of our personal data, Joseph realized that I was a co-author (with Terence) of Privacy and Big Data and proceeded to show me the not-yet-available March release of the Personal Data Journal (a newsletter the consortium puts out on a quarterly basis), which features an extensive review of our book! Yes, it is a small world (and when the newsletter is available I will update this post with a link)!
Here are my final thoughts on Strata West 2012: noisier, messier, and far more crowded than previous conferences. And that’s as it should be because big data has entered the real world. There are second generation solutions that offer far more built-in capabilities and a much more user-friendly experience. Big analytics is already solving real-world problems and is being applied to more of the problems that need our attention (when you have a minute, check out Data Without Borders, which uses big data to help humanity). And there are people from business, technology, and education all trying to figure out how to analyze data for competitive advantage or, better yet, to shine a light on the most pressing issues of the day. Of course, none of this would have occurred without the sponsorship of O’Reilly: program chairs Edd Dumbill and Alistair Croll and the rest of the O’Reilly team did a great job. Thank you!
And I will leave you with this (I cannot resist a good quote—even when it’s not my own): If you are not one of those companies embracing big data, be warned that you are among the walking dead!