A top-level view of our data project over a series of posts.
By Mary Ludloff
Welcome to the third post in our series on a big data project. Our goal is to walk you all the way through a big data project from its inception through its completion (or depending on the project, through deployment and maintenance). Those of you familiar with our series know that we include our Big Data Playbook rules as we address specific topics—we may repeat some as we go along but if you need to refresh your memory on where we are, go to Part 1 and Part 2.
You now know that we are working with the University of Sydney on a project that looks at the impact social media comments have on a company’s stock and whether this mediates the influence of primary news. Specifically: Is a company’s stock price influenced by both, and can we isolate and study the impact of those distinct sources on that stock price?
Sadly, this week we were reminded once again of the fragility of life and the resilience of the human spirit. Terence, myself, and the PatternBuilders team send our condolences to all who were impacted by this tragedy. For those who would like to help, donations can be made to:
- The One Fund—specifically formed to help those most affected by the bombings.
- The New England Patriots Charitable Foundation—all donations denoted with the words “Boston Marathon” will be earmarked for The One Fund.
- Boston’s First Responders Fund—also specifically established to benefit the victims of the bombings.
A number of resources can also be found here.
Much as it pains me to say this, beware of bogus Boston Marathon charity websites. Melanie Hicken of CNNMoney offers some advice on what to look out for.
Finally, there have been many moving tributes made by people via blogs, Twitter, and other media sources. We leave you with this simple statement projected on the wall of the Brooklyn Academy of Music:
In my last post, I wrote about the three V’s of big data and why there are only three. A messaging pile-on seems to be happening in the big data space that even I, a long-time marketer, find disconcerting. So, over the course of a number of posts, my colleague, Marilyn Craig, and I are going to de-mystify a big data project, taking apart each stage of a real big data initiative as if it were a release post-mortem. We will be talking about roles and responsibilities, data governance, project and process management, what went right, what went wrong, and what we should have done differently. Except in this case, it will not be after the fact but rather a stage-by-stage review as we work on a real-world project. For your sanity and ours, we have created a special category, Big Data Project, as well as a tag with the same name. If you search on either, you will see all posts related to the project. Additionally, all posts about the project will start with Big Data Project in the title. Who knows? Maybe when we’re done, we’ll write a book (knowing what I know now about writing a book, I can’t believe I just said that)!
We’ll talk more about the project in the next post, but first I wanted to take a look at a big data failure that anyone involved in a major enterprise application deployment could have seen coming. It illustrates Rule #1 in our big data playbook:
Rule #1: Big Data IS NOT rocket science.
Marilyn Craig (Managing Director of Insight Voices, frequent guest blogger, marketing colleague, and analytics guru) and I have been watching the big data “V” pile-on with a bit of bemusement lately. We started with the classic 3 V’s, codified by Doug Laney, a META Group and now Gartner analyst, in early 2001 (yes, that’s correct, 2001). Doug puts it this way:
“In the late 1990s, while a META Group analyst (Note: META is now part of Gartner), it was becoming evident that our clients increasingly were encumbered by their data assets. While many pundits were talking about, many clients were lamenting, and many vendors were seizing the opportunity of these fast-growing data stores, I also realized that something else was going on. Sea changes in the speed at which data was flowing mainly due to electronic commerce, along with the increasing breadth of data sources, structures and formats due to the post Y2K-ERP application boom were as or more challenging to data management teams than was the increasing quantity of data.”
Doug worked with clients on these issues and spoke about them at industry conferences. He then wrote a research note (February 2001) entitled “3-D Data Management: Controlling Data Volume, Velocity and Variety,” which is available in its entirety here (pdf too).
Greetings one and all! 2012 was a breakout year for PatternBuilders and we are very grateful to all of you for helping to make that happen. But we would also like to take a minute to extend our condolences and share the grief of parents across the world who lost young children to violence. Newtown was singularly horrific but similar events play out all too often across the globe. We live in an age of technical wonders—surely we can find ways to protect the world’s children.
This is our last post of 2012 and in the spirit of the season, we decided to do something a little different this year. Recently, the Wall Street Journal asked 20 of its “friends” to tell them what books they enjoyed in 2012 and the responses were equally eclectic and interesting. Not to be outdone, Adam Thierer published his list of cyberlaw and info-tech policy books for 2012. Many of the recommendations culled from both sources ended up on our reading lists for 2013 (folks, 2012 is almost over and between launching AnalyticsPBI for Azure and working on our update for Privacy and Big Data, not a lot of “other” reading is going to happen during the holiday season!) and spurred an interesting discussion about our favorite reads of the year. One caveat: Our lists may include books we read but were not necessarily published this year. So without further ado, I give you our favorite reads of 2012!
A week ago, I was in New York City for Strata’s Big Data Conference. The weather was sunny and mild and as I walked around the City I was reminded of just how vibrant it is and told my husband later that evening that we have to visit it more often. After the conference, I headed home and then watched with disbelief as this wonderful city, surrounding areas, and many more states were engulfed by Hurricane Sandy. I was saddened by the destruction and loss of life, but today am reminded of the resilience of its inhabitants as the cleanup and rebuilding begin. For those of you interested in helping, I point you to ABC News’ story and the Wall Street Journal’s article on ways to help the storm victims. Or you can go to the Red Cross home page for information on how to make a financial donation or give blood. To all of you on the East Coast impacted by Hurricane Sandy: Our hearts go out to you and you are in our prayers.
Let me tell you a little secret: I always know when I am talking (and working) with a company that has successfully launched big data initiatives. There are three characteristics that these companies share:
- A C-level executive runs the “[big] data operations.”
- The Chief Data Officer (even if they are the CIO) has a heavy business/operations background.
- The data team is focused on the “business,” not the data.
Did you notice that technology and data science are not reflected in any of the characteristics? Some of you may consider this sacrilege—after all, we are operating in a world where technology (and I happily work for one of those companies) has changed the data collection, usage, and analysis game. Colleges and universities are now offering master’s degrees in analytics. The role of the data scientist has been pretty much deified (I refer you to Part 1 of this series). And we all need to be very worried about the “talent shortage” and our ability to recruit the “right analytical team” (I refer you to Part 2 of this series).
Yes—technology has had a tremendous impact on how much data we can collect and the ways in which we can analyze it, but not everyone needs to be a senior computer programmer. Yes—we all should strive to be more mathematically inclined, but not all of us need master’s degrees or PhDs in statistics or analytics. Yes—some companies, based on their business models, may have a staff of data scientists, but others may get along just fine without one (with the occasional analytics consultant lending a hand).
By Marilyn Craig, Managing Director, Insight Voices
As you may or may not know, we are in the midst of a 3-part series on data science, covering roles, skills, etc.—generally what you should think about as well as what’s not as important (no matter what the latest articles say!). For Part 2, we have a guest poster—Marilyn Craig of Insight Voices. Marilyn is what I like to call a “classic quant.” She has been at the forefront of big data and data science before most people knew these terms (and spaces) existed and has been my go-to person whenever I had an analytics question (see title) that I needed an answer to. In this post, Marilyn looks at insights and makes the case for why we should all care far more about answers. Take it away Marilyn!
Here’s an interesting question for this new world order of Big Data Analytics: what’s an Insight and what’s an Answer? Sometimes they are the same, sometimes not. An insight is a piece of information or understanding. It may or may not be useful. It may or may not help your business improve, solve world hunger, or even make sense. An answer is always useful. It is the result of asking a question. And the best kinds of answers are those that solve the questions that you really care about.
In Search of Elusive Big Data Talent: Is Science Big Data’s Biggest Challenge? Or Are We Looking in the Wrong Places? (Part 1 of 3)
When we talk to prospects about their big data initiatives, our conversations usually revolve around issues of complexity that go something like this:
“Big data is so big (no pun intended), there’s such a variety of sources, and it’s coming in so fast. How can we develop and deploy our big data projects when everyone is telling us that we need lots and lots of data scientists and oh, by the way, there aren’t enough?”
Admittedly, many media outlets and pundits are positioning the search for skilled big data resources as what I can only characterize as the battle for the brainiacs. Don’t get me wrong, I am not disputing McKinsey’s report on big data last year that made it clear a talent shortage was looming, estimating that the U.S. would need 140,000 to 190,000 folks with “deep analytical skills” and 1.5 million managers and analysts to “analyze big data and make decisions based on their findings.” But the hype surrounding the data scientist is getting a bit absurd and we seem to be forgetting that those 1.5 million managers and analysts may already be “walking amongst us.” Is a shortage of data scientists really big data’s biggest challenge?
There are times when Terence and I look at each other and say, “What on earth were we thinking?” And this is one of those times! PatternBuilders is crazy busy right now putting out release 3.0 of our Analytics Platform (the secret sauce for our analytics applications that we like to call data-science-in-a-box), ramping up on a funding round, working with partners on a University of Sydney research project on the impact of social media on a company’s stock price (a really fun project and a post about it is in the works), and, of course, supporting customers and prospects on their big data initiatives. So… since we did not have enough to do (sarcasm on), we decided it was time to update our book, participate in a pre-Strata East webcast, speak at the Strata Conference and the MongoDB User Group (that is collocated with Strata) in New York City! In the words of the immortal Bette Davis in All About Eve (and ever so slightly revised):
“Fasten your seat belts, it’s going to be a bumpy ride!”
Really, what were we thinking?????