Big Data Project: Objectives First, Plan Second (Part 3)

May 15, 2013 at 5:46 pm

A top-level view of our data project over a series of posts.
By Mary Ludloff

Welcome to the third post in our series on a big data project. Our goal is to walk you all the way through a big data project from its inception through its completion (or, depending on the project, through deployment and maintenance). Those of you familiar with our series know that we include our Big Data Playbook rules as we address specific topics. We may repeat some as we go along, but if you need to refresh your memory on where we are, go to Part 1 and Part 2.

You now know that we are working with the University of Sydney on a project that looks at the impact social media comments have on a company’s stock and whether this mediates the influence of primary news. Specifically: Is a company’s stock price influenced by both, and can we isolate and study the impact of those distinct sources on that stock price?

We then went into discovery mode (Rule #4: Ask questions about the question) until we had a thorough understanding of the question (see Table 1). And now it’s time to make a plan—well actually, it’s time to make a list of objectives which will serve as the launching pad for our plan. This, of course, leads us to our next rule:

Rule #5: Objectives first, plan second.

Why objectives first? Well, there’s a natural tendency to drop into the planning phase before you’ve thought out what you’re trying to do in what I like to call “big animal pictures.” I call this descending into the weeds before you’ve got a general idea of what you need to do. Projects, big data or otherwise, generally begin with determining your objectives and then breaking down the resources and tasks needed to complete the project.

[Image: Make a list]

Yes, this is a simplistic view of a project, as each area can be pushed down and out into all layers of complexity. And that’s my point. At the start of the planning phase of a big data project you don’t want to push down; that will come later and will become an iterative process as you gain information. Rather, you want to sketch out your objectives based on your understanding of the question and then figure out the resources and tasks needed (our original table that provided context to the question is included here for reference). Once you have that, you can then address data, technology, and partner requirements as well as identify gaps in all those areas that will need to be filled.

Table 1: Ask Questions About the Question

Since we did a deep dive into exactly what the original question meant, it’s now time to figure out our objectives. After talking through which social media channels we wanted to focus on with Dr. Briley, we decided to analyze the impact of tweets on a company’s stock price. There’s been a great deal of research on how the Twitter mood can impact the stock market in general, but few projects have taken on the task of looking at a specific company. This seemed like a good area for us to focus on.

We also had to analyze the impact of primary media sources on price and isolate that impact from follow-on tweets. First, we had to select a news source. In this case, Reuters seemed like the natural fit as they are the market leader in this area. Additionally, Reuters could provide sentiment analysis along with other data that would help us to measure sentiment and influence as well as “study the impact” (see Table 1).

Now we moved on to the “heart” of the question: how can we determine and isolate the propagation mode of “company news” from the reporting of financial news in Reuters to tweets about that information? Naturally, we also wanted to explore the different aspects of a tweet that might make it more or less influential. There are a number of tools available that measure some aspect of social authority but for this project we focused on the following:

  • The volume, velocity, and acceleration of tweets generated after a news article reports financial information.
  • The social authority (or influence) of the twitterer as indicated by his or her Klout score and number of followers.
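To make the first bullet a bit more concrete, here is a minimal sketch of how those three measures might be computed from tweet timestamps. The post does not define the terms precisely, so the definitions here are assumptions: volume is the total tweet count in the window after the article, velocity is tweets per minute in each time bin, and acceleration is the change in velocity from one bin to the next. The function name `tweet_dynamics`, the bin width, and the sample timestamps are all hypothetical and not taken from the project itself.

```python
from datetime import datetime, timedelta


def tweet_dynamics(article_time, tweet_times, bin_minutes=10, n_bins=3):
    """Summarize tweet activity in fixed-width bins after an article.

    Assumed definitions (not from the original post):
      volume       -- total tweets inside the observed window
      velocity     -- tweets per minute in each bin
      acceleration -- change in velocity per minute between adjacent bins
    """
    bin_width = timedelta(minutes=bin_minutes)
    counts = [0] * n_bins
    for t in tweet_times:
        offset = t - article_time
        if offset < timedelta(0):
            continue  # tweet was posted before the article, so skip it
        idx = offset // bin_width  # timedelta // timedelta gives an int
        if idx < n_bins:
            counts[idx] += 1  # tweets beyond the last bin are ignored

    volume = sum(counts)
    velocity = [c / bin_minutes for c in counts]
    acceleration = [
        (later - earlier) / bin_minutes
        for earlier, later in zip(velocity, velocity[1:])
    ]
    return volume, velocity, acceleration


if __name__ == "__main__":
    # Made-up timestamps, in minutes relative to a hypothetical article.
    published = datetime(2013, 5, 15, 9, 0)
    minutes_after = (-5, 2, 5, 12, 14, 15, 19, 25, 45)
    tweets = [published + timedelta(minutes=m) for m in minutes_after]
    print(tweet_dynamics(published, tweets))
```

Fixed-width bins keep the arithmetic simple. A real analysis would probably need to choose the bin width carefully and might weight each tweet by the author's social authority, as in the second bullet.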

Finally, once we had all this data, how would we determine (algorithmically) the impact that the two sources, both together and individually, had on a company’s stock price?

Based on what we just covered, here are our four objectives:

Table 2: Our Four Objectives

Okay, now that we have our objectives, our next post will do a deep dive into the data we need to crunch. By definition, every big data project involves data (big or small, and keep in mind that size is just one of the V’s to consider). It goes without saying (but we will) that a majority of the resources we’ll need will be data. Of course, platform technology (lots of issues to suss out here), partners (how we might leverage our channel ecosystem for institutional knowledge, technology, etc.), and people (it’s time to figure out the skillsets we’re going to need) will also play major roles. But first, we’ll be talking about the data: what we need and where we are getting it. Oh, and you’ll also get introduced to some of our partners who are providing all that lovely data!
