Posts filed under ‘Technology’

Big Data and Cloud not a fit? Comments on Infoworld Article

By Terence Craig

Since Disqus seems to have completely eaten (bleh) my comment on @davidlinthicum’s very interesting InfoWorld post – Big data and the cloud: A far from perfect fit, I decided to just expand my comments and make a short blog post out of it. IMHO, the problems that David is describing are more a reflection of problems with batch-oriented technologies like Hadoop (more on my take on Hadoop here) in the cloud than a general problem for cloud-based big data solutions.

Computing always has, and probably always will have, a bias towards creating batch-focused technologies at the beginning of any large paradigm shift. But as new technologies are absorbed, understood, and move from early adopter to more mainstream use, the batch paradigm will inevitably start to shift to streaming and real-time. We have seen this again and again (from punch cards to touch-sensitive tablets, downloaded media to streaming media, DOM to SAX parsers, HTML to Ajax, paper maps to real-time GPS). The reason this evolution almost always occurs is simple: humans live and think in real time, and when our tools do as well, we are more productive and happier. So why do we have this bias for batch processing in our first-generation computational technologies? Simply put, because batch processing is a lot easier.
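To make the batch versus streaming distinction concrete, here is a minimal Python sketch that computes the same average both ways. The function names and the sample readings are made up for illustration and are not taken from any particular product.

```python
from typing import Iterable, Iterator, List


def batch_mean(values: List[float]) -> float:
    """Batch style: needs the complete data set before it can answer."""
    return sum(values) / len(values)


def streaming_mean(values: Iterable[float]) -> Iterator[float]:
    """Streaming style: yields an up-to-date mean after every new value."""
    count = 0
    mean = 0.0
    for value in values:
        count += 1
        # Incremental update: only the count and current mean are kept,
        # not the history of values seen so far.
        mean += (value - mean) / count
        yield mean


# Hypothetical sample data.
readings = [4.0, 8.0, 6.0, 2.0]

print(batch_mean(readings))            # a single answer, once all data is in
print(list(streaming_mean(readings)))  # a running answer after each reading
```

The batch version is easier to write and reason about, which is the bias described above. The streaming version has to carry state between values, but it can give an answer at any moment.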

(more…)

February 23, 2012 at 3:01 pm

No, Hadoop Doesn’t Own Big Data Analytics!

By Terence Craig

A number of folks have asked me if I was concerned about Microsoft’s recent announcement that they would be partnering with Hortonworks and abandoning their own distributed processing technology for Hadoop. While I thought this was an unfortunate choice on Microsoft’s part (the Dryad project’s implementation of multi-server LINQ was pretty compelling), since HPC is a small part of Microsoft’s business, it probably made sense from a business standpoint. In any case, we (as in all of us at PatternBuilders) are not concerned and, just to be clear, we don’t believe that this announcement (or any other) means that the many Hadoop ecosystem players own the still-forming big data analytics market.

That is not to say that the announcement isn’t proof of the strength of the Hadoop ecosystem. Hadoop is a nifty technology that offers one of the best distributed batch processing frameworks available, although there are other very good ones that don’t get nearly as much press, including Condor and Globus. All of these systems fit broadly into the High Performance, Parallel, or Grid computing categories and all have been or are currently used to perform analytics on large data sets (as well as other types of problems that can benefit from bringing the power of multiple computers to bear on a problem). The SETI project is probably the best known (and IMHO, the coolest) application of these technologies outside of that little company in Mountain View indexing the Internet. (more…)

December 12, 2011 at 1:41 pm 3 comments

Video: Big Data Made Easy. Sticky – see below for latest posts.

November 15, 2011 at 9:49 am 5 comments

All Together Now: All You Need is a Text Box!

By Terence Craig

All you need is text, Text is all you need (sing to the tune of The Beatles’ All you need is love). If you are one of our regular readers, you will remember that several months ago I wrote a manifesto on what the perfect analytics system would look like. One of the last points was:

It must be as accessible as Excel (still the number one analytics tool in the world).

I was wrong – Excel is the number one non-specialized analytics tool in the world but, in terms of usage, it is dwarfed by a very well-known specialized analytics toolkit. The creator of this tool is a little company that you may have heard of: it does no evil and analyzes the Internet to bring you back everything on the web based on a simple text query. But behind that simple text box, Google has one of the most sophisticated analytics infrastructures in the world:

  • It can deduce your interests.
  • It can give you the most relevant results.
  • It can show you appropriate information based on those interests, as well as bring back highly personalized ads.

Google is not only the largest big data analytics company in the world, but it also has the easiest to use tools—proof that text is all you really need!

(more…)

October 14, 2011 at 3:22 pm 4 comments

Steve Jobs – The man who brought style to computing

By Terence Craig

Although I never met the man, I think that I and every programmer or entrepreneur who has worked in the Valley felt like we had a personal relationship with Steve Jobs. He was without a doubt the most polarizing technology figure in the Valley – known for his brilliant design sense, ability to excite an audience, uncompromising desire to get it right, and pithy emails.

My first real computer was a Mac. That Mac Plus, with an additional acoustic coupler modem (a blazing-fast 300 baud, baby!), helped pay my way through college writing other people’s programs for them, uh, I mean, tutoring other students. The Mac was amazing: it showed us that computers could be fun, quirky, and artistic. It introduced stylistic concepts that we are still having trouble bringing into mainstream computing today. In a world of VT220 terminals and ASCII art (btw the link is amazingly cool ASCII), the Mac, with Steve as her father, proved that the digital world could be thrilling as well as functional. For that we all, whether in technology or otherwise, owe him a great debt.

Finally, let’s all remember that despite his laudable achievements, Mr. Jobs was a human being who had family and friends who are mourning a man that cancer took away from them at an early age. While we can and should honor his many achievements, let’s not forget to take a breath and send good thoughts to them and all the other families who have been stricken by this deadly disease. Or better yet, donate to the cancer charity of your choice.

RIP – Steve.

October 7, 2011 at 6:28 am

No-SQL – Going All The Way

Going All The Way

We have recently made a big architectural change concerning our storage back-end and I wanted to talk about it.

Storage is key to any Big Data problem. As we’ve mentioned in prior posts, most of our performance bottlenecks and optimizations have to do with storage performance and architecture, as opposed to computation. Our architecture for the last few years has consisted of a hybrid approach with “no-SQL” analytics storage using MongoDB and “non-transactional” data stored in a traditional RDBMS, primarily SQL Server. There were a couple of reasons for this architecture. First, we started off entirely in RDBMS-land, because our initial design was done before no-SQL systems were really at a production level of maturity. Second, most of our customers and prospects had traditional schemas and data organization, which made integration easier if we could just use the same object model. (more…)

September 28, 2011 at 4:18 pm 1 comment

Real-time Analytics: It’s Always Decision Time!

By Mary Ludloff

Greetings all! I just came across a great video from eWEEK talking about the growing need for real-time (aka streaming) analytics:

“For years, business intelligence has provided valuable information to help executives and managers make decisions to increase sales, improve operations, and seize new business opportunities. With the quickening pace of business today and the need to make faster decisions based on more timely data, companies are complementing this data using information mined from social networks, mobile sensors, and even location-based information from smartphones. To get the best value from this wealth of new data sources, the data analysis must be done in real time. This allows decisions to be made based on the true conditions at that particular time.”

(more…)

September 23, 2011 at 12:04 pm 4 comments
