To Cloud Or Not To Cloud
When we started PatternBuilders, we made what was then an unusual decision: to avoid multi-tenancy, as I talked about here. We also decided to avoid the cloud: we wanted predictable costs, and given our deep in-house expertise in managing data centers, we felt we would be better off investing in top-tier colocation facilities. This made a lot of sense given the security sensitivities of our initial target markets: internal IT at the Fortune 500, large retail suppliers, and hospital groups. It was also an economically viable choice because our business model provisions hardware and bandwidth for each customer after the sale, which helps us manage cash flow. We also knew we could reduce both the cost and the maintenance headaches of separate customer provisioning through aggressive use of virtualization technology, much as cloud server vendors like Rackspace and Amazon do today.
This all worked out pretty well until we decided to beta our new social media analytics vertical (described here) publicly on the web (we will start accepting beta signups this week). Suddenly, we had a non-revenue-generating (but critical) resource that needed sufficient capacity. It didn’t make sense to make a massive up-front investment in infrastructure, so we went back to investigating the state of cloud servers.
Amazon and Rackspace seem to be the best fit for us (after this week’s Amazon outage, we may set up on both for redundancy), but we have a slight preference for Rackspace because of the quality and simplicity of its provisioning application. In case you were wondering, the costs appear to be about the same for both.
The interesting bit, from an engineering standpoint, was the change that cloud servers prompted in our architecture. Since bandwidth is a large component of the price of any cloud server application, we started using compression almost everywhere. One big surprise was how cheap compression turned out to be, in terms of both CPU and memory. We also fully embraced MongoDB’s sharding capabilities, which we had previously used only on an ad hoc basis for particularly data-intensive projects. Since we wanted to keep the core service on our colo infrastructure, we added the ability to automatically spin up an instance on a cloud server, with its own MongoDB shard and our analytics engine, to handle growth spikes. The nice thing is that we can offer that same on-demand capability to customers who have decided to keep the solution on-premises but need quick additional capacity for new data or a special project. Because most customers who want an on-premises deployment have security concerns, we also implemented a methodology that allows analytics to be run on an encrypted data set that can only be decrypted within their firewall.
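To give a feel for why compressing traffic is such an easy win, here is a minimal sketch (not our actual code, and the record shape is made up for illustration) of gzip-compressing a JSON payload before it crosses the wire, using only the Python standard library:

```python
import gzip
import json
import time

# Hypothetical batch of analytics records, serialized as JSON.
records = [{"id": i, "metric": "mentions", "value": i * 3.14} for i in range(10_000)]
payload = json.dumps(records).encode("utf-8")

# Compress on the sending side and time it.
start = time.perf_counter()
compressed = gzip.compress(payload)
elapsed_ms = (time.perf_counter() - start) * 1000

ratio = len(compressed) / len(payload)
print(f"original: {len(payload)} bytes, compressed: {len(compressed)} bytes "
      f"(ratio {ratio:.2f}, {elapsed_ms:.1f} ms)")

# Receiving side: decompress and parse back into the original records.
restored = json.loads(gzip.decompress(compressed))
assert restored == records
```

Repetitive JSON like this typically shrinks by a large factor for a few milliseconds of CPU, which is why the trade is so favorable once you are paying per gigabyte of bandwidth.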
We are very excited about both our social media analytics product and the resulting changes to our architecture. The architecture changes are particularly exciting since they enable a whole new line of Web 3.0 business for us. We look forward to showing you both soon.