or how accurately do you need to estimate?
Windows Azure (previously codenamed Red Dog) is Microsoft’s foray into Platform-as-a-Service (PaaS). Rather than incurring large amounts of capital expenditure building, hosting and maintaining a mountain of infrastructure, Azure opens up the possibility of moving our applications to an environment of virtually limitless capacity (within reason) where organisations pay for only the resources that they use on an hour-by-hour basis.
Azure promises capacity to meet every conceivable demand for computing and storage (as long as it’s running on Windows) while saving consumers significant amounts of money. In these austere times, when all budgets are under considerable pressure, that’s a pretty appealing prospect.
The problem is that offering our customers a solution that reduces cost always leads to one tricky question:
Ooh, the killer question! There are, of course, a great many factors that influence the cost of bringing an application into service and running it for the entirety of its life expectancy. I’m going to touch on just the costs of delivering the infrastructure and running it for some arbitrary period of time.
There are two major influences on the overall shape and size of any given infrastructure: the functions it must fulfil and the capacity it must provide.
Functionally, an infrastructure must include networks, routing, load balancing, computing resource, storage and backup capabilities. Many applications can operate perfectly effectively on a fairly standard or ‘commodity’ infrastructure. Others may have more exotic requirements; maybe some special hardware like a telephony system (PBX) or data gathering device. If this is the case for your application, you can forget hosting your application lock-stock-and-barrel in the cloud, at least for now. A hybrid cloud and on-premise solution may be a viable option, but I’m going to avoid discussion of such mongrels for now.
Computing in the cloud is about running your applications on an arbitrary slice of tons-and-tons of commodity infrastructure.
“Yeah, but how much?”
In my experience, customers need to know how much a project is going to cost them before they commit to it. Sizing infrastructure early in a project’s lifecycle is always a bit of a black art. Customers often don’t really know how many users their applications are going to support. They don’t know the blend of transactions that are going to be executed. We (the technical experts) don’t know how we’re going to implement what the customer’s asked for, either because they’ve not yet told us enough or because we haven’t yet worked out how we’re going to build it.
Being the long-established experts that we are, we apply three key techniques to identifying the optimal infrastructure capacity for any given problem domain:
- Experience
- Guesswork
- The Fudge-Factor
The amount and the nature of the experience you have is obviously going to have a massive impact on the effectiveness of point 1 in this list. If this solution is to all intents and purposes one that you’ve delivered time and time again, you’ll have a really strong baseline against which to assess the cost. In my experience, unless you’re delivering vertical solutions, you never really do the same thing twice. Experience will help you estimate those aspects that are similar to things you have delivered before. Experience will also give you some inkling as to how hard the bits that you don’t know are likely to be.
After experience comes guesswork. Hopefully, you’ll know enough to keep the guesswork to a minimum, because this is where the biggest risk to the accuracy of your estimate hangs out.
Finally, we introduce the fudge-factor. This might otherwise be known as contingency. If your experience and guesswork lead you to the conclusion that your database server needs two processors and 4GB RAM, let’s specify four processors and 8GB RAM, just to be on the safe side.
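To make the fudge-factor concrete, here is a minimal Python sketch of the database server example. The `ServerSpec` type, the `apply_fudge_factor` helper and the default multiplier of 2.0 are all my own illustrative assumptions; the multiplier simply mirrors the doubling in the example and is not a recommended value.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical container for a server's headline capacity."""
    processors: int
    ram_gb: int


def apply_fudge_factor(baseline: ServerSpec, factor: float = 2.0) -> ServerSpec:
    """Scale a baseline spec by a contingency multiplier, rounding up.

    The default factor of 2.0 is an assumption that mirrors the
    'two processors becomes four' example, nothing more.
    """
    return ServerSpec(
        processors=math.ceil(baseline.processors * factor),
        ram_gb=math.ceil(baseline.ram_gb * factor),
    )


# What experience and guesswork suggested: two processors and 4GB RAM.
estimate = ServerSpec(processors=2, ram_gb=4)

# What actually goes into the proposal, "just to be on the safe side".
padded = apply_fudge_factor(estimate)
print(padded)
```

Notice that nothing in the calculation justifies the multiplier itself; it is contingency, not measurement.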
If you’re lucky, the scale of the infrastructure you come up with will cope with the actual load (at least until you’re safely ensconced elsewhere). If you’re smart, you’ll have architected the solution to scale by simply adding more hardware at a later date (never mind the cost).
Now, what happens if you over-estimate the scale of the infrastructure that’s required to effectively carry the load? Typically, nobody will ever know! All the customer will know is that the application continues to provide the performance they expected and copes admirably with the ever-increasing number of users. They’ll never realise that the reason they’re having to cut investment in future projects is that your application consumes no more than 10% of the hardware that they’ve paid (and continue to pay) for.
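A rough sketch of that hidden cost, in Python. The `idle_spend` helper and the hardware figure are hypothetical and chosen purely for illustration; only the 10% utilisation comes from the scenario above.

```python
def idle_spend(total_cost: float, utilisation: float) -> float:
    """Return the portion of spend that pays for capacity nobody uses.

    `utilisation` is the fraction of purchased capacity actually
    consumed, between 0.0 and 1.0.
    """
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0.0 and 1.0")
    return total_cost * (1.0 - utilisation)


# Hypothetical hardware spend, in whatever currency you like.
hardware_cost = 100_000.0

# The application only ever consumes 10% of what was bought.
wasted = idle_spend(hardware_cost, utilisation=0.10)
print(f"{wasted:,.0f} of {hardware_cost:,.0f} buys capacity that is never used")
```

On owned hardware that number never appears on anyone's report, which is exactly why over-estimation goes unnoticed.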
Now, think about what happens when they’re being billed by Microsoft for running their application in the cloud.
The first problem is that, for the moment at least, hardly anyone has experience of the costs associated with running applications on Azure. We can compensate by increasing our reliance on guesswork and the fudge-factor. We can still produce a cost. And if our project goes ahead, one of three things will happen:
- Our estimate will be ‘in the right ballpark’ and everyone will be happy.
- Our estimate will be way too low and someone is going to get a really nasty surprise when the bills from Microsoft start rolling in.
- Our estimate will be way too high.
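The three outcomes can be sketched as a simple comparison between the estimate and the bill that eventually arrives. The `classify_estimate` function and its 25% tolerance are assumptions of mine; where 'the right ballpark' really ends is a judgement call for you and your customer.

```python
def classify_estimate(estimated: float, actual: float, tolerance: float = 0.25) -> str:
    """Classify an estimate against the actual cost.

    `tolerance` is a hypothetical definition of 'ballpark': the
    estimate counts as close enough if it is within this fraction
    of the actual cost.
    """
    if actual <= 0:
        raise ValueError("actual cost must be positive")
    ratio = estimated / actual
    if ratio < 1.0 - tolerance:
        return "too low"
    if ratio > 1.0 + tolerance:
        return "too high"
    return "ballpark"


# Illustrative figures only: (estimated, actual) monthly cost.
for estimated, actual in [(1000.0, 1100.0), (400.0, 1000.0), (3000.0, 1000.0)]:
    print(estimated, actual, classify_estimate(estimated, actual))
```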
On the face of it, point 3 seems just fine. After all, we’ve set a level of expectation for the cost of running this application and it’s turned out to be much cheaper.
But let’s think about this more critically. When we put forward our estimate to the business sponsor, they (or maybe the CFO) had to assess the cost-benefit of proceeding with the project at all. Best case, you’ve made the business sponsor feel a fool for jumping through hoops to get financial approval for a project that may otherwise have slipped ‘under the RADAR’. Worst case, the project never got the green light anyway because the ROI wasn’t sufficient.
The other issue, the one that is more personal, is that you’ve made it blindingly obvious that you don’t know what you’re doing; your credibility and reputation are indelibly tarnished. I don't know about you, but even if I don't know what I'm doing, I don't really like to be found out!
When we’re designing for the cloud, we need to take a much more robust and scientific approach to estimating the operational costs of any solution that we assess. In part 2, I’ll explore some of the specific things I think we should be considering, and wonder about just how we’re going to pull it off.
Aug 22 2010, 11:09 PM