Monday, March 21, 2011

Downside of the Cloud and Hosted Solutions

I had occasion to stop off at one of Chicago's premier theatres recently.  I didn't really want to go there, given the weather, but I did want to secure tickets for an upcoming show, and I hadn't been able to reach their ticketing web site for the previous three days.  Thirty minutes later, I left with my order reservation and a promise that I could come back and pick up my tickets once they were able to charge my credit card.  The person behind the ticket counter informed me that their servers were inaccessible due to a problem with their Internet connection, which had been more down than up for most of the week.  On the upside, the theatre is just down the street, so that won't be very painful...  for me.

Like most theatres these days, they have either outsourced their ticketing system or hosted it offsite, both to simplify their cost structure and to make it more accessible to customers.

Like most businesses with the datacenter outside the building, they depend on their Internet provider, and, as it turns out, that is where the problem lies.

My guess is that they have redundant connections, but that doesn't help when the problem is related to issues at the datacenter.  The potential causes are many:
  • Indifferent or incompetent engineers/admins/management
  • Bad documentation
  • Growth (in traffic levels, number of sites or servers hosted)
  • Reliance on marginal or end-of-life components in the network
  • Hardware failure
  • Insufficient or missed monitoring or audits
  • Accident or fire
  • Untested failover scenarios
So while the cloud and outsourcing can reduce asset valuation and payroll obligations on the balance sheet, they can also lead to increased downtime if not properly designed, implemented, documented and, most importantly, tested.

A key facet of reducing this downtime on the client side is redundant IP connections.  But to make this work, you have to test it and verify that failover can occur smoothly, without loss of a transaction (short delays are usually acceptable).
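To make "test it" a little more concrete, here is a minimal sketch of a failover drill in Python.  Everything in it is a stand-in I made up for illustration: the Link and TicketServer classes, the submit_with_failover function and the sample orders do not describe any real ticketing system.  It assumes two connections tried in a fixed order and a server that de-duplicates on a transaction ID, so a retry over the backup connection cannot create a second order.

```python
import uuid


class LinkDown(Exception):
    """Raised when a connection cannot deliver the request."""


class Link:
    """Stand-in for one Internet connection (hypothetical, for illustration)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up

    def send(self, server, txn_id, payload):
        if not self.up:
            raise LinkDown(self.name)
        return server.submit(txn_id, payload)


class TicketServer:
    """Stand-in for the hosted system; records each transaction once."""

    def __init__(self):
        self.orders = {}

    def submit(self, txn_id, payload):
        # Idempotent: a repeated transaction ID does not create a second order.
        self.orders.setdefault(txn_id, payload)
        return txn_id


def submit_with_failover(links, server, payload, txn_id=None):
    """Try each link in order; return (txn_id, name of the link that worked)."""
    txn_id = txn_id or str(uuid.uuid4())
    for link in links:
        try:
            link.send(server, txn_id, payload)
            return txn_id, link.name
        except LinkDown:
            continue  # fall through to the next connection
    raise LinkDown("all links are down")


def failover_drill():
    """Take the primary link down on purpose and confirm nothing is lost."""
    server = TicketServer()
    primary, backup = Link("primary"), Link("backup")
    links = [primary, backup]

    first, used_first = submit_with_failover(links, server, "2 seats, row F")
    primary.up = False  # simulate the outage
    second, used_second = submit_with_failover(links, server, "4 seats, row B")

    return {
        "links_used": [used_first, used_second],
        "orders_recorded": len(server.orders),
        "all_present": first in server.orders and second in server.orders,
    }


if __name__ == "__main__":
    print(failover_drill())
```

A real drill would pull the plug on a live circuit instead of flipping a flag, but the pass criteria are the same: the second order goes out over the backup connection and both orders are on record afterwards.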

However, on the server side (hosted/cloud), there isn't much you can do.  You are at the mercy of the hosting provider's ability to support their product.  Even if you provide the circuit(s), they still have to get them connected to your servers safely, securely, and reliably.  This is no mean feat.

So if you do decide on a cloud or hosted solution, make sure you do the following:
  • Research your prospective provider thoroughly
  • Talk with their other clients
  • Document every aspect and procedure
  • Test, test and test
  • And test some more
Lastly, don't forget to define and test a procedure that you will use when the solution eventually fails, which it will.  I leave you with a couple of mantras of IT Directors everywhere:
  • Murphy is the patron saint of computing.
  • He who has physical control of the assets, rules.
You need to allow for one and obtain the other.

Till next time...
