A few months back I decided to move my Web hosting over to Media Temple for several reasons… a lot more disk space and the ability to host multiple Web sites on a single account (and not just with domain forwarding). However – the really cool part of MT is their Grid-Service (GS) platform. Lately my own Web sites have been affected by performance issues much like the ones seen when the Grid-Service first went online… slow Web site performance, seemingly numerous maintenance windows, etc. Since I only pulled in just over 3100 visits from around 2100 users in November (13 GB transfer), such outages didn’t have much of an impact on my low-volume Web sites.
However – this observational story of mine comes in two parts…
Part One – What is Grid Computing?
You would think that the grid computing model MT has in place would allow individual Web sites to flip around to different servers, and that I should “never” see an outage… So why am I impacted? Is it really a grid system? Well – MT doesn’t really release any information about the technical nature of their hosting plans… There’s some nice “feel-good” information online that really sells the product… But again – light on details.
Now, for my daytime job – I architect & engineer fairly large WebSphere XD clusters for a large financial company. We probably have $10M spent on software and hardware alone… We’re clustered all over the place, N+1, no single points of failure… blah blah blah – and it really seems to work… Most of the time… However – just like MT – we have single points of failure when it comes to infrastructure outside of the application hosting model… For MT – this now seems to be related to their storage systems. So even with great designs, the decisions about what gets engineered for each component really make the difference. I’m guessing that a lot of effort went into storage planning at MT – but when you build systems like this for a living (or for a company’s hosting plans) – engineering efforts really just shift the bottlenecks from one system to another. Fix the application layer and you could start having issues with the database. Fix the database and you highlight storage performance issues (see a trend here?). It’s a moving target – and given how technology works – your customers will see “issues” from time to time… You have to set the expectation that those things happen… That is unless you get yourself into a circular path of destruction (whaa ha ha ha!!) or if you’re Google… Okay – next point…
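To make the "shifting bottleneck" idea from Part One a bit more concrete, here is a rough Python sketch. The tier names and requests-per-second numbers are made up purely for illustration (they are not MT's or my employer's real figures), and the helper functions `bottleneck` and `fix_tier` are hypothetical. The only assumption is that end-to-end throughput is capped by the slowest tier.

```python
def bottleneck(capacities):
    """Return (tier, capacity) for the slowest tier, which caps overall throughput."""
    tier = min(capacities, key=capacities.get)
    return tier, capacities[tier]


def fix_tier(capacities, tier, factor=10):
    """Return a copy of the capacities with one tier upgraded by `factor`."""
    upgraded = dict(capacities)
    upgraded[tier] *= factor
    return upgraded


# Hypothetical requests-per-second ceilings for each tier (illustrative only).
tiers = {"application": 200, "database": 500, "storage": 800}

history = []
current = tiers
for _ in range(3):
    tier, cap = bottleneck(current)
    history.append(tier)
    print(f"bottleneck: {tier} at {cap} req/s")
    # "Fix" the slowest tier, then look again.
    current = fix_tier(current, tier)
```

Each round of "fixing" just hands the problem to the next tier down the line: application, then database, then storage. After three rounds of upgrades, the application tier is the slowest one again.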
Part Two – Taking Responsibility
Even before I moved my Web sites to MT I rambled a bit about how I liked MT’s company philosophy on taking responsibility. Today there was a note simply called “We Apologize” from Demian Sellfors – CEO of MT.
Media Temple would like to apologize to our (gs) Grid-Service customers for the series of issues relating to the (gs) system in the past few months.
The situation with the storage upgrade is particularly frustrating because the vendor supplied update was intended to fix issues – not create new ones.
This one hits home for me at work since we fight such issues with our own vendors… However, the company I work for (and MT) has a responsibility to provide hosting solutions to our business partners. Failures, no matter if by a vendor or by our own actions, are still our responsibility. In this case, MT tells their customers what has happened and what they’re doing to fix it, and apologizes for the effect their service problems have had on those customers. Granted, MT is offering 2 months of credited hosting to help lessen the damages… But I wonder – do most corporate management teams offer their customers 2 free months of service for such ongoing problems? Do corporate customers actually “pay” for their IT services? Could you even offer a refund? Providing ownership and accountability for the services that are provided really drives home the benefits and drawbacks of IT resources when there are real dollars involved…
Either way – it’s once again refreshing to see, in writing, a company’s effort to just do the right thing. (I do wonder what’s happening behind the walls at MT… perhaps not everyone on their team feels disclosure is best for business… I, for one, champion transparency when put into the right context…)