Posted September 2, 2009 7:35 am with 9 comments


Many of the readers of Marketing Pilgrim likely had a rough 100 or so minutes yesterday when Google’s popular Gmail application crashed and went dark via the web. Watching the Twitter stream of panic and rage caught in 140-character snapshots was amusing for a while, but when everyone tries to ‘out Tweet’ the next guy with some witty musing on the event it gets real old real quick. As I sat and pondered life without Gmail for a while, I wondered whether someone in Mountain View was lamenting the removal of the beta tag from the service earlier this year.

In looking for an explanation, it’s best to turn to the source. The old adage that “It’s not that you have a problem but rather how you handle it that is most important” applies here, in a way that Google would rather not repeat. Here are some official words from the Gmail blog:

Gmail’s web interface had a widespread outage earlier today, lasting about 100 minutes. We know how many people rely on Gmail for personal and professional communications, and we take it very seriously when there’s a problem with the service. Thus, right up front, I’d like to apologize to all of you — today’s outage was a Big Deal, and we’re treating it as such. We’ve already thoroughly investigated what happened, and we’re currently compiling a list of things we intend to fix or improve as a result of the investigation.

The blog then goes on to explain the five W’s of the situation in layman’s terms and, in my opinion, provides an appropriate mea culpa while showing that work is under way to ensure this does not happen again to the same degree. What was most interesting was the recognition that the architecture in place at the time of the failure caused a shutdown rather than a slowdown, and that Gmail is opting for slow service over no service in the future. Good choice.

What’s next: We’ve turned our full attention to helping ensure this kind of event doesn’t happen again. Some of the actions are straightforward and are already done — for example, increasing request router capacity well beyond peak demand to provide headroom. Some of the actions are more subtle — for example, we have concluded that request routers don’t have sufficient failure isolation (i.e. if there’s a problem in one datacenter, it shouldn’t affect servers in another datacenter) and do not degrade gracefully (e.g. if many request routers are overloaded simultaneously, they all should just get slower instead of refusing to accept traffic and shifting their load). We’ll be hard at work over the next few weeks implementing these and other Gmail reliability improvements — Gmail remains more than 99.9% available to all users, and we’re committed to keeping events like today’s notable for their rarity.
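To make the “slower instead of refusing traffic” idea concrete, here is a toy Python sketch. It is my own illustration and not Google’s code: the `simulate` function, the even split of load across routers, and the numbers are all assumptions chosen to show the difference between shifting load and degrading gracefully.

```python
def simulate(num_routers, capacity, total_load, policy):
    """Toy model (hypothetical, not Google's design).

    Returns (routers_still_serving, slowdown_factor), where a slowdown
    factor of 1.0 means normal speed and infinity means no service.
    """
    if policy == "degrade":
        # Every router keeps its even share and just gets slower
        # in proportion to how far over capacity it is.
        share = total_load / num_routers
        return num_routers, max(1.0, share / capacity)
    if policy == "shed":
        # An overloaded router refuses traffic and its share shifts to
        # the remaining routers, which pushes them further over capacity.
        alive = num_routers
        while alive > 0 and total_load / alive > capacity:
            alive -= 1
        return alive, (1.0 if alive > 0 else float("inf"))
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    # Assumed numbers: 10 routers, 100 requests/sec each, 1,100 arriving.
    for policy in ("shed", "degrade"):
        print(policy, simulate(10, 100, 1100, policy))
```

In this simplified model, demand just 10% over capacity leaves the “shed” policy with no routers serving at all, while the “degrade” policy keeps all ten up at roughly 1.1 times normal latency.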

For something of this magnitude I give Google a decent grade for being transparent enough to say ‘Yup, we’re not perfect’ while working to get it right for the future. Today will be a great day for all of the Google haters out there. I, on the other hand, have decided that since I am far from perfect myself, expecting perfection from others is, well, a waste of time. Does that mean I will welcome future outages with open arms? Of course not. Based on what I have seen here though, I suspect that Google won’t either.

  • This is a huge fail for Google, considering how admired they are for all the technology they have built internally to scale out their applications.

  • I use Apps For Domain for everything – my contacts, my email, my todo list, my chat, my documents and more recently, my phone. As soon as it went down, I noticed in less than a second. I am now completely stuck, after a few months of being impressed by how I was able to run my entire life on Google.

  • I was lucky since I was out yesterday when Gmail went down. But if I had been at work at that moment, I would have been really mad.

  • Hi Frank

    I wonder what kind of light this shines on cloud computing, if we take the experience of handmade jewellery into account. How many of us would rather have our data on our own machine? I know I would. I see cloud computing building out with an owner host/online host combo. In the event of internet or cloud outages, users will still have access to their info through an intranet version of the service. This flexibility allows update and access to your data and services on demand from anywhere, with peace of mind.

    The current state of internet infrastructure, while extremely good, should not be trusted for all of your data.

  • Pingback: Live-Point Official Blog » Blog Archive » Google Calls Gmail Outage a “Big Deal”

  • A good reminder that no one is perfect, not even Google. I don’t use Google for email so I didn’t really pay attention. Did the outage only affect users of the free service? Does Gmail even have a paid service?

    If only the free service was affected, I always find it funny when people complain about that. If paid users had problems, they have a right to complain. My thoughts probably won’t be popular. You are of course welcome to disagree. cd :O)

  • We have to realize that it takes a lot for a company to apologize for an outage lasting a little over an hour and a half. In today’s world, webpages have a habit of going in and out of service all the time, and as technology improves, the glitches get more intense. It’s important to see that Google did a very honorable thing by apologizing to the general public. Their blog post proves that they are dedicated to solving their issues, or at least that they can put up a secure facade.

  • The outage was a royal pain in the butt, but I learned two valuable lessons about diversification and redundancy:

    1. Back up email contacts (in vCard format, via the “Contacts >> Export” option in Gmail) to your hard drive and maintain a copy on your local machine. I had a few memorized, so I could still function during the downtime through an old Hotmail account. I wouldn’t have lost any productivity if I had had my full contact list ready to import at a moment’s notice.

    2. Diversify your service providers. This is the biggest reason I haven’t yet patched all of my phone and SMS communications through Google Voice.

    It’s a hard lesson to learn but worth repeating, “Don’t put all your eggs in one basket.”

  • Agreed…I did notice Gmail was down that day, but I chalked it up to a slight glitch…an hour and a half is a long time for a big company like that, and a very big deal.