Oh, what a week

June 06, 2008 22:40

Unless you've been on an internet-free vacation or exceptionally drugged for the last week, you might have noticed the site was down. No, there was nothing wrong with your internet. The problem was my server. Or more specifically, the building in which the server is housed. My hosting provider suffered a minor explosion in the building. A tiny little incident in the electrical room, where a small, insignificant transformer exploded and took out only three walls in the process. For reasons that will never be fully explained, the fire department decided that the explosion and resulting fire were sufficiently good reasons not to activate the building's backup generators right away... because the power infrastructure might have been damaged... and they wanted to wait until things were inspected before turning the juice (temporary at that) back on... but hey... what do they know... right? :)

So fine... it takes a day before they can begin to power the building back up, using portable (and permanent) temporary generators. While the second floor of the building was up and running fairly quickly (in like 30 hours or so), the first floor was in far worse shape electrically. Yes, this is where my server is located. It took them a couple of extra days to restore power to that part of the building, and they encountered some snafus with the generators they were using. Apparently, 2-megawatt generators are both hard to come by and complicated machines that are prone to various problems.

Very well, they get power back to the servers. They then start booting them up. This takes a couple more days. At some point, I realize my server hasn't come up yet. It is eventually determined that my server won't boot and will require an OS reload. That was part of today's project. In any event, we're back.

I have purchased a second server at another datacenter in another city. At first it will keep a daily updated copy of this server, so that if something major like this happens again, or even something simple and stupid like the site crashing (which happens far too often in and of itself), I'll be able to keep the site going. Eventually I hope to load balance the site so that both servers handle a share of the load, and either can back up the other should one of them go down. This will take a while to get working, as it will require some rather major changes to some of my software to do live syncs over the internet. More to come later.
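
For the curious, here is a rough sketch of the "share the load, and cover for each other" idea described above. It is only an illustration under assumptions, not the actual setup: the server names, the `Balancer` class, and the `is_up` health check are all made up for the example, and a real deployment would more likely do this at the DNS or proxy level.

```python
class Balancer:
    """Round-robin over whichever servers currently report healthy.

    Hypothetical helper for illustration only; not the site's real code.
    """

    def __init__(self, servers, is_up):
        self.servers = list(servers)  # e.g. ["primary", "mirror"] (made-up names)
        self.is_up = is_up            # callable: server name -> True if reachable
        self._turn = 0                # counts requests handed out so far

    def pick(self):
        # Re-check health on every request so a dead server is skipped at once.
        alive = [s for s in self.servers if self.is_up(s)]
        if not alive:
            raise RuntimeError("no servers available")
        choice = alive[self._turn % len(alive)]
        self._turn += 1
        return choice


if __name__ == "__main__":
    # Simulated health status; a real check might be an HTTP ping with a timeout.
    status = {"primary": True, "mirror": True}
    balancer = Balancer(["primary", "mirror"], lambda name: status[name])

    print([balancer.pick() for _ in range(4)])  # both up: requests alternate
    status["primary"] = False                   # pretend the building lost power
    print([balancer.pick() for _ in range(2)])  # the mirror takes everything
```

While both servers are healthy, requests alternate between them; when one drops out, the other quietly takes all the traffic. The hard part, keeping the data on both in sync, is not shown here.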