This week has been hell
[Mar. 26th, 2009 | 09:25 am]
[MotionTraxx Podcast - Ep. 2]
Okay... seriously... What the heck is going on?
On Sunday I am at Sam's Club getting ready to eat some lunch with my wonderful wife and super-cute daughter when I get a call from my boss. He's at work rebooting a server out at our co-location facility and notices that something is not quite right with our storage server at our local office. He asks me, "Buck, is IISVR6 scheduled to reboot itself occasionally?" Well, it's not, and I really don't like where this line of questioning is headed. This machine was originally configured six years ago, and for some reason it was decided to stripe together the three internal drives into a single unit and then create two partitions on it. Why? Who the heck knows. It was a really unwise decision, and probably wouldn't be done that way again. Anyhow, the first drive in the stripe up and died. Completely. Well, everything on those disks is now gone. Luckily we had a full backup from the previous week. The current week's full backup was in progress when the system went down. 'Yay' to rotating weekly full backups. So, we were able to recover all the important missing data. In the meantime though, we had to get the box back up and running. At midnight that night, we finally had it running again, minus some filesystem permissions issues on the attached RAID-5 array.
So, on Monday, later in the day, I find out that a customer wants a HUGE change made to the way something works on a website we manage for them. And they want it yesterday... Actually, they wanted it months ago, but never told us. So, I leave work at 4:30pm only to come back at 10:00pm and stay until 4:00am. Suck.
Tuesday morning, my boss and I get a couple things done and go for a quick walk down to the convenience store downtown to get some caffeine, since we're both freakin' exhausted, only to get a call from a guy at work saying all our websites are down, nobody can get anywhere, and our firewall is flashing and beeping. Great... It turns out that the firewall crashed. Hard. Complete hardware failure. Luckily we renewed the warranty on it last year through its end of life (this October), so they shipped us a replacement. But it wouldn't get here until the next day, supposedly by 10:00am (as of a couple minutes ago, its ETA is 10:30am).
In the meantime, all our web-based services, email, what-have-you are down and inaccessible. We manage to fudge some stuff and get some of our more important services accessible, but can't get everything going. Lots of unhappy customers. Yay...
I really hope the rest of the week is nice and boring. I've had more than enough excitement in my life this week...
Edit: Did I mention that the only machine that had the firewall configuration software on it was the machine that died over the weekend? And there is a three-hour delivery delay on the replacement machine. FML.