Thursday, March 31, 2005


My service provider had a rather major outage this morning which impacted all services. It sure was fun. I don't know the details of what happened right now, but I'm sure I will find out. I do know that the server lost power and came back up, so I assume that they had some power-related issue just like the big boys over at livejournal :)

Anyway, there were a few small issues after the server came back up, but no worries - all appears to be well now. Look at it as a positive thing: now those problems won't happen again if the server gets randomly powered off and back on again. Let me know if you see any problems.

I also want to take a second to remind you all that I have a couple of offsite backup servers, so your data should not be lost even if that server were to explode (it would just be terribly inconvenient and take a while to recover).


Here is the info on the outage from my provider:

Location: DLLSTX2

Severity: Level 5

Description: At 4:09AM CST, a power failure occurred in a redundant pair of Powerware UPS units feeding power to section B of the DLLSTX2 datacenter. The power failure was caused by a faulty fuse in UPS unit B-1. As the load transferred to UPS unit B-2, the spike in the load created an overloaded breaker and UPS B2 also lost power. This resulted in a power outage to the main power distribution unit feeding section B of the datacenter floor. Emergency teams were notified and the power was restored within 20 minutes. The power continued to cycle until 6:45AM due to the faulty fuse and the inability of the redundant UPS units to remain in bypass mode. Customers may have noticed several power cycles during this time period. Powerware, JT Packard, and electricians found the problem and replaced the faulty hardware and fuses. At 6:45AM, all electrical service was restored to normal and the NOC team began to bring all servers back online. The technical staff is currently placing a console on all servers to verify server restarts. Customers with operating systems that require a file check may have experienced extended downtime during the file check. Powerware and JT Packard will continue to monitor the UPS systems for the next 24 to 48 hours. The Planet does not anticipate any further outages.
