I'll tell you what's going on with the website. Straight from our provider:
22441. Power Failure (Pending Auto-Close)
Opened: 52 minutes ago
Updated: 4 minutes ago
Details:
Our provider has had a significant power failure due to a malfunctioning breaker. They are working to repair it as quickly as possible. This affected all of our sites and services, as well as everything else in their building including their own sites, email, and phone systems.
At this time they have restored power and we are working to bring all of our equipment back online. This includes performing a full filesystem check on our master fileservers, which will take some time.
They have warned us that it is likely that they will need to interrupt power again in about an hour to replace the failed breaker. We are hoping that with added warning we can arrange a clean shutdown and avoid the need to restart the filesystem checks.
We apologize for this issue and we are working as hard as we can to get everything back online as soon as possible while avoiding the chance that we'll end up right back here as soon as we finish.
Response:
From: Jeff
Date: 2007-07-17 13:09:43
We are working on configuring member sites to display a more useful error message while we wait for full service to be restored, and working on restoring full email and DNS service.
We are also exploring the viability of renting a truck and just moving all our equipment to our new datacenter.
Wow, they're willing to rent a freaking truck to try and get their customers back online. Now that's some damn dedication.
(no subject)
Date: 2007-07-17 01:33 pm (UTC)
Of course, you might only get the redundancy you pay for. Dual power feeds would probably cost you more. It's rare that circuit breakers malfunction!
Good luck and hope things come back soon :)
(no subject)
Date: 2007-07-17 01:36 pm (UTC)
I'm not sure. It seems to be more of a datacenter issue since their own webservers and phone systems were affected as well. AFAIK, our webhost did due diligence on their end of things.
(no subject)
Date: 2007-07-17 01:50 pm (UTC)
(no subject)
Date: 2007-07-17 02:13 pm (UTC)
(no subject)
Date: 2007-07-17 02:20 pm (UTC)
They kick ass.
(no subject)
Date: 2007-07-17 03:36 pm (UTC)
Unless they have a very small amount of equipment, the viability isn't that high. Not to mention the specialized equipment needed to safely move a chunk of hardware. You can't just load it on a U-Haul; your failure rate would be too high. Not to mention insurance costs. If they move the equipment themselves, they likely don't have insurance on the hardware.
Smells like an "it sounds good so it will generate goodwill, but we have no intention of following through" kind of statement.
I could be wrong though.
(no subject)
Date: 2007-07-17 03:41 pm (UTC)
Their webhosting environment is clustered, so a few equipment failures wouldn't kill them. (Note that www.anthrocon.org has 3 IP addresses, for example.)
It looks like that won't be happening though. It turns out that the "outage" was planned maintenance gone horribly wrong, and the data center did not inform their customers of any of this until after the fact.
Many customers are irate, to say the least.
wow!
Date: 2007-07-17 04:27 pm (UTC)
(no subject)
Date: 2007-07-17 04:55 pm (UTC)
RUN!!!!
(no subject)
Date: 2007-07-17 04:59 pm (UTC)
(no subject)
Date: 2007-07-17 05:29 pm (UTC)
Hah, like the time that an idiot from the air conditioning vendor decided to flip the plastic cover and push the Big Red "EMERGENCY STOP" button on one of our PDUs, taking out half of System X?
Or the idiot from the UPS vendor who decided to put the datacenter into external bypass so he could tighten a bolt in the UPS, but hadn't activated the external bypass panel, so he shut down the entire DC at once? Then realized he goofed, and immediately turned it back on again?
Good times.
(no subject)
Date: 2007-07-17 05:38 pm (UTC)
gotta agree
Date: 2007-07-18 01:40 pm (UTC)
Re: gotta agree
Date: 2007-07-18 01:55 pm (UTC)
- This was not a surge protection issue; it was data center maintenance gone horribly wrong.
- 24-48 hours of downtime? Er, no. Their systems were back in 3-6 hours.
- They never said their entire system was down. They did, however, state that not everything was running due to the outage.
- How can you make a special error message when machines are down? Two words: reverse proxy (http://www.squid-cache.org/).
- They've gotten DoSed before, and have been up front with us customers about it.
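
For anyone wondering how the reverse proxy point works in practice, here is a rough sketch of the idea, not the host's actual setup. The comment names Squid; this uses only Python's standard library as a stand-in to show the behavior. The backend address, port numbers, and the wording of the maintenance page are all made up for illustration. The proxy forwards each GET to the backend, and if the backend can't be reached at all, it answers with a 503 and a friendlier message instead.

```python
import http.server
import urllib.error
import urllib.request

# Hypothetical values: neither the backend address nor the message comes from the thread.
BACKEND = "http://127.0.0.1:8081"
MAINTENANCE_PAGE = (
    b"<h1>Temporarily unavailable</h1>"
    b"<p>We are restoring service after a power failure.</p>"
)


def fetch_from_backend(backend, path, timeout=2.0):
    """Return (status, body) from the backend, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(backend + path, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # The backend is up and answered with an error status; pass it through.
        return err.code, err.read()
    except OSError:
        # Connection refused, timeout, DNS failure: treat the backend as down.
        return None


def make_handler(backend):
    """Build a request handler class bound to one backend URL."""

    class ProxyHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            result = fetch_from_backend(backend, self.path)
            if result is None:
                status, body = 503, MAINTENANCE_PAGE
            else:
                status, body = result
            self.send_response(status)
            # Simplification: a real proxy would copy the backend's headers.
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # keep the sketch quiet

    return ProxyHandler


def serve(port=8080, backend=BACKEND):
    """Run the proxy in the foreground on localhost."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), make_handler(backend))
    server.serve_forever()
```

Calling `serve()` starts the proxy on port 8080. The point is that the proxy only has to stay up itself; it can sit on a separate machine (or a separate site) from the web servers that are still being fsck'ed, which is how a host can show a custom error page while the real machines are down.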