Datacenter Crash (Server Unreachable for About 17 minutes)

Datacenter is trying to determine why the server went down.

All log files show no errors, including syslog, dmesg, etc., and there are no core files.

Dashboard and stats show no unusual activity and light CPU load prior to crash.

I suspect a power issue in the datacenter. The datacenter team is looking into it.

Thank you for your patience and sorry for the short outage.

reboot   system boot  4.15.0-33-generi Mon Feb 24 07:00   still running
root     pts/0        159.192.217.25   Mon Feb 24 06:37 - crash  (00:22)

I'll post back if they come up with a reason on the datacenter end.

My searches of the file system and logs show no errors or internal server issues.

Brief Update from Datacenter support:

Update 1:

Update 2:

Looks like "servers" in the first update means a datacenter issue with either power or network connectivity.

Searched all log files again... there is nothing indicating a crash except the active ssh session "crash".
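For anyone who wants to run the same check on their own box, here is a rough sketch that picks out sessions flagged as "crash" from the text that `last` prints. The function name and the regular expression are my own, and they assume the line layout shown in the output above (user, tty, remote host, start time, then "- crash"); lines without a remote host will not match.

```python
import re

# Matches lines laid out like:
# root     pts/0        159.192.217.25   Mon Feb 24 06:37 - crash  (00:22)
# Assumption: user, tty and host are each a single whitespace-free token.
CRASH_LINE = re.compile(
    r"^(?P<user>\S+)\s+(?P<tty>\S+)\s+(?P<host>\S+)\s+"
    r"(?P<start>\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}) - crash\s+"
    r"\((?P<duration>[\d+:]+)\)"
)


def find_crashed_sessions(last_output):
    """Return one dict per session that `last` marked as ended by a crash."""
    sessions = []
    for line in last_output.splitlines():
        match = CRASH_LINE.match(line.strip())
        if match:
            sessions.append(match.groupdict())
    return sessions


# The two lines quoted earlier in this thread.
sample = """reboot   system boot  4.15.0-33-generi Mon Feb 24 07:00   still running
root     pts/0        159.192.217.25   Mon Feb 24 06:37 - crash  (00:22)"""

for session in find_crashed_sessions(sample):
    print(session["user"], session["host"], session["start"], session["duration"])
```

A "crash" entry only means the session had no recorded logout before the next boot, so it fits a hard reset or power cut but does not prove one.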

My "uptime bot" shows the following 17 minute down time window (GMT+7):

Final update from data center:

Hmmmm.

The good news is that it was not a server error.

The bad news is that it was a data center network error.


Is anyone else noticing that DNS is not working normally and globally?

I am seeing an unusual situation where DNS names in many domains are not resolving.
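If you want to compare notes, a quick way to test a batch of names from your own machine is something like the sketch below. It is only an illustration: `check_dns` and `failed_names` are names I made up, and it uses the system resolver through `socket.gethostbyname`, so the result reflects whatever resolver your machine is configured to use.

```python
import socket


def check_dns(names, resolve=socket.gethostbyname):
    """Map each name to its resolved IPv4 address, or None if lookup failed.

    `resolve` is injectable so the function can be exercised without
    touching the network; by default it uses the system resolver.
    """
    results = {}
    for name in names:
        try:
            results[name] = resolve(name)
        except OSError:  # socket.gaierror and socket.herror derive from OSError
            results[name] = None
    return results


def failed_names(results):
    """Return the names that did not resolve, sorted for stable output."""
    return sorted(name for name, ip in results.items() if ip is None)
```

Calling `failed_names(check_dns(["example.com", "example.org"]))` with your own list of domains gives the names that failed; an empty list means everything resolved from where you are sitting.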

Ah .....

Turns out the datacenter is at fault again.

Time to move to another provider?

If we can believe that we'll believe anything.

Anyway, what the h*ll does that mean??

An emergency is......e.g. a fire........maintenance is........er!!......should be planned? How do you get a mix of those two???
What is "emergency maintenance"? Answers on a postcard please!!

The only excuse is a power failure and any decent datacenter should have a backup power strategy for that.

All the pros on here know that systems should be designed clustered, high availability, etc. (typical outage 20 seconds). Who are they trying to kid?


Yep.... two days in a row of this nonsense from the data center.

And their response, the best they can do, is "thank you for understanding"..... (GMT +7)

Anyway, guess it is time to look for a new company and datacenter.

Seems Server4You is not the "Server4Us" anymore.

Have you any idea what equipment they are using?
I suppose x86 blade servers, but which ones? And what kind of hypervisors?

Both outages have been due to networking issues in the datacenter. The dedicated server itself has been fine (I did reboot it once when the network was down).

I have all the details of the server of course but the server is not the problem. The datacenter is the problem.

They did contact me from billing and offer to waive the fees next month for two servers; but I have not responded yet.

After two outages in a single 24 hour period my "anger meter" went up dramatically and I am letting it go back down again before I reply back to them.


Here is the reply from the data center just now:

Hi,

I have the same 17 minute outage reported on a number of my domains; these are hosted servers. I'm guessing that they were all out, but only the WordPress sites with Jetpack logged an error (tested from WordPress.com).

Regards

Gull04

We had an outage last week, I think. Officially it lasted around half an hour, but I'm sure it was longer, as I got fed up and went to bed. I mention it because it looks very similar: Swisscom had maintenance on routers and related equipment where no disruption was expected, and in the end maybe a third (if not half) of Switzerland was without TV and phones on the Swisscom box, and without 4G. Many 24/7 alarm services (police, hospitals, etc.) were out of order. In other words, a network issue that propagated, bringing the whole datacenter down.

Thanks for sharing your miseries....

This has been a real PITA.....

I have written some very stern lectures to S4Y, telling them what I think about not notifying customers about data center upgrades; and for doing this during the week and not during the weekend, etc.

Their sales teams have expressed similar frustration, as I was not the only customer in the data center to be outraged at these two unscheduled, unannounced outages within 24 hours.

What the "heck" were they thinking?

What they said was "we did not think there would be a problem, sorry"; but when I used to run data centers back in the old days, we approached upgrades as

  1. Anything that can go wrong will,
  2. Schedule for the weekends and plan well in advance, and
  3. Notify all customers who might be affected many days in advance.

I thought this was standard practice in all data centers!