False service down alert
I got a 256MB box from EaseVPS UK via this forum.
After they replaced the network and changed my IP because of a huge DDoS, their monitoring system has mailed me a "Service Down Alert" repeatedly, about once a day. Jacob said these are false alerts, but I think there are too many alerts in my mailbox.
How sensitive or verbose should service monitoring be? Is a false positive always better than nothing?
Comments
Only you can answer this. If you host something mission critical, you will want to know about every potential downtime. If not, then regular notifications aren't so important, are they?
Agreed, if the monitoring system were my own.
But I cannot customize their notification policy. A "service down alert" is hard to ignore, at least for me.
How frequent would be acceptable for you (whether the alert is false or not)?
So it's enforced and you can't turn it off? If that's the case then you really don't need this. I would ask the host if they can turn it off for me, and if that isn't possible I would just filter it, but carefully, so that you still get other mails, invoices, etc. from the host.
Our dev adjusted the monitoring to do triple checks before marking a host as down. If you give me an IP I can bring up all the previous attempts by the monitor for the past 8 hours.
It could just be that your firewall is blocking the monitoring IP. Submit a ticket or just drop me a PM. I'll also ask the dev to make an opt-out function in the control panel.
It checks at minute intervals, so if this is incorrectly spamming you I can imagine that it is quite annoying.
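The "triple check" idea mentioned above can be sketched roughly as follows. This is only an illustration, not EaseVPS's actual monitor: the names `confirmed_down` and `scripted_probe` are hypothetical, and the probe is simulated rather than sending real ICMP packets.

```python
from typing import Callable, Sequence


def confirmed_down(probe: Callable[[], bool], attempts: int = 3) -> bool:
    """Report a host as down only if every one of `attempts` probes fails.

    `probe` returns True when the host answered. A single success
    clears the host, so one lost packet does not trigger an alert.
    """
    for _ in range(attempts):
        if probe():
            return False
    return True


def scripted_probe(results: Sequence[bool]) -> Callable[[], bool]:
    """Hypothetical stand-in for a real ICMP probe: replays fixed results."""
    replay = iter(results)
    return lambda: next(replay)


# One dropped packet followed by a reply: not treated as down.
flaky = confirmed_down(scripted_probe([False, True, True]))
# Three failures in a row: treated as down.
dead = confirmed_down(scripted_probe([False, False, False]))
print(flaky, dead)
```

With a scheme like this, the number of attempts trades alert speed against false positives: more attempts means fewer spurious mails but a slower reaction to a real outage.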
I don't think all of the alerts were false. For most alerts, my munin masters in the US failed to probe the node on EaseVPS at the same time, and sometimes logged huge packet loss and long RTT. That is why I could not simply ignore them.
@Jacob
I think we've already discussed this on Ticket #410875.
As far as I know, my iptables rules do not block any ICMP packets at this time. Because some iptables modules such as xt_recent are missing from the kernel, I cannot use a dynamic firewall like csf on the UK node.
I'm still seeing Success > [your_ip] and no down alerts. I'll definitely check it more when I'm home, though.
OK, after checking this and going through some things with the dev, I'm pretty confident that no false alerts will be sent now.
In the log, this is what it has displayed for the past few hours. The emails are also coming to an inbox monitored by us, so I'll be checking over that as well.
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Success > [your_ip]
Did you not log when you sent the alerts to your clients? The latest lines, filled with "success", do not show any useful information.
I'll open the ticket up again. The monitor logs everything it does: successes, failures, the emails sent out, and the cases where it did not need to send any at all.
We're always improving it, and will implement another update that was just made which will also be beneficial for the customer. We just need to perfect its accuracy a little more. Processing a few hundred IPs per minute may be pushing it a little, but I'm sure we can sort that out as well.
Edit: Also, this is what we would see if a host became unresponsive:
Failed > 3.3.3.3
Sending alert to: [email protected] for host: 3.3.3.3
[1] => Service Down Alert: 3.3.3.3
[2] => This is an automated alert to inform you that 3.3.3.3 after multiple attempts to reach the IP Address via ICMP has become unresponsive.
I got new service down alert at
@Jacob Could you check the ticket? I replied yesterday but it has still not been answered.
Hi, I responded earlier. My main computer gave up on Friday. I have ordered parts for my new build, which will be here next week, so until then support is going to be slower than usual.
I had to tether to my laptop earlier to respond to the tickets I left last night; I was travelling.
Any urgent issues can be raised by calling us.
Again... I'm wondering why you send these alerts if you know they are all false.
by easestatus.com (8.14.4/8.14.4/Submit) id r16D7oPe005889;
Wed, 6 Feb 2013 16:07:50 +0300
Date: Wed, 6 Feb 2013 16:07:50 +0300
From: root root@easestatus.com
Subject: Service Down Alert: 78.157.
Total downtime has already exceeded your SLA, even if each outage was under 5 minutes.
There is also a mistake on easestatus.com
It's maintenance, not maintainance.
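The SLA point above comes down to simple arithmetic. A rough sketch, assuming a 99.9% monthly uptime SLA (the thread does not state the actual figure) and one outage of up to 5 minutes per day, matching the roughly daily alerts described earlier:

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Downtime budget in minutes for a given uptime SLA over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


# Assumption: 99.9% uptime over a 30-day month (about 43 minutes of budget).
budget = allowed_downtime_minutes(99.9)

# Assumption: one outage per day, 5 minutes each, for 30 days.
observed = 30 * 5

exceeded = observed > budget
print(f"budget={budget:.1f} min, observed={observed} min, exceeded={exceeded}")
```

Under these assumed numbers, short daily outages add up to well over the monthly budget, which is why individually brief drops can still breach an SLA.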
@cause I got notified of a network drop for about a minute, 1:14PM - 1:15PM GMT.
I will phone David shortly to see if we can get a second network drop on this server, and then bond both interfaces.
about a minute?
https://www.dropbox.com/s/c8n2pi13twzpjs1/ping_packetloss-pinpoint=1360151183,1360164143.png
@cause Hi, that is not accurate. We have a one-minute and a five-minute monitoring system set up, both in different datacenters.
The five-minute system did not alert us, and it also phones multiple people.
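Since both sides in this thread are comparing what different monitors saw (the host's one-minute and five-minute systems, and the munin masters in the US), one common way to cut false alerts is to alert only when several independent vantage points agree. A minimal sketch, with hypothetical monitor names and a hypothetical `should_alert` helper; it is not how any system in this thread is known to work:

```python
from typing import Dict


def should_alert(reachable_from: Dict[str, bool], quorum: int = 2) -> bool:
    """Alert only if at least `quorum` vantage points report the host unreachable.

    `reachable_from` maps a monitor name to True when that monitor
    could reach the host during the same time window.
    """
    failures = sum(1 for ok in reachable_from.values() if not ok)
    return failures >= quorum


# Only one monitor lost contact: likely a path problem near that monitor.
single_miss = should_alert({"dc1-1min": False, "dc2-5min": True, "us-munin": True})
# Two independent monitors lost contact: more likely a real outage.
agreed_miss = should_alert({"dc1-1min": False, "dc2-5min": True, "us-munin": False})
print(single_miss, agreed_miss)
```

The catch is that monitors with different intervals can miss short drops entirely, so a five-minute check staying quiet does not by itself prove a one-minute drop never happened.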