False service down alert

causecause Member
edited January 2013 in General

I got a 256MB box from EaseVPS UK via this forum.
After they replaced the network and changed my IP because of a huge DDoS, their monitoring system has mailed me a "Service Down Alert" repeatedly, about once a day. Jacob said these are false alerts, but I think there are too many alerts in my mailbox.

How sensitive/verbose should service monitoring be? Is a false positive always better than nothing?

Comments

  • SpiritSpirit Member
    edited January 2013

    @cause said: false-positive is always better than nothing?

    Only you can answer this. If you host something mission critical, you will want to know about every potential downtime. If not, then regular notifications aren't so important, are they?

  • causecause Member
    edited January 2013

    Agreed, if the monitoring system were my own.
    But I cannot customize their notification policy, and a "Service Down Alert" is hard to ignore, at least for me.

    How frequent would be acceptable for you (whether the alerts are false or not)?

  • SpiritSpirit Member
    edited January 2013

    So it's enforced and you can't turn it off? If that's the case, then you really don't need this. I would ask the host if they can turn it off for me, and if that isn't possible I would just filter it, but carefully, so that you still get other mails (invoices, etc.) from the host.

  • Our dev adjusted the monitoring to perform triple checks before marking a host as down. If you give me an IP, I can bring up all the previous attempts by the monitor for the past 8 hours.

    It could just be that your firewall is blocking the monitoring IP. Submit a ticket or just drop me a PM. I'll also ask the dev to add an opt-out function to the control panel.

    It checks at one-minute intervals, so if this is incorrectly spamming you I can imagine that it is quite annoying.
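
To illustrate the triple-check idea from the post above, here is a minimal Python sketch of a consecutive-failure threshold. It is not EaseVPS's actual monitor: the function name `alerts_for`, the `threshold` parameter, and the sample history are all assumptions made up for this example. Each entry in `results` stands for one probe at a one-minute interval.

```python
from typing import Iterable, List


def alerts_for(results: Iterable[bool], threshold: int = 3) -> List[int]:
    """Return the probe indices at which an alert would be sent.

    results: one boolean per probe, True meaning the host replied.
    An alert fires once when `threshold` probes in a row have failed,
    and not again until the host has replied at least once.
    """
    alerts = []
    consecutive_failures = 0
    for index, replied in enumerate(results):
        if replied:
            consecutive_failures = 0  # any reply resets the streak
            continue
        consecutive_failures += 1
        if consecutive_failures == threshold:
            alerts.append(index)
    return alerts


# Hypothetical history: one isolated miss, then four misses in a row.
history = [True, False, True, False, False, False, False, True]
print(alerts_for(history))  # [5]
```

With a threshold of 3, the isolated miss at index 1 is ignored and the longer outage produces a single alert. Raising the threshold reduces false positives at the cost of slower detection.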

  • I don't think all of the alerts were false. For most alerts, my Munin masters in the US failed to probe the node on EaseVPS at the same time, and sometimes logged huge packet loss and long RTT. That is why I could not simply ignore them.

    @Jacob
    I think we've already discussed this in ticket #410875.
    As far as I know, my iptables rules do not block any ICMP packets at this time. Because some iptables modules such as xt_recent are missing from the kernel, I cannot use a dynamic firewall like csf on the UK node.

  • I'm still seeing "Success > [your_ip]" and no down alerts. I'll definitely check it more when I'm home, though.

  • OK, after checking this and going through some things with the dev, I'm pretty confident that no false alerts will be sent from now on.

    This is what the log has displayed for the past few hours. The emails are also going to an inbox monitored by us, so I'll be checking that as well.

    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]
    Success > [your_ip]

  • causecause Member
    edited January 2013

    Don't you log when you send alerts to your clients? The latest lines, filled with "Success", do not show any useful information.

  • JacobJacob Member
    edited January 2013

    I'll open the ticket up again. The monitor logs everything it does: successes, failures, the emails it sends out, and the cases where it did not need to send any at all.

    We're always improving it, and will implement another update that was just made, which will also be beneficial for the customer. We just need to perfect its accuracy a little more. Processing a few hundred IPs per minute may be pushing it a little, but I'm sure we can sort that out as well. :)

    Edit: Also, this is what we would see if a host became unresponsive:

    Failed > 3.3.3.3
    Sending alert to: [email protected] for host: 3.3.3.3
    [1] => Service Down Alert: 3.3.3.3
    [2] => This is an automated alert to inform you that 3.3.3.3 after multiple attempts to reach the IP Address via ICMP has become unresponsive.

    @cause said: You did not log when you sent the alerts to your clients? The latest few lines filled by "success" do not show any useful information.

  • I got a new service down alert at:

    Date: Thu, 31 Jan 2013 09:20:27 +0300

    @Jacob Could you check the ticket? I replied yesterday but it has still not been answered.

  • Hi, I responded earlier. My main computer gave up on Friday. I have ordered parts for my new build, which will be here next week, so until then support is going to be slower than usual.

    I was travelling, so I had to tether my laptop earlier to respond to the tickets I left last night.

    Any urgent issues can be raised by calling us.

  • causecause Member
    edited February 2013

    Again... I'm wondering why you send these alerts if you know they are all false.

    Received: (from root@localhost)
        by easestatus.com (8.14.4/8.14.4/Submit) id r16D7oPe005889;
        Wed, 6 Feb 2013 16:07:50 +0300
    Date: Wed, 6 Feb 2013 16:07:50 +0300
    From: root <root@easestatus.com>
    Subject: Service Down Alert: 78.157.

    The total downtime has already exceeded your SLA, even if each outage lasted less than 5 minutes.
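
The point about many short outages adding up can be checked with simple arithmetic. The sketch below is illustrative only: the thread never states the SLA percentage or the number of outages, so the 99.9% figure, the 30-day period, and the ten 5-minute outages are assumptions, as are the function names.

```python
from typing import Iterable


def allowed_downtime_minutes(sla_percent: float, period_days: int = 30) -> float:
    """Downtime budget in minutes for an uptime SLA over the given period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


def exceeds_sla(outage_minutes: Iterable[float], sla_percent: float,
                period_days: int = 30) -> bool:
    """True if the summed outages are larger than the downtime budget."""
    budget = allowed_downtime_minutes(sla_percent, period_days)
    return sum(outage_minutes) > budget


# Assumed figures: 99.9% uptime over 30 days, ten outages of 5 minutes each.
print(round(allowed_downtime_minutes(99.9), 1))  # 43.2
print(exceeds_sla([5] * 10, 99.9))               # True
```

Under these assumed numbers the budget is about 43 minutes per month, so ten outages of 5 minutes each would already exceed it even though no single outage looks serious.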

  • IshaqIshaq Member
    edited February 2013

    There is also a mistake on easestatus.com

    It's maintenance, not maintainance.

  • JacobJacob Member
    edited February 2013

    @cause I was notified of a network drop lasting about a minute, 1:14PM - 1:15PM GMT.

    I will phone David shortly to see if we can get a second network drop on this server, and then bond both interfaces.

  • causecause Member
    edited February 2013

    about a minute?
    https://www.dropbox.com/s/c8n2pi13twzpjs1/ping_packetloss-pinpoint=1360151183,1360164143.png

    @Jacob said: I got notified of a network drop for about a minute 1:14PM - 1:15PM GMT.

  • @cause Hi, that is not accurate. We have a one-minute and a five-minute monitoring system set up, each in a different datacenter.

    The five minute system did not alert us, and it also phones multiple people.
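
Several posts in this thread compare what different observers saw at the same time: the host's two monitors in different datacenters and the customer's own Munin masters. One common way to combine such observations is to treat a host as down only when a minimum number of vantage points agree. The sketch below is a generic illustration of that idea, not a description of how either party's monitoring works. The vantage point names and the `quorum` parameter are hypothetical.

```python
from typing import Dict


def confirmed_down(reports: Dict[str, bool], quorum: int = 2) -> bool:
    """Decide whether an outage is confirmed.

    reports maps a vantage point name to True if its probe failed
    during the time window being examined.
    """
    failures = sum(1 for failed in reports.values() if failed)
    return failures >= quorum


# Hypothetical window: two of three vantage points saw a failure.
reports = {
    "monitor-dc1": True,
    "monitor-dc2": False,
    "customer-munin-us": True,
}
print(confirmed_down(reports))  # True
```

A single failing vantage point often points to a problem on the path from that one location, while agreement between independent locations makes a real outage more likely.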
