Server Monitoring - the frustrations of Cry Wolf

I have three providers for server monitoring and avail of their free services: Hetrix, Zilore and Uptime Robot.
In the main, they work fine and I monitor for the default web page of a hostname but..
Some servers fluctuate between Up and Down at regular intervals, sometimes alerting that they have been down for a few seconds (!), though downtimes of hours have been reported.
I run CSF and have been extending the ping rate from the default 1/s to 12/s (seems reasonable) but, and here's the rub, they shouldn't be sending ICMP pings at all, only probing port 443 (the webserver). The upshot is that CSF sees these probes (to heck knows what other ports) as port scans and rightly blocks the source IP.
I've 'spoken' with Hetrix Tools Support in the past and they say that this is by design. I might add that in no way am I going to whitelist an external 3rd party, just because of their poor implementation. I want a single port (443, 80, ICMP, whatever) to be monitored and that is all.
Just today, I flushed the firewalls of two VPSes of around 1000 permanent blocks, just to get these monitors 'talking' to the servers again. That was short-lived!
Anyone else experienced these issues?

Comments

  • yes

  • Why not use the csf.ignore file for the IPs of Hetrix, Zilore and Uptime. That way you will still be notified if CSF picks up a block but CSF won't add the IP to the csf.deny file.

    Thanked by 1: dedicados
  • @LeonDynamic said:
    Why not use the csf.ignore file for the IPs of Hetrix, Zilore and Uptime. That way you will still be notified if CSF picks up a block but CSF won't add the IP to the csf.deny file.

    That's the cop-out certainly and is otherwise known as whitelisting. Why should the b'stards be port scanning in the first place? Due to the continuous attacks on all servers, I turn off alerting for blocks anyway and most attackers get a permanent ban. Similar goes for the ignorant IP neighbours who allow their servers to send out broadcast packets (predominantly Windows and Plex).
    I get enough alerting emails in my inbox without CSF saturating it.

  • vimalware Member
    edited June 2019

    I don't get any false positives on zilore when monitoring https and ssh endpoints (1min).

    For ssh I use 3 attempts max per 10/20min in fail2ban.

    Thanked by 1: AlwaysSkint
  • jh Member

    We have a cron that gets our monitoring company's IPs and writes them to a file, then include that file in csf.ignore/allow depending on requirements. It's not that hard.

    Thanked by 2: vimalware, NodePing
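The cron approach jh describes could look roughly like the sketch below. It is only an illustration: the URL, the output path and the function names are hypothetical, and it assumes the monitoring provider publishes its probe addresses as a plain-text list, one IP or CIDR range per line. It also assumes your CSF version honours an `Include /path/to/file` line in csf.ignore or csf.allow, so check the CSF readme before relying on it.

```python
import ipaddress
import urllib.request

# Hypothetical values: substitute the list your provider actually publishes
# and a path of your choosing that csf.ignore/csf.allow pulls in via Include.
MONITOR_IP_URL = "https://example.com/monitor-ips.txt"
OUTPUT_PATH = "/etc/csf/monitor-ips.include"


def parse_ip_list(text):
    """Return the valid IPs/CIDR ranges found in text, deduplicated and sorted.

    Comments (after '#'), blank lines and anything that does not parse as an
    address or network are skipped, so a broken download cannot inject junk.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        try:
            entries.append(str(ipaddress.ip_network(line, strict=False)))
        except ValueError:
            continue
    return sorted(set(entries))


def update_ignore_file(url=MONITOR_IP_URL, path=OUTPUT_PATH):
    """Fetch the provider's list and rewrite the include file."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    entries = parse_ip_list(text)
    if not entries:
        # Keep the previous file rather than writing an empty one.
        raise SystemExit("no valid addresses fetched; existing file left untouched")
    with open(path, "w") as fh:
        fh.write("\n".join(entries) + "\n")
```

A cron entry would call `update_ignore_file()` from a small wrapper script and then reload the firewall (for example with `csf -r`) so the new list takes effect.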
  • AlwaysSkint Member
    edited June 2019

    @jh

    ..include that file in csf.ignore/allow depending on requirements. It's not that hard.

    It's a piece of piss to add in entries to csf ignore/allow - that is not the point. They shouldn't need to be added in the 1st place if they didn't do port scanning. Why should you ignore an external source from, in effect, attacking your server? CSF typically tracks 10-12 ports for scanning attempts, so if they're only scanning, say 443, then the trigger wouldn't come into effect.
    @vimalware - I use CSF to monitor ssh attempts, so no point in having fail2ban do that task too. I presume that you don't have any other form of intrusion detection. :-/
    I'm beginning to think that a simple 'mesh' of ping/webserver tests across my various VPS is the way forward. At least that way one knows what packets are being sent/responded to.
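A minimal sketch of the kind of 'mesh' check mentioned above, assuming each VPS runs it against its peers: it opens one TCP connection to one port (443 by default) and sends nothing else, so it is clear exactly which packets reach the target. The function names and the peer list are hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Only a single connection to the given port is attempted: no ICMP ping,
    no traceroute, no other ports.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_peers(peers, port=443, timeout=5.0):
    """Check every peer on the same port; return a {host: is_up} mapping."""
    return {host: tcp_port_open(host, port, timeout) for host in peers}
```

For example, `check_peers(["vps1.example.com", "vps2.example.com"])` run from cron on each server gives every node its own view of the others.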

  • MasonR Community Contributor

    AlwaysSkint said: I've 'spoken' with Hetrix Tools Support in the past and they say that this is by design. I might add that in no way am I going to whitelist an external 3rd party, just because of their poor implementation. I want a single port (443, 80, ICMP, whatever) to be monitored and that is all.

    AlwaysSkint said: That's the cop-out certainly and is otherwise known as whitelisting. Why should the b'stards be port scanning in the first place? Due to the continuous attacks on all servers, I turn off alerting for blocks anyway and most attackers get a permanent ban.

    I think you're misunderstanding what Andrei told you. The monitoring services aren't port scanning. Because they are probing your port every x minutes (as part of the uptime check), your security mechanisms are believing that the monitoring services are port scanning your machine, which simply isn't the case.

  • AlwaysSkint Member
    edited June 2019

    Actually, looking specifically at HetrixTools diagnostics..
    The monitor is set to webserver (443) but it also pings and does a MTR, even though that wasn't requested - so there's at least 3 ports being probed. It's not a belief, it's fact.
    I'm in the process of deleting so-called webserver monitoring, in favour of only a ping request. Let's see if that request is adhered to.

  • HetrixTools suspended, as still too many false alerts even with ping. HostDoc's 'Gold' timing out to New York & London - I don't think so!

  • AnthonySmith Member, Patron Provider

    AlwaysSkint said: The monitor is set to webserver (443) but it also pings and does a MTR, even though that wasn't requested - so there's at least 3 ports being probed. It's not a belief, it's fact.

    Which port is being probed with ICMP then?

    Thanked by 1: AlwaysSkint
  • @AnthonySmith said:
    Which port is being probed with ICMP then?

    Yeah, I know, not strictly correct, but I perhaps wrongly assume CSF is counting this in its tally.
    Setting up Nagios on my freebie FinalHosting server

  • AnthonySmith Member, Patron Provider

    Not really how CSF works. Anyway, I think you are being a bit hard on them, expecting them to develop around every possible third-party firewall application, even more so when the firewall has literally given you a method of fixing this in 2 seconds.

    I know that is not the point, but really, in the grand scheme of things, it does the job and for free or next to free I would say that is good enough.

    I gave up on all third-party monitoring platforms long ago. None of them are 'great': they either totally lack meaningful information or generate so many false positives that it impacts your life on a daily basis.

    There is 1 exception but I always forget the name of it, @oliver was the one that put me on to it.

    It is a monitoring platform that thought of everything; it makes the rest look like a child's toy, has responsive developers and REALLY never gives false positives.

    But as you would expect it is very expensive compared to the toy monitoring services out there.

    So with that in mind I just wrote my own: a simple token-based system that requires agreements in order to alert. Essentially my own infrastructure monitors itself and, as I have a presence in 3 countries, that works just fine.

    Thanked by 1: AlwaysSkint
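The post above gives no implementation details, so the following is only one possible reading of "requires agreements in order to alert": each monitoring location reports whether it can reach the target, and an alert fires only when enough locations agree that it is down. The function name, the location labels and the quorum of 2 are all assumptions made for illustration.

```python
def should_alert(reports, quorum=2):
    """Decide whether to alert, given per-location reachability reports.

    reports maps a location name to True (target reachable) or False
    (target unreachable). An alert is raised only when at least `quorum`
    locations independently report the target as down, so a single node
    with a bad route cannot cry wolf on its own.
    """
    down_votes = sum(1 for is_up in reports.values() if not is_up)
    return down_votes >= quorum
```

With three locations, `should_alert({"uk": False, "us": True, "de": True})` stays quiet, while two or more `False` reports trigger the alert.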
  • For full disclosure, server management/monitoring used to be my speciality for one of the big 4 IT companies, so I'm well aware of deploying agents etc. but it's well over-the-top for a simple ping response or site check.
    On a freebie monitor, I'm quite happy with say a 5 minute check, it's not like the websites that I host are mission critical. I only run a small scale operation. ;)
    Your token system does sound interesting and I'm steering closer to a DIY solution.

  • I love HetrixTools but the thought of creating my own monitoring solution has crossed my mind a lot, especially with the mountains of low priced vps in different locations available so I can definitely agree with you there.

    Problem is, do you reinvent the wheel or find something on GitHub and hack it until it works? Lol

  • perennate Member, Host Rep
    edited June 2019

    We use our own in-house simple uptime monitoring system to monitor our own servers and haven't had any issues. It's available here: https://github.com/lunanode/gobearmon (you just need three or four servers to set it up)

    You can configure monitoring interval (e.g. once per five minutes) and notification delay (e.g. only notify after it's been down for three monitoring intervals). On each monitoring interval, if downtime is detected, it must be confirmed by two other servers.

    It only monitors uptime though, not processes, so if you actually deploy this you'll want a separate system that monitors whether the monitoring system is actually running on each server that you deployed it on ; ).

    (If you're open to trying it out, could give you a free account on our platform for uptime monitoring.)

    Edit: to be clear, this is simple uptime monitoring (but made robust with redundant checks and such), you can configure it to send you an alert (e-mail, SMS, etc.) when something goes down but historical data and other functionality are very limited.

    Edit2: oh yeah also I run a free service https://bearmon.com/ but this is an older version (https://github.com/uakfdotb/pybearmon) and not quite as reliable. Similar design though, and it's free.
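The notification delay perennate describes (only notify after several monitoring intervals of downtime) can be sketched as a small counter. This is not gobearmon's code, just an illustration of the idea; the class name and the default of three intervals are assumptions, and the `is_up` value passed in is taken to be the result after any cross-server confirmation.

```python
class DowntimeTracker:
    """Suppress alerts until a target has failed several checks in a row."""

    def __init__(self, notify_after=3):
        self.notify_after = notify_after
        self.consecutive_failures = 0

    def record(self, is_up):
        """Record one monitoring interval's result.

        Returns True exactly once per outage: on the check where the number
        of consecutive failures reaches notify_after. Any successful check
        resets the counter.
        """
        if is_up:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures == self.notify_after
```

With a five-minute interval and `notify_after=3`, a blip lasting one or two checks never produces an alert, at the cost of learning about a real outage roughly ten to fifteen minutes after it starts.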

  • First impressions of Nagios are good - I installed from source using the supplied PDF, then upgraded (I should've checked!).

  • @perennate Thanks! Cracking offer - lemme see how I get on with Nagios first. Your offer would certainly save quite a bit of time, over installing on my servers. :-)

  • Xsltel Member, Host Rep

    I prefer Zabbix for monitoring stuff, graphs, etc..

    Thanked by 1: AlwaysSkint
  • HBAndrei Member, Top Host, Host Rep

    @AlwaysSkint I'm sorry you've been having such a bad experience with our monitoring nodes being blocked in your firewall. This may be the result of our platform performing Network Diagnostics (taking PING and MTR samples when downtime is detected). If you wish to disable this feature please open a support ticket.

    Cheers.

    Thanked by 1: vimalware
  • Neoon Community Contributor, Veteran

    Same as @perennate I am using a selfmade software since 2017 to monitor my servers externally.

    I ditched Pingdom and Statuscake a while back, since they only offer 5 minute intervals, which is far too high to monitor anything in my opinion.
    Props to @perennate for offering 60s for free, most don't.

    The rest wanted $$ money to monitor 30+ servers which was a joke and still is in my view.
    So Night-Sky was born, works fine for my needs.
    https://github.com/Ne00n/Night-Sky

    TCP/HTTP checks down to 10s, if you like big log files.

  • AlwaysSkint Member
    edited June 2019

    @HBAndrei
    Thanks for reaching out. I have asked previously about this in Tickets 8152481916 & 3926052609.
    As for CSF, the default PS_PORTS is 0:65535,ICMP (as I suspected above) - it'll be like that for a reason. It's tempting to add UDP broadcast too. ;)

    Nagios is going well monitoring both ping and HTTP on one remote server. No triggers.
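To make the PS_PORTS point concrete, here is a rough sketch of how a port scan tracker of this general kind can work. It is not CSF's code. It assumes the tracker counts firewall-logged blocks per source IP inside a time window and bans the source once a limit is exceeded; the class name and the limit and interval values are illustrative, not CSF defaults.

```python
from collections import defaultdict, deque


class ScanTracker:
    """Count blocked packets per source IP within a sliding time window."""

    def __init__(self, limit=10, interval=300):
        self.limit = limit          # blocks tolerated inside one window
        self.interval = interval    # window length in seconds
        self.events = defaultdict(deque)

    def record_block(self, src_ip, timestamp):
        """Record one blocked packet from src_ip at timestamp (seconds).

        Returns True when src_ip has more than `limit` blocks within the
        last `interval` seconds, i.e. when it would be treated as a scan.
        """
        window = self.events[src_ip]
        window.append(timestamp)
        # Discard events that have fallen out of the window.
        while window and timestamp - window[0] > self.interval:
            window.popleft()
        return len(window) > self.limit
```

Under that assumption, with ICMP included in the tracked set, a monitor's pings that get dropped by a rate limit would add to the same tally as blocked TCP probes, which would fit the behaviour described in this thread.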

  • vimalware Member
    edited June 2019

    Sourcegraph commissioned Matt Holt (of Caddy fame) to build this in Golang a while ago: https://github.com/sourcegraph/checkup
    It looks like a healthcheck page. I don't know if it includes notification triggers.

    Looks like the easiest to deploy (single binary)

    Thanked by 2: AlwaysSkint, uptime