Project idea: Services checker and re-starter

Go59954Go59954 Member
edited May 2012 in General

Hi,

I had an issue earlier with one of my VPSes that became annoying: services collapsing and leaving the server unresponsive. It is apparently caused by a memory shortage, possibly due to overselling on the provider's side.

I opened a thread about that issue, and it somehow led to a suggestion that might be useful anyway:

A local script that runs at a fixed interval from a cron job, checks the user's critical services (e.g. Apache/Nginx, MySQL server, DNS...), and only sends an email notification when it finds a service down (the address could be one that forwards messages to SMS). In other words: no notification sent = things are (supposedly) up and running.
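
A rough sketch of that cron-driven check, for illustration only. The service names, the address and the `--run` switch are made-up placeholders, and it assumes `pgrep` and a working `mail` command are installed:

```shell
#!/bin/bash
# Sketch of the cron-driven check: stay silent unless something is down.
# SERVICES and ALERT_EMAIL are hypothetical placeholders.
SERVICES="${SERVICES:-nginx mysqld named}"
ALERT_EMAIL="${ALERT_EMAIL:-admin@example.com}"

# Succeeds if a process with exactly this name is running.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Prints the names of the given services that are not running, one per line.
list_down() {
    local svc
    for svc in "$@"; do
        is_running "$svc" || echo "$svc"
    done
}

# Sends one mail listing everything that is down; sends nothing otherwise.
report() {
    local down
    # SERVICES is left unquoted on purpose so it splits into separate names.
    down=$(list_down $SERVICES)
    if [ -n "$down" ]; then
        echo "Down on $(hostname):" $down | mail -s "Service alert" "$ALERT_EMAIL"
    fi
}

# Only act when called with --run (a made-up switch), e.g. from crontab.
if [ "${1:-}" = "--run" ]; then
    report
fi
```

Run from cron every few minutes with `--run`, it stays quiet while everything is up, which matches the "no notification = all good" idea.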

However, rds100 pointed out that:

@rds100 said: Local bash script is not the best solution, because it can die too, the same way as other services die. The monitoring must be remote.

That's true: the script itself may fail to run at times when no memory is available, or for other performance reasons. So a remote monitoring service was suggested.

However, I once more suggested a local script, but one that also listens on a port and can be monitored externally by a third-party monitoring service, which notifies by SMS or email only when the script itself is down. As I see it, in this case the monitoring script needs to run in the background all the time, so there is probably no need for cron jobs.

So, the idea for now is:

A script runs locally on the VPS, stays in the background and monitors user-specified critical services. Once any of them has been down for x consecutive checks, an email notification naming the failed service is sent to the user. Alternatively, the monitoring script automatically tries to restart the service up to x times before sending each notification. If the service starts successfully, the notification states that the service went down but was restarted, and that no action is required from the user.
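
The "try to restart x times, then notify" part could look roughly like the sketch below. The function name and its arguments are invented for illustration; the check and start commands are passed in by the caller, so nothing here is tied to a particular distro:

```shell
#!/bin/bash
# Sketch of "try to restart up to x times, then report the outcome".
# usage: ensure_service CHECK_CMD START_CMD MAX_ATTEMPTS [WAIT_SECONDS]
# Prints one of: ok, recovered, failed
# Returns 0 for ok/recovered and 1 for failed.
ensure_service() {
    local check_cmd="$1"
    local start_cmd="$2"
    local max_attempts="$3"
    local wait_seconds="${4:-5}"
    local attempt=0

    # Nothing to do if the service already passes its check.
    if sh -c "$check_cmd" >/dev/null 2>&1; then
        echo ok
        return 0
    fi

    while [ "$attempt" -lt "$max_attempts" ]; do
        attempt=$((attempt + 1))
        sh -c "$start_cmd" >/dev/null 2>&1
        sleep "$wait_seconds"   # give the service time to come up
        if sh -c "$check_cmd" >/dev/null 2>&1; then
            echo recovered
            return 0
        fi
    done

    echo failed
    return 1
}
```

A caller would then mail "went down but was restarted, no action required" on `recovered` and "still down" on `failed`, and send nothing on `ok`.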

As for the case where the monitoring script itself goes down: since the script listens on a port, an external third-party monitoring service is set up to watch its uptime, and that service sends an email notification once the monitoring script (or the whole VPS) is down or not responding.

That's it. I thought I'd throw it out there to see whether it's useful or not. I also don't think I'll be trying to build it myself anytime soon.

Comments

  • AldryicAldryic Member

    You could take a look at @NickM 's OpenStatus. It does service monitoring, and email notification.

  • Go59954Go59954 Member
    edited May 2012

    @Aldryic said: You could take a look at @NickM 's OpenStatus. It does service monitoring, and email notification.

    Yes, it can do that more or less. The difference is that this one is supposed to run locally on the same VPS. It should also try to start a dead service before sending each notification.

  • NickMNickM Member

    Using a script to try to restart a service that died can lead to even worse issues, which is why OpenStatus doesn't do it. For example, with MySQL, if you're using replication and it dies and tries to restart? You might end up with inconsistent data on your slaves and you'll have to sort it out. Granted, if you're using MySQL replication, you should know that and disable it, but still...

  • subigosubigo Member
    edited May 2012

    Sign up for http://www.uptimerobot.com and create a custom port check. Done and done.

    edit: Oh, you want something that automatically restarts the service too. Just use a simple bash script for that... or just use something like http://supervisord.org

  • dannixdannix Member

    If you need monitoring and restarting check monit

    Thanked by 1marrco
  • Go59954Go59954 Member
    edited May 2012

    @NickM said: Using a script to try to restart a service that died can lead to even worse issues, which is why OpenStatus doesn't do it. For example, with MySQL, if you're using replication and it dies and tries to restart? You might end up with inconsistent data on your slaves and you'll have to sort it out. Granted, if you're using MySQL replication, you should know that and disable it, but still...

    Thanks :)
    Well, it would be a good addition if it could actually be included in OpenStatus. But I don't think it should be written off as problematic because of the few cases where it causes trouble. I mean, starting the web server, and DNS if used, probably won't cause a problem, and neither will starting the MySQL server in most cases (given that replication is the main source of trouble). That could be covered in the README, plus a bold notice just in case. So maybe consider something similar in a future update ;)

    It would be great if the start commands were added by the user himself, as needed, in the config. If he didn't add a start command on the MySQL line, OpenStatus wouldn't try to start MySQL. And if he didn't add a start command for any service, nothing would be started (the same as the feature being disabled). A note could also be added next to the MySQL line, and next to any other service that might cause trouble when restarted automatically.
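
To make the opt-in idea concrete, here is a small sketch. The `name=start command` file format and the function name are invented for illustration and are not anything OpenStatus actually supports:

```shell
#!/bin/bash
# Sketch of an opt-in restart config. The file format is hypothetical:
#   service_name=start command
# An empty right-hand side means "monitor only, never restart".

# usage: start_command_for CONFIG_FILE SERVICE
# Prints the configured start command; returns 1 if none is set.
start_command_for() {
    local config_file="$1" wanted="$2" name cmd
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS='=' read -r name cmd || [ -n "$name" ]; do
        case "$name" in
            ''|'#'*) continue ;;      # skip blank lines and comments
        esac
        if [ "$name" = "$wanted" ]; then
            [ -n "$cmd" ] || return 1
            echo "$cmd"
            return 0
        fi
    done < "$config_file"
    return 1
}
```

With a line like `mysql=` in the file, the checker would still watch MySQL but never try to start it, which is the "feature disabled unless the user fills it in" behaviour described above.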

  • Go59954Go59954 Member

    @subigo said: Sign up for http://www.uptimerobot.com and create a custom port check. Done and done.
    edit: Oh, you want something that automatically restarts the service too. Just use a simple bash script for that... or just use something like http://supervisord.org

    Thanks ;) I have an uptimerobot account; I'll check out the other suggestion.

  • Go59954Go59954 Member

    @dannix said: If you need monitoring and restarting check monit

    Thanks for the suggestion, I might try monit. Even though there are full-featured monitoring services that probably do most of this, I was looking for a simple way to get it.

  • the simplest way would be a script in cron

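    #!/bin/bash
    # Look for the program in the process list, ignoring the grep itself
    ps -ef | grep -v grep | grep Your-Prog
    # grep exits with 1 when nothing matched, i.e. the program is not running
    if [ $? -eq 1 ]
    then
        restart your program   # placeholder: put your restart command here
    fi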

  • camargcamarg Member

    from my blog here akamaras.com/linux/linux-script-to-check-if-a-service-is-running-and-start-it-if-its-stopped/

    #!/bin/bash

    ### edit the following
    service=service_name
    email=user@example.com   # placeholder; set your own notification address
    ### stop editing

    host=$(hostname -f)
    # Count matching processes, excluding the grep command itself
    if (( $(ps -ef | grep -v grep | grep "$service" | wc -l) > 0 ))
    then
        echo "$service is running"
    else
        /etc/init.d/"$service" start
        # Check again to see whether the start attempt worked
        if (( $(ps -ef | grep -v grep | grep "$service" | wc -l) > 0 ))
        then
            subject="$service at $host has been started"
            echo "$service at $host wasn't running and has been started" | mail -s "$subject" "$email"
        else
            subject="$service at $host is not running"
            echo "$service at $host is stopped and cannot be started!!!" | mail -s "$subject" "$email"
        fi
    fi
    
  • Go59954Go59954 Member
    edited May 2012

    @VPSCheap_net said: the simplest way would be a script in cron

    #!/bin/bash
    ps -ef | grep -v grep | grep Your-Prog
    if [ $? -eq 1 ]
    then
        restart your program
    fi

    Thank you! Going to test that, I guess cron is the way to go to make it simple.

    @camarg said: from my blog here akamaras.com/linux/linux-script-to-check-if-a-service-is-running-and-start-it-if-its-stopped/

    Thank you for that great script. I will test it for a while to see how it behaves under real conditions, once the issue recurs.

  • raindog308raindog308 Administrator, Veteran

    Good Lord, there's a lot of wheel reinvention here.

    The Unix way(*) to start a process if it fails is to use inittab, which some distros have retired in favor of upstart. 'respawn' is the configuration you want. This has been in Unix since at least the early 90s.

    (*) at least for predominantly SysV-derived Unices like Linux. I don't know what the equivalent is in BSD off hand.

    Thanked by 2marrco MrDOS
  • Take a look at my cron/screen based services restarter/starter. I use it and it works well :)

    https://github.com/maxexcloo/User-Daemon

  • roytam1roytam1 Member

    There is a worse case: the service process (for example lighttpd) is there but sits and does nothing (the service stalls).

    my freebsd cron script:

    #!/bin/sh
    fetch -o /dev/null -T 3 http://localhost/echo.php > /dev/null 2>&1
    if [ $? -gt 0 ]; then
            /usr/local/etc/rc.d/lighttpd restart
            /usr/local/etc/rc.d/php-fpm restart
    fi
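
For a Linux box without `fetch`, roughly the same stall check can be sketched with `curl`. The function names are invented for illustration, and the URL and restart command are whatever the caller passes in:

```shell
#!/bin/bash
# Sketch of the same stall check using curl instead of FreeBSD's fetch.

# Succeeds only if the URL answers with a non-error status in time.
probe_http() {
    local url="$1" timeout="${2:-3}"
    curl --fail --silent --output /dev/null --max-time "$timeout" "$url"
}

# usage: restart_if_stalled URL RESTART_CMD [TIMEOUT]
# Runs RESTART_CMD when the URL does not respond properly.
restart_if_stalled() {
    local url="$1" restart_cmd="$2" timeout="${3:-3}"
    if ! probe_http "$url" "$timeout"; then
        sh -c "$restart_cmd"
    fi
}
```

Called from cron with something like `http://localhost/echo.php` and the restart commands for the web server and PHP, this catches a process that is alive but no longer answering requests.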
    
  • subigosubigo Member
    edited May 2012

    @raindog308 said: Good Lord, there's a lot of wheel reinvention here.

    The Unix way(*) to start a process if it fails is to use inittab, which some distros have retired in favor of upstart. 'respawn' is the configuration you want. This has been in Unix since at least the early 90s.

    (*) at least for predominantly SysV-derived Unices like Linux. I don't know what the equivalent is in BSD off hand.

    Respawning a web service like Apache (and depending on how you have Apache setup it won't even work) or MySQL through init is a pain in the ass and not really the point of init. A simple cron script is a lot easier to manage, especially when you need to permanently stop the service for a while.
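
One possible way to get that "leave it stopped for a while" behaviour in a cron script is a flag file, sketched below. The directory and the function names are hypothetical choices, not a standard convention:

```shell
#!/bin/bash
# Sketch of a "hands off" switch for a cron-driven restarter.
# The flag directory is a hypothetical choice, not a standard location.
FLAG_DIR="${FLAG_DIR:-/var/run/service-check}"

# Succeeds if restarts are currently allowed for this service.
restart_allowed() {
    [ ! -e "$FLAG_DIR/$1.paused" ]
}

# Create or remove the flag by hand around planned downtime.
pause_service()  { mkdir -p "$FLAG_DIR" && touch "$FLAG_DIR/$1.paused"; }
resume_service() { rm -f "$FLAG_DIR/$1.paused"; }
```

The cron script would call `restart_allowed mysql` before trying to start anything, so a deliberately stopped service stays stopped until the flag is removed.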

  • prometeusprometeus Member, Host Rep

    You can also use the daemontools
    http://cr.yp.to/daemontools.html

  • Go59954Go59954 Member
    edited May 2012

    @raindog308 said: Good Lord, there's a lot of wheel reinvention here.

    The Unix way(*) to start a process if it fails is to use inittab, which some distros have retired in favor of upstart. 'respawn' is the configuration you want. This has been in Unix since at least the early 90s.

    (*) at least for predominantly SysV-derived Unices like Linux. I don't know what the equivalent is in BSD off hand.

    Right, I probably should have searched for existing scripts beforehand, but I've also been short on time recently.
    As for the start commands, that's why I suggested that @NickM include a config file preloaded with a list of the most common services, with a space next to each one where the user can add a start command himself, only for the services he wants started if they fail. That way the commands are left to the user, depending on the distro and program versions.

  • joepie91joepie91 Member, Patron Provider

    Have a look at http://puppetlabs.com/.

  • marrcomarrco Member

    @dannix said: If you need monitoring and restarting check monit

    that

    Thanked by 1Go59954
  • Why use difficult scripts to check if a service is running? Just put "/etc/init.d/service start" in your crontab; if the service is already running, nothing will happen.

  • Go59954Go59954 Member

    @maxexcloo said: Take a look at my cron/screen based services restarter/starter. I use it and it works well :)

    https://github.com/maxexcloo/User-Daemon

    Thanks. looks good, and I'm giving it a try ;)

    @roytam1 said: There is some worst case: service(for example lighttpd) process is here but sits and does nothing (service stalls)

    my freebsd cron script:

    It's a good point to add in a new script as well, thank you.

    @subigo said: Respawning a web service like Apache (and depending on how you have Apache setup it won't even work) or MySQL through init is a pain in the ass and not really the point of init. A simple cron script is a lot easier to manage, especially when you need to permanently stop the service for a while.

    Thanks ;)

    @prometeus said: You can also use the daemontools

    http://cr.yp.to/daemontools.html
    Might eventually

    What a great solution. thanks for posting that!

    Thanks. And that's full featured.

    @gsrdgrdghd said: Why use difficult scripts to check if a service is running? Just put "/etc/init.d/service start" in your crontab, if the service is already running nothing will happen.

    Thanks! A good suggestion, even though it won't notify by email.

  • WilliamWilliam Member
    edited May 2012

    Just as a hint.... why not use Monit?
    http://mmonit.com/monit/

    Runs as service and can easily be adapted to every app that has a PID file - Checks if the app works etc etc.

    Example config for nginx:

    #check nginx now
    check process nginx with pidfile /var/run/nginx.pid
        start program = "/etc/init.d/nginx start"
        stop program = "/etc/init.d/nginx stop"
        if failed host IP.IP.IP.IP port 80 protocol HTTP request / then restart
        if 100 restarts within 100 cycles then timeout

    or php:

    check process php-fpm with pidfile /var/run/php5-fpm.pid
        group phpcgi # phpcgi group
        start program = "/etc/init.d/php5-fpm start"
        stop program = "/etc/init.d/php5-fpm stop"
        ## Test the UNIX socket. Restart if down.
        if failed unixsocket /tmp/php-cgi.sock then restart
        ## If the restarts attempts fail then alert.
        if 100 restarts within 100 cycles then timeout

    Can send email, sms and has a nice webinterface where the status can be seen and the process can be restarted manually - Multiple servers can be grouped in one interface by MMonit.

  • ipxadamipxadam Member

    We use Hyperic at my day job (monitoring a major US retail website). It's pretty powerful and there is an open source version: hyperic.com

  • The best way to do this is to have a check script spawned by cron every so often; it won't take up memory while it isn't running. Crond rarely fails.
