Help! Massive amount of zombie processes on my server?

djvdorp Member
edited December 2011 in General

Just logged in to one of my servers and found this:

Processes: 3364
Users logged in: 0
Memory usage: 7%
Swap usage: 0%

=> There are 3220 zombie processes.

free -m states:
             total       used       free     shared    buffers     cached
Mem:          7994       1188       6805          0         65        332
-/+ buffers/cache:        790       7203
Swap:         2046          0       2046

What's going on here, and what can I do to fix it?
The server runs only a php5-fpm + nginx stack, hosting a big website.

Comments

  • What are the zombie processes? You can see this with 'ps aux', and then looking at the STAT column, which is the one to the left of the START column.

    If they're forked from another process, you can try killing the parent process.

    Here's a plethora of information I found with a little searching on Google: http://linuxshellaccount.blogspot.com/2008/05/killing-zombie-processes-in-linux-and.html
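
To find the parent worth killing, it can help to group the zombies by parent PID first. A minimal sketch, assuming a procps-style ps that accepts -o with the ppid and stat fields; the function name zombies_by_parent is made up for illustration:

```shell
# Count zombie processes per parent PID.
# Expects lines of "PPID STAT" on stdin; a STAT starting with Z marks a zombie.
zombies_by_parent() {
    awk '$2 ~ /^Z/ { count[$1]++ } END { for (p in count) print p, count[p] }'
}

# On a live system (not run here):
#   ps -e -o ppid= -o stat= | zombies_by_parent | sort -k2,2nr | head
```

The PPID with the largest count is the process that is not reaping its children. Once that parent exits or is restarted, init adopts the zombies and cleans them up.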

  • They are all like this:
    USER   PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    root  2822  0.0  0.0   0   0 ?   Z    14:43 0:00 [fus]
    root  2823  0.0  0.0   0   0 ?   Z    14:43 0:00 [fus]
    root  2824  0.0  0.0   0   0 ?   Z    14:43 0:00 [fus]
    root  2825  0.0  0.0   0   0 ?   Z    14:43 0:00 [fus]
    root  2826  0.0  0.0   0   0 ?   Z    14:43 0:00 [fus]
    root  2827  0.0  0.0   0   0 ?   Z    14:43 0:00 [fus]
    root  2828  0.0  0.0   0   0 ?   Z    14:43 0:00 [fus]
    root  2829  0.0  0.0   0   0 ?   Z    14:43 0:00 [fus]

    I think it has something to do with my php5-fpm config, see this php5-fpm log:
    [16-Dec-2011 13:15:01] NOTICE: fpm is running, pid 1134
    [16-Dec-2011 13:15:01] NOTICE: ready to handle connections
    [16-Dec-2011 13:15:05] WARNING: [pool www] seems busy (you may need to increase start_server$
    [16-Dec-2011 13:15:06] WARNING: [pool www] seems busy (you may need to increase start_server$
    [16-Dec-2011 13:15:07] WARNING: [pool www] seems busy (you may need to increase start_server$
    [16-Dec-2011 13:15:08] WARNING: [pool www] seems busy (you may need to increase start_server$
    [16-Dec-2011 13:15:09] WARNING: [pool www] server reached max_children setting (60), conside$
    [16-Dec-2011 13:18:18] WARNING: [pool www] child 1858 exited on signal 11 (SIGSEGV) after 24$
    [16-Dec-2011 13:18:18] NOTICE: [pool www] child 1865 started
    [16-Dec-2011 13:26:18] WARNING: [pool www] child 1995 exited on signal 11 (SIGSEGV) after 33$
    [16-Dec-2011 13:26:18] NOTICE: [pool www] child 2006 started
    [16-Dec-2011 13:26:25] WARNING: [pool www] child 1994 exited on signal 11 (SIGSEGV) after 41$
    [16-Dec-2011 13:26:25] NOTICE: [pool www] child 2013 started
    [16-Dec-2011 13:41:05] ERROR: fork() failed: Cannot allocate memory (12)
    [16-Dec-2011 13:49:49] ERROR: fork() failed: Cannot allocate memory (12)
    [16-Dec-2011 13:50:52] ERROR: fork() failed: Cannot allocate memory (12)
    [16-Dec-2011 13:51:06] ERROR: fork() failed: Cannot allocate memory (12)
    [16-Dec-2011 13:55:15] ERROR: fork() failed: Cannot allocate memory (12)
    [16-Dec-2011 14:02:10] ERROR: fork() failed: Cannot allocate memory (12)
    [16-Dec-2011 14:12:45] ERROR: fork() failed: Cannot allocate memory (12)
    [16-Dec-2011 14:21:11] ERROR: fork() failed: Cannot allocate memory (12)
    [16-Dec-2011 14:29:09] ERROR: fork() failed: Cannot allocate memory (12)
    [16-Dec-2011 14:41:35] ERROR: fork() failed: Cannot allocate memory (12)

  • SpeedBus Member, Host Rep

    Reboot the server!!! It's the best solution :D

  • @SpeedBus said: Reboot the server!!! It's the best solution :D

    Is that serious? Already did that today :(

  • You need to increase the number of available PHP processes to handle the number of concurrent connections. Looking at that log, PHP is already reporting a busy pool 4 seconds after it becomes available, and it hits max_children 4 seconds after that. It's also clear that you're hitting your memory limits. If this is OpenVZ, check the beancounters to see which limit is being hit. If there are non-zero failcnt values, you may need more system memory. If there are none, it might be an internally configured PHP memory limit. Check your PHP config files for the max memory allocation and bump that up.

  • SpeedBus Member, Host Rep

    @djvdorp : Hmm.. Used to work for me :P but yeah, I feel that what rajprakash said is true: you need to increase the memory allocated to PHP and the number of processes, and maybe the connection limit in Apache or any other web server you use.

  • @rajprakash:
    It's a dedi i7 server with 8 GB of dedicated RAM.
    Not really a LEB, but I didn't know where else to get help.
    PHP memory limit is 256 MB with max 40 worker processes, I think.
    Running on the nginx web server.

  • edited December 2011

    According to your logs, you spawned PHP's max_children number of processes in 8 seconds: PHP was ready to serve at 13:15:01, and at 13:15:09 you reached the max_children setting. Unless you're getting an insane amount of concurrent traffic, it sounds like you're getting some kind of attack.

  • @djvdorp said: PHP memory limit is 256 MB with max 40 worker processes, I think.

    Are you running php-fpm? If so, there's no way you should have 40 php-cgi processes, that's nuts.

  • @kairus
    Yes, I am using php-fpm.

    I will post the complete config tomorrow when I get home.

    Thanks for all your time, much appreciated.

  • LOL 40 workers... Maybe you have 30 cpu cores or more :S

  • Kairus Member
    edited December 2011

    [16-Dec-2011 13:15:09] WARNING: [pool www] server reached max_children setting (60), conside$

    60 workers actually :P

  • ¬_¬

    Useless.

    No more than your number of cores, maybe one or two more. More than that is a complete waste, and even slower.

  • Hi all.

    Here's some more info:
    The dedi runs on an i7 960 CPU, with 4 cores and 8 threads total.
    http://ark.intel.com/products/37151/Intel-Core-i7-960-Processor-(8M-Cache-3_20-GHz-4_80-GTs-Intel-QPI)

    I have 8 GB of dedicated memory.

    nginx.conf:
    worker_processes 1;
    worker_connections 1024;

    php5-fpm [my pool]:
    pm = dynamic
    pm.max_children = 60
    pm.start_servers = 20
    pm.min_spare_servers = 20
    pm.max_spare_servers = 20
    pm.max_requests = 5000

    php.ini from fpm:
    memory_limit = 256M

    Hope this is all you need, and again; thanks for all of your time and replies!

    PS:

    Processes: 30801
    => There are 30662 zombie processes.

    It's getting worse :(

    free -m though:
                 total       used       free     shared    buffers     cached
    Mem:          7994       4068       3925          0        124       3074
    -/+ buffers/cache:        869       7124
    Swap:         2046          0       2046

  • Kairus Member
    edited December 2011

    In nginx.conf, change worker_processes to 4 (always equal to the number of cores; I'd ignore the hyperthreading cores, as you don't need that many processes). Drop worker_connections to 256. Max connections in nginx is worker_processes*worker_connections, so 4*256 keeps the same 1024 total you have now. You can of course raise worker_connections if you want more than 1024 open connections, e.g. if you have a high keep-alive timeout (something else you should tweak).

    I also put 'use epoll;' under events {} in nginx.conf. I'm not sure whether nginx picks epoll automatically on Linux; I remember that under FreeBSD you had to tell it to use kqueue.

    In the fpm conf, change pm.max_children to 8, pm.start_servers to 4, pm.min_spare_servers to 2, pm.max_spare_servers to 4. You can probably up pm.max_requests as well, but I'd just leave it for now. You'll want to monitor this, and possibly increase the max children, but not by much...

    Are you using an opcode cache, btw?

    Restart your server if possible after this, just to clear all those processes.

  • @Kairus said: Drop worker_connections to 256. Max connections in nginx is worker_processes*worker_connections. You can of course raise worker_connections if you want more than 1024 open connections, e.g. if you have a high keep-alive timeout (something else you should tweak).

    I never set worker_connections that low, as it makes it too easy for slowloris to take you out.

  • @dmmcintyre3 said: I never set worker_connections that low, as it makes it too easy for slowloris to take you out.

    If you set realistic timeouts and keep-alive, it shouldn't be a problem. Otherwise you might take on more connections than you can handle, but it does depend on the web site, and the OP hasn't said what his site is hosting.

  • djvdorp Member
    edited December 2011

    Thanks for your suggestions, will tweak this when I am able to reboot the server (night over here probably).

    How about the PHP memory limit? Is 256 MB a problem (is it too low)?
    Or does that have nothing to do with my problems?

    And is there any reason why you'd want pm.max_children and the pm.*_servers settings as high as they currently are on my system? (I got these values from another php5-fpm power user.)

    The website runs WordPress, a WordPress->vBulletin bridge, and a vBulletin install, with almost 500,000 visits/new users each day. (Samsung community website: www.sammobile.com , new site for www.samfirmware.com)

  • Kairus Member
    edited December 2011

    The memory_limit variable sets how much a script can utilize, so I believe it would be per process. It's not an overall limit, it's per script execution.

    I can't see a reason to have pm.max_children THAT high, it's just too many. You need to monitor it though, it depends on how many concurrent connections you have, and how long it takes your scripts to execute (incl. time for MySQL requests), so the values I listed above may be too low for your needs, but the settings you have now are way too high.

    You do run APC or XCache, right?
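
Because memory_limit applies per script execution, a rough upper bound on PHP memory is pm.max_children times memory_limit. A back-of-the-envelope sketch using the values posted in this thread; the function names are made up, and the request rate and timing in the last call are hypothetical numbers, not measurements from this site:

```shell
# Upper bound on PHP memory (MB) if every worker hit memory_limit at once.
# $1 = pm.max_children, $2 = memory_limit in MB
worst_case_mb() {
    echo $(( $1 * $2 ))
}

# Rough number of busy workers: requests/second * average request time.
# $1 = requests per second, $2 = average request time in milliseconds
workers_needed() {
    echo $(( ($1 * $2 + 999) / 1000 ))   # integer division, rounded up
}

worst_case_mb 60 256     # original pool: 15360 MB, more than the 7994 MB that free -m reports
worst_case_mb 8 256      # suggested pool: 2048 MB
workers_needed 50 120    # hypothetical 50 req/s at 120 ms each: 6 workers
```

Real scripts rarely reach memory_limit, so this is a ceiling rather than a forecast. It does show why 60 children at 256M are not guaranteed to fit in 8 GB, and why measuring request rate and execution time is the way to size the pool.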

  • @Kairus said: You do run APC or xCache right?

    I don't have any caching in place currently AFAIK?
    Never had experience with that :(

  • @djvdorp said: I don't have any caching in place currently AFAIK? Never had experience with that :(

    You definitely should! You'll see a huge drop in load. Check out http://pecl.php.net/package/APC

  • @Kairus said: You definitely should! You'll see a huge drop in load. Check out http://pecl.php.net/package/APC

    I might just be plain stupid, but I can't seem to find out there how to install it into my current config?

  • Kairus Member
    edited December 2011

    @djvdorp said: I might just be plain stupid, but I can't seem to find out there how to install it into my current config?

    There are a few different ways: you can install it through your distro's repositories (called php-apc, php5-apc, or apc). You can also install it through pecl/pear.

    I always compile it from source. It's simple: unpack the tar, run phpize, then ./configure, and compile as usual. Then enable it in your php.ini like any PHP extension. The default settings to enable it are:

      extension=apc.so
      apc.enabled=1
      apc.shm_size=128M
      apc.ttl=7200
      apc.user_ttl=7200
    

    You'll want to tweak them, obviously. The INSTALL document included with the source describes all the settings in detail. Settings like apc.stat can increase performance even more but have downsides, so I recommend reading the documentation. apc.php, also included with the source, shows a bunch of graphs and stats that will help you tweak your settings even more.

    vBulletin can also make use of APC's variable cache. I would suggest searching their forums for how to do this; I haven't used vB since 3.7, so I'm not sure how to enable it in the latest versions.

  • @Kairus said: There are a few different ways: you can install it through your distro's repositories (called php-apc, php5-apc, or apc). You can also install it through pecl/pear.

    Just found out that I already had php-apc installed!!
    Ran this:
    dpkg -l | grep php-apc

    Stupid me... forgot I already did that.
    Also, I did make the config changes you suggested, @Kairus,
    but I still have like 15200 zombie processes... maybe I should upgrade everything, like nginx and php5-fpm, to the latest version, in case it's a package bug?
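
A package showing up in dpkg -l only proves that it is installed, not that PHP loads it. One hedged way to check is to look for the module in the list PHP prints. This sketch assumes the php CLI is available and shares its extension config with FPM, which is not guaranteed; the function name has_php_module is made up:

```shell
# Succeed if the named module appears in a list of PHP modules on stdin
# (one module name per line, as printed by `php -m`). Case-insensitive.
has_php_module() {
    grep -qix -- "$1"
}

# On a live system (not run here):
#   php -m | has_php_module apc && echo "APC is loaded"
```

If the CLI and FPM use separate php.ini files, the apc.php page shipped with the APC source is the more reliable check for the FPM side.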

  • @djvdorp I'm not sure what could cause that to happen, maybe someone else can chime in?

    It probably wouldn't hurt to update everything to the latest, so give that a try.

  • Francisco Top Host, Host Rep, Veteran

    just load shotgun.so into PHP, problem solved.

  • @Kairus said: It probably wouldn't hurt to update everything to the latest, so give that a try.

    Sorry to be such a scared guy, but updating nginx and php-fpm won't suddenly break my production website, right?

  • @djvdorp said: Sorry to be such a scared guy, but updating nginx and php-fpm won't suddenly break my production website, right?

    Did you install it through a package manager? If so, there should be no problem, you should always keep things up to date, otherwise you risk leaving security vulnerabilities unfixed.

  • @Kairus said: Did you install it through a package manager? If so, there should be no problem, you should always keep things up to date, otherwise you risk leaving security vulnerabilities unfixed.

    Yes, just clean apt-get installs from Ubuntu's repo. Dunno if it was unstable or stable for sure though :)

  • Kairus Member
    edited January 2012

    @djvdorp said: Yes, just clean apt-get installs from Ubuntu's repo. Dunno if it was unstable or stable for sure though :)

    Run an apt-get update, and apt-get upgrade :)
