Tuning a high-traffic nginx and WordPress server

Server Fault Asked by reustmd on December 23, 2021

I have been conducting load tests (via blitz.io) as I attempt to tune server performance on a pool of servers running PHP 5.5, WordPress 3.9.1, and nginx 1.6.2.

My confusion arises when I overload a single server with too much traffic. I fully realize that a server has finite resources and that at some point it will have to begin rejecting connections and/or returning 502 (or similar) responses. What's confusing me, though, is why my server appears to return 502s so early in a load test.

I have attempted to tune nginx to accept a large number of connections:

nginx.conf

worker_processes auto;
worker_rlimit_nofile 100000;

events {
    worker_connections 1024;
    use epoll;
    multi_accept on;
}

site.conf

location ~ \.php$ {
    try_files $uri =404;
    include /etc/nginx/fastcgi_params;
    fastcgi_pass unix:/var/run/php5-fpm.sock;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_read_timeout 60s;
    fastcgi_send_timeout 60s;
    fastcgi_next_upstream_timeout 0;
    fastcgi_connect_timeout 60s;
}

PHP-FPM www.conf

pm = static
pm.max_children = 8

I expect the load test to saturate the PHP workers rather quickly. But I also expect nginx to continue accepting connections and, after the fastcgi timeouts are hit, to begin returning some sort of HTTP error code.

What I’m actually seeing is nginx returning 502s almost immediately after the test is launched.

nginx error.log

2014/11/01 20:35:24 [error] 16688#0: *25837 connect() to unix:/var/run/php5-fpm.sock failed 
(11: Resource temporarily unavailable) while connecting to upstream, client: OBFUSCATED, 
server: OBFUSCATED, request: "GET /?bust=1 HTTP/1.1", upstream: 
"fastcgi://unix:/var/run/php5-fpm.sock:", host: "OBFUSCATED"

What am I missing? Why aren’t the pending requests being queued up, and then either completing or timing out later in the process?

2 Answers

Your problem is most likely your PHP-FPM configuration, because you are using the static process manager with only 8 child processes. Just about any load test will use up those 8 child processes instantly and beg for more; when there isn't an idle child process available to handle the PHP request, you get the 502 errors you are seeing.

You should switch to the dynamic process manager or, even better (in my opinion), ondemand.

Also, set max_children fairly high, depending on what kind of load tests you are running. Without knowing the details of those tests I can't suggest specific values for max_children. In my case, where I have several sites that together get ~2,500 unique visitors and ~15,000 pageviews daily, max_children is set to 64 and it never gets even close to that number. I set it higher than I need because load testing has indicated that my server can handle quite a bit more traffic than it currently gets.

Once you get the load tests running well you'll have a better idea of how to tune your PHP-FPM configuration. I'd say set max_children to 64 like I do; just check the PHP-FPM log to see if you are banging up against that limit and adjust upwards as needed.
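
As a rough starting point, an ondemand pool could look something like the following in www.conf (the exact values here are illustrative assumptions to adjust against your own load tests, not figures from the answer):

pm = ondemand
pm.max_children = 64
pm.process_idle_timeout = 10s
pm.max_requests = 500

With ondemand, worker processes are only spawned when requests arrive and are reaped after sitting idle for process_idle_timeout, so max_children acts as a ceiling rather than a fixed pool size.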

Answered by Justin L. Franks on December 23, 2021

This means the PHP side crashed and isn't listening on the Unix socket anymore.

So nginx won't queue anything, because it simply can't contact the proxied server to send the request to; at this point, you can easily imagine that requests get "processed" very quickly on nginx's side.

If your PHP server hadn't crashed, requests would indeed wait according to the fastcgi_connect_timeout and fastcgi_read_timeout values for some event to show up. If those timeouts were reached, you would see 504 error codes instead.
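
One quick way to verify whether PHP-FPM is still alive and listening on that socket (a sketch using standard Linux tools; the process and socket names match the question's config):

# is the php-fpm master process still running?
ps aux | grep '[p]hp5-fpm'

# is anything still listening on the unix socket?
ss -xl | grep php5-fpm.sock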

By the way, your worker_connections value seems a bit low compared to your worker_rlimit_nofile.
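
For example, something closer to the file-descriptor limit could look like this (the exact figure is an illustrative assumption; each worker can then hold far more simultaneous client and upstream connections):

events {
    worker_connections 10240;
    use epoll;
    multi_accept on;
}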

It may also be time to start using an upstream block to control how nginx behaves when a backend server appears to be down, using health checks. With these you can define how long a backend must keep failing before it is marked as down; once it is considered down, requests won't reach it until the condition to mark it up again is met.
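
A minimal sketch of passive health checks with open-source nginx, using an upstream name (php_backend) invented here for illustration and illustrative max_fails/fail_timeout values:

upstream php_backend {
    # after 3 failed attempts, consider the backend down for 30s
    server unix:/var/run/php5-fpm.sock max_fails=3 fail_timeout=30s;
}

The existing location ~ \.php$ block would then point fastcgi_pass at the named upstream (fastcgi_pass php_backend;) instead of the socket path directly.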

Answered by Xavier Lucas on December 23, 2021
