The fair load balancer module for Nginx (upstream_fair)

What?

upstream_fair is a load balancer module for the fantastic Nginx web server. It implements somewhat smarter logic than the built-in pure round-robin load balancer and may be better suited to diverse workloads (a mix of fast and slow pages) than the stock balancer.

Why?

Smarter load balancing

The main feature of upstream_fair is that it knows how many requests each backend is processing (a backend is simply one of the servers, among which the load balancer has to make its choice). Thus it can make a more informed scheduling decision and avoid sending further requests to already busy backends.

Statistics

Another neat feature is the built-in status page (requires my StubStatus hook patches), which can tell you:

  • how many requests have been proxied
  • what the distribution between backends was
  • what the current workload is (per-backend)

Special load balancer for special needs

upstream_fair has several modes of operation, making it suitable for diverse environments.

default

The default mode is a simple WLC-RR (weighted least-connection round-robin) algorithm with a caveat that the weighted part isn't actually too fair under low load (under high load it all averages out, anyway). This is the upstream_fair many of you already know. Other modes are the result of recent development so grab a copy before your competition does ;)

no_rr

If you wish, you may disable the "-RR" part, which means that whenever the first backend is idle, it's going to get the next request. If it's busy, the request will go to the second backend, unless that one is busy too, and so on.

Why would you want to disable round-robin? A particularly good reason is when you're still unsure about how many backends you need and are starting the backends on demand (e.g. using my Spawner). With round-robin enabled, the requests will get distributed roughly equally between backends, so all backends will have to run all the time (even if you actually use only 10% of their capacity). When you disable round-robin, you are going to use exactly as many backends as you really need.
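
As a rough sketch of what this looks like in the config, the upstream block from the Configure section would gain the no_rr flag. The flag is spelled as in the heading above, and server1 and server2 are the same placeholder backends used there:

upstream backend {
    # server1 and server2 are placeholder names, not real hosts
    server server1;
    server server2;
    # least-connection scheduling, with the round-robin part disabled
    fair no_rr;
}

With this setup server2 should only see traffic while server1 is busy.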

weight_mode=idle no_rr

However, by default an "idle" backend (a rather central concept in upstream_fair) is exactly that: a backend with zero requests being processed. Thus two concurrent requests will cause two backends to start up even if one could easily handle both. Enter weight_mode=idle.

This mode redefines the meaning of "idle". It now means "fewer than weight concurrent requests". So you can benchmark your backends, determine that X concurrent requests is the maximum for you (e.g. while keeping latency below a limit or maximising throughput), set the weight to that amount, and that's it. upstream_fair will balance between the minimum possible pool of backends, adding new ones as the load increases. Although the backends are all considered "idle" by the main algorithm, they are still scheduled using the least-connection algorithm (without the weighted part).
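
For illustration, suppose a benchmark showed that each backend copes well with up to 4 concurrent requests. The figure 4 and the server names are made up for this example, so substitute your own. The config might then look like this:

upstream backend {
    # weight=4 is an assumed benchmark result, not a recommendation
    server server1 weight=4;
    server server2 weight=4;
    # a backend counts as "idle" while it has fewer than weight requests
    fair weight_mode=idle no_rr;
}

Under these assumptions the intent is that server2 only comes into play once server1 is handling around 4 requests at once.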

weight_mode=peak

On the opposite end of the scale, you may find out that your backends cannot keep up with the load and you'd rather return 50x errors to the client than try to process too many requests (you might e.g. have a funky tiered load-balancing setup or try to keep latency under control).

Simply enable weight_mode=peak and you can be sure that Nginx will never send more than weight requests to any single backend. If all backends are full, you will start receiving 502 errors.
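
A minimal sketch, assuming each backend should never see more than 20 requests at a time. The limit of 20 and the server names are only examples:

upstream backend {
    # weight=20 is an assumed per-backend cap, chosen for illustration
    server server1 weight=20;
    server server2 weight=20;
    # weight now acts as a hard limit on concurrent requests per backend
    fair weight_mode=peak;
}

Here a 41st concurrent request would find both backends full and should receive a 502 error.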

Where?

You may browse the code (and download a tarball) on GitHub:

GitHub: http://github.com/gnosek/nginx-upstream-fair/tree/master

upstream_fair is also documented on the Nginx wiki:

Nginx wiki: http://wiki.codemongers.com/NginxHttpUpstreamFairModule

How?

Download

tarball
http://github.com/gnosek/nginx-upstream-fair/tarball/master
git repo
git://github.com/gnosek/nginx-upstream-fair.git

Install

Add the following option to your Nginx ./configure command:

--add-module=path/to/upstream_fair/directory

Then "make" and "make install" as usual.

Configure

To enable the fair balancer, simply add 'fair' to the upstream block, like this:

upstream backend {
    server server1;
    server server2;
    fair;
}

The 'fair' directive accepts the parameters 'no_rr', 'weight_mode=idle' and 'weight_mode=peak' described above.

Anything else?

Why all these modes? The syntax is ugly!

Yep. I know. However, at the moment load balancer modules cannot define their own parameters to the server directive (e.g. server 1.2.3.4 idle=4 peak=20), so we have to live with what we've got (fair weight_mode=idle; server 1.2.3.4 weight=4).

Performance

upstream_fair shouldn't impact your throughput significantly. Below you'll find a totally meaningless benchmark, comparing the stock load balancer and upstream_fair in some synthetic conditions.

The upstream section from the config file:

upstream testing {
        # fair;
        server 127.0.0.1:81 max_fails=3 weight=2;
        server 127.0.0.1:81 max_fails=3 weight=2;
        server 127.0.0.1:81 max_fails=3 weight=2;
        server 127.0.0.1:81 max_fails=3 weight=2;
        server 127.0.0.1:81 max_fails=3 weight=2;
        server 127.0.0.1:81 max_fails=3 weight=2;
}

Port 81 is used by a Lighttpd instance serving whatever it does by default in Ubuntu (a simple static page). Both Nginx and Lighttpd serve about 15000 requests per second without proxying.

NOTE: I have no idea what causes the extreme peak latency. It happens regardless of the load balancer used, and even without proxying at all. Even serving static content or the status page seems affected.

I used Nginx 0.6.31 for testing.

default load balancer

Document Path:          /
Document Length:        3585 bytes

Concurrency Level:      500
Time taken for tests:   9.172814 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Total transferred:      192084569 bytes
HTML transferred:       179282265 bytes
Requests per second:    5450.89 [#/sec] (mean)
Time per request:       91.728 [ms] (mean)
Time per request:       0.183 [ms] (mean, across all concurrent requests)
Transfer rate:          20449.78 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   31 294.4      2    3012
Processing:     3   43 284.0     16    3043
Waiting:        2   41 283.9     14    3042
Total:          7   75 463.5     17    6045

Percentage of the requests served within a certain time (ms)
  50%     17
  66%     19
  75%     20
  80%     21
  90%     24
  95%     31
  98%     88
  99%   3024
 100%   6045 (longest request)

upstream_fair

Document Path:          /
Document Length:        3585 bytes

Concurrency Level:      500
Time taken for tests:   9.289024 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Total transferred:      192073046 bytes
HTML transferred:       179271510 bytes
Requests per second:    5382.70 [#/sec] (mean)
Time per request:       92.890 [ms] (mean)
Time per request:       0.186 [ms] (mean, across all concurrent requests)
Transfer rate:          20192.76 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   35 349.8      2    8999
Processing:     1   35 235.8     16    3034
Waiting:        0   33 235.8     14    3030
Total:          8   71 451.2     17    9019

Percentage of the requests served within a certain time (ms)
  50%     17
  66%     19
  75%     21
  80%     22
  90%     27
  95%     32
  98%     90
  99%   3020
 100%   9019 (longest request)

Algorithm and internals

TODO (sched_score, shared memory etc.)

Sites using upstream_fair

Feel free to add your site here!

  • http://you?