A question about load-balanced round robin
Is it possible to set up a round-robin balancing infrastructure for streaming servers, with a specific load limit on each server?
Example: I have 5 servers among which I want to share load, all serving the same content. So I set up round robin to automatically choose the next server on every request.
Is there a way to meter the active users on each server so that, once a server's limit is reached, round-robin DNS moves on to another server?
E.g., every server has a capacity limit of 100 active/concurrent streaming video viewers. RR pushes clients to server 1, the next to server 2, then 3, and so on; after server 5, the next request goes back to server 1. In this scheme there is no way to determine whether a server has reached its capacity, because some users are still actively watching while others have stopped. So if server 2 has 100 active viewers and server 3 only 60 (because 40 stopped viewing the stream), RR will still send the user after server 1 to server 2, where there is no bandwidth left to serve them, and that part of the system will stall while the next server has plenty of capacity left over...
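To make the problem concrete, here is a small Python sketch (the viewer counts are hypothetical, just for illustration) contrasting blind round robin with a load-aware rotation that skips full servers:

```python
from itertools import cycle

CAPACITY = 100
# Hypothetical snapshot of active viewers per server (index 0 = "server 1").
# Server 3 (index 2) has spare room; servers 1 and 2 are full.
active = [100, 100, 60, 80, 70]

def plain_rr(n):
    """Plain DNS-style round robin: blind rotation over server indices."""
    return cycle(range(n))

def load_aware_rr(active, capacity):
    """Rotates like round robin, but skips servers that are at capacity."""
    rr = cycle(range(len(active)))
    while True:
        for _ in range(len(active)):
            s = next(rr)
            if active[s] < capacity:
                yield s
                break
        else:
            yield None  # every server is at capacity
```

Plain RR would hand the next viewer to server 2 even though it is full; the load-aware variant would skip ahead to server 3. DNS alone cannot do the second version, which is the crux of the question.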
Any help/suggestions/tutorials?
Thanks!
Comments
I think your best bet is not using DNS round robin at all and doing it all at the application level. A simple script to determine resource usage and redirect users to an available server should suffice. This is assuming your clients can handle redirects...
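As a sketch of that application-level approach (the server URLs and load numbers below are made up; a real version would poll each streaming server for its active session count), a tiny HTTP redirector in Python might look like:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical load table; in practice each streaming server would report
# its active session count, e.g. via a status endpoint polled by this script.
SERVERS = {
    "http://stream1.example.com": 72,
    "http://stream2.example.com": 100,  # at capacity
    "http://stream3.example.com": 41,
}
CAPACITY = 100

def pick_server():
    """Return the least-loaded server that still has spare capacity."""
    candidates = [(load, url) for url, load in SERVERS.items() if load < CAPACITY]
    return min(candidates)[1] if candidates else None

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = pick_server()
        if target is None:
            self.send_error(503, "All streaming servers are full")
            return
        # 302 the client to the chosen server, preserving the request path.
        self.send_response(302)
        self.send_header("Location", target + self.path)
        self.end_headers()

# To run the redirector:
# HTTPServer(("", 8000), RedirectHandler).serve_forever()
```

The embed code on the partner sites would then point at this redirector rather than at any one streaming server.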
@udk I would prefer doing that at the DNS level. I want to embed a player into several websites through a single HTML snippet that I can change myself, without needing access to those websites (they don't belong to me; they will just use my stream).
So they will embed an HTML snippet rather than the stream directly, to use with RR.
If I do that with a redirect, it may raise some issues, because the redirected page will be inside an iframe or a wrapped page inside a CMS...
It will be much harder to do with DNS alone, even using tiny TTLs. You could try removing a server from the round robin once it nears capacity, but there's no guarantee some clients won't still see it, as not everyone respects TTLs.
I suppose it all depends on how much spare capacity you have. If you run at a maximum of 50% load across all servers, then you can take a server out easily enough before it hits the limit, but at 90% it will be harder.
@udk Thanks for the advice; 60% was what I had in mind for the server limit, to manage the TTL. But I don't know how to make RR count active sessions, so that it stops sending traffic to a server once it has reached its capacity.
Webhooks with Rage4 will do this, but DNS caching on the client or ISP side may work against you in this sort of situation, as stated above.
Your streaming server may also be able to redirect clients. I know nginx-rtmp-module can do this, but I am not sure about Adobe FMS or Wowza.
I'm curious, does anyone have guides for this that they would suggest?
You would want to do this with an application that is designed for load balancing, like HAProxy.
Some howtos/docs for HAProxy:
1. https://www.digitalocean.com/community/articles/how-to-use-haproxy-to-set-up-http-load-balancing-on-an-ubuntu-vps
2. http://www.severalnines.com/resources/clustercontrol-mysql-haproxy-load-balancing-tutorial
3. http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy/
4. http://wiki.joyent.com/wiki/display/jpc2/Load+Balancing+with+HAproxy
5. http://www.howtoforge.com/forums/archive/index.php/t-34573.html
6. http://www.howtoforge.com/setting-up-a-high-availability-load-balancer-with-haproxy-wackamole-spread-on-debian-etch
HAProxy allows you to weight each server as well as specify a maximum number of connections to each backend server.
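A minimal HAProxy sketch of that idea (addresses, ports, and limits are placeholders, not a tested production config):

```
frontend stream_in
    bind *:80
    mode http
    default_backend stream_servers

backend stream_servers
    mode http
    balance roundrobin
    # maxconn caps concurrent connections per server; weight skews the rotation
    server s1 10.0.0.1:8080 check maxconn 100 weight 10
    server s2 10.0.0.2:8080 check maxconn 100 weight 10
    server s3 10.0.0.3:8080 check maxconn 100 weight 5
```

With `maxconn` set, HAProxy queues or routes new connections elsewhere once a server is full, which is exactly the behavior DNS round robin cannot provide.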
In all my experience I have never found a way to do this with DNS; there is no way the client would know to switch servers, it would just fail to connect. Best case, if you were able to disable a server to new users after X connections, you would still run into a situation where the client has to fail to load from server A before failing over to server B. That means the client would completely fail to load the page before being redirected.
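To illustrate that failure mode from the client side, here is a rough Python sketch of "try server A, then fail over to server B" (hostnames would be placeholders in practice): the user eats a full connect timeout on each dead or full server before reaching a live one.

```python
import socket

def connect_with_failover(hosts, port, timeout=1.0):
    """Try each host in order; return the first that accepts a TCP
    connection, or None if every attempt fails. Each failed host costs
    the caller up to `timeout` seconds before the next one is tried."""
    for host in hosts:
        try:
            sock = socket.create_connection((host, port), timeout=timeout)
            sock.close()
            return host
        except OSError:
            continue  # connection refused or timed out; fall through
    return None
```

Most browsers and players do not even do this much across multiple A records reliably, which is why the thread keeps steering toward a real balancer in front.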
Another software option, not as in-depth and a bit different to configure, would be to use Varnish to load balance. However, the main reason to use Varnish is that you want it to CACHE data itself and only forward users to the backend servers for content it does not have cached. Most distributions ship a version of Varnish (and likely HAProxy) in their repositories (apt-get install varnish; yum install varnish).
Some howtos/docs for Varnish:
1. https://www.varnish-cache.org/docs/2.1/faq/http.html
2. https://www.varnish-cache.org/forum/topic/383
3. http://blog.linuxacademy.com/linux/how-to-clear-varnish-cache/
4. http://serverfault.com/questions/442181/load-balancer-with-varnish-round-robin
5. http://stackoverflow.com/questions/7335406/varnish-round-robin-director-with-backend-virtual-hosts
6. https://www.varnish-cache.org/trac/wiki/LoadBalancing
7. http://mesmor.com/2012/02/15/varnish-client-director-with-sticky-session/
8. http://mclear.co.uk/2011/02/27/updated-varnish-wordpress-vcl/
9. https://www.varnish-cache.org/forum/topic/120
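For reference, a round-robin director in older (2.x/3.x) Varnish VCL looks roughly like this; backend addresses are placeholders, and newer Varnish versions moved directors into a separate vmod, so check the docs for your version:

```
backend s1 { .host = "10.0.0.1"; .port = "8080"; }
backend s2 { .host = "10.0.0.2"; .port = "8080"; }

director streamers round-robin {
    { .backend = s1; }
    { .backend = s2; }
}

sub vcl_recv {
    set req.backend = streamers;
}
```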
Note that if you use HAProxy or Varnish in front of your Apache server, you will need to use mod_rpaf to get the correct client IP in your server logs, if you are someone who cares about logging:
Some links:
1. http://www.rootdamnit.eu/2012/09/24/varnish-and-apache-rpaf-invalid-command-rpafheader/
2. http://stackoverflow.com/questions/10024877/varnish-client-ip-not-logging-in-apache-logs
3. http://willjackson.org/blog/configure-vanish-forward-client-ip-addresses-apache-logs
4. http://stderr.net/apache/rpaf/
5. http://www.andrewboring.com/technotes/client-ip-x-forwarded-across-multiple-proxies
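A typical mod_rpaf snippet (directive names as in mod_rpaf 0.6; the module path and proxy IPs are placeholders for your setup) would go in the Apache config roughly like:

```
# Load the module, then tell it which proxy IPs to trust
LoadModule rpaf_module modules/mod_rpaf-2.0.so
RPAFenable On
RPAFsethostname On
RPAFproxy_ips 127.0.0.1 10.0.0.10
RPAFheader X-Forwarded-For
```

The key detail is `RPAFproxy_ips`: only requests arriving from those addresses get their logged IP rewritten from the X-Forwarded-For header.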
Hope this helps.
Cheers!
On Windows, you could search for "web farm" or "server farm" solutions.
On Linux, there are Linux Virtual Server (LVS), HAProxy, and the Red Hat load balancer.
No DNS trick can implement such an algorithm on its own; it has to happen on the server (dns++) side.
No guide, but the basic gist (for streaming, anyway.. and presuming you're using nginx-rtmp-module for your server) would be something along the lines of:
Very unscientific analysis, because I've never done it. You can also combine these methods with in-server redirects, as linked above, to stay robust even in the case of DNS caching.
Thank you both! I'm going to see if I can incorporate this into the application that I'm building.
You don't need tiny TTLs. When you add multiple A records for the same DNS name, they get handed out in order: record 1, record 2, record 3, record 1, record 2, record 3, etc. If you are dealing with 300+ concurrent connections across 5 hosts, you are going to get a pretty good averaging effect, but no guarantees. The bigger the number of connections, the more likely it is to work...
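A rough Python simulation of that averaging effect (the cache probability is made up purely for illustration; real resolver and client behavior varies widely):

```python
import random
from collections import Counter

SERVERS = 5
CONNECTIONS = 300

def simulate(seed=0, cache_prob=0.3):
    """Crude model: most clients take the next A record in rotation, but
    a fraction reuse a cached answer, so the split is only roughly even."""
    rng = random.Random(seed)
    counts = Counter()
    pointer = 0   # position in the A-record rotation
    cached = 0    # last record this (simplified) client population saw
    for _ in range(CONNECTIONS):
        if rng.random() < cache_prob:
            counts[cached] += 1          # client reused a cached record
        else:
            cached = pointer % SERVERS   # fresh lookup: next record in order
            counts[cached] += 1
            pointer += 1
    return counts
```

Running it shows every server receives a sizeable share of the 300 connections, but the per-server counts drift apart as caching interferes, which matches the "good averaging, no guarantees" point above.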
Who doesn't respect TTLs? In what circumstance would you ever configure a host to ignore the TTL?
The law of large numbers. If you see it start failing, add more servers instead of taking the full ones offline.
I used this for the most part with some EC2 LAMP setups. It worked great, and I was easily able to get a test running quickly between the three servers.
Just had a thought about SSL certificates and load balancing. It looks like SSL termination is required, so I'll need to do more reading on that.