Round Robin DNS != High Availability. Or am I wrong?

raindog308 Administrator, Veteran
edited April 2012 in General

I constantly see people who discuss their "high availability" setups like this:

  • round robin DNS
  • round robin DNS with a low TTL and some kind of automated DNS change

The first one is pure confusion. Clients do not "try one and if it fails, look up again". They look up the name, get an address, and assume that's the address. Round robin DNS is fine for load balancing, but not for HA. If you have two IPs in a RRDNS and one goes down, 50% of your clients will continue to hit the down server (unless you have some custom client code, but in this case I'm thinking of web browsers).

Even with a low TTL and an automated DNS change, it's still weak. First, there is no guarantee that any nameserver is going to honor your 60-second TTL. I've read that some of the big ones ignore anything less than an hour. Second, you're assuming my browser or client will not cache things, or that its cache is short. One example: Internet Explorer caches for 30 minutes by default. FasterFox caches for 1 hour. Etc. And finally, my browser has no idea it's got a "round robin" address, so it has no idea that it should check again if the first one doesn't work.
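To put rough numbers on the caching argument, here is a small Python sketch. It assumes a deliberately simplified model: one recursive resolver that may enforce its own minimum TTL, plus one client-side cache that pins the answer for a fixed time regardless of TTL. The function and parameter names are made up for illustration.

```python
def worst_case_stale_seconds(record_ttl, resolver_min_ttl=0, client_cache=0):
    """Upper bound on how long a client may keep using an old address.

    Simplified model (an assumption, not a measurement):
    - the resolver fetched the record just before it changed and holds it
      for max(record_ttl, resolver_min_ttl) seconds;
    - the client asks the resolver just before that entry expires and then
      pins the answer for client_cache seconds, ignoring the TTL.
    """
    resolver_hold = max(record_ttl, resolver_min_ttl)
    return resolver_hold + client_cache
```

With the figures above (a 60-second TTL, a resolver that ignores anything under an hour, and a 30-minute browser cache), `worst_case_stale_seconds(60, 3600, 1800)` comes to 5400 seconds, i.e. 90 minutes, under this model.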

Granted, I think the DNS standard could implement some kind of extension that tags lookups as "there are other addresses you could use". But it doesn't.

So. Am I wrong?

Comments

  • NickW Member
    edited April 2012

    Yes, you're wrong (partially).

    Round robin DNS is not HA, but it can be useful. Think of it more as a poor man's load distribution. It is also the simplest and most fundamental way of doing so. Look up the records for any major website: almost all of them use round robin, as it's free and not entirely useless.

    # dig www.google.com
    
    ; <<>> DiG 9.7.3-P3-RedHat-9.7.3-8.P3.el6_2.2 <<>> www.google.com
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11085
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 4, ADDITIONAL: 0
    
    ;; QUESTION SECTION:
    ;www.google.com.                        IN      A
    
    ;; ANSWER SECTION:
    www.google.com.         328010  IN      CNAME   www.l.google.com.
    www.l.google.com.       83      IN      A       173.194.34.17
    www.l.google.com.       83      IN      A       173.194.34.18
    www.l.google.com.       83      IN      A       173.194.34.19
    www.l.google.com.       83      IN      A       173.194.34.20
    www.l.google.com.       83      IN      A       173.194.34.16
    
    ;; AUTHORITY SECTION:
    google.com.             68855   IN      NS      ns2.google.com.
    google.com.             68855   IN      NS      ns3.google.com.
    google.com.             68855   IN      NS      ns4.google.com.
    google.com.             68855   IN      NS      ns1.google.com.
    
    ;; Query time: 0 msec
    ;; SERVER: 127.0.0.1#53(127.0.0.1)
    ;; WHEN: Sat Apr  7 00:33:47 2012
    ;; MSG SIZE  rcvd: 204
    

    All modern browsers do see the multiple IPs in the DNS response and will try the next one on the list if the first fails. Firefox (and probably others) caches which one eventually worked and uses it for future requests to that domain. Only the very first request to the down server will be slow while the browser figures it out; the rest will be as normal. This of course assumes that the down server is completely unresponsive. If it returns an error code, the browser will think it's fine.
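The retry behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular browser implements it; `connect_first_working` and its parameters are hypothetical names, and the list of `(ip, port)` tuples stands in for the A records returned by a single lookup.

```python
import socket


def connect_first_working(endpoints, timeout=2.0):
    """Return (socket, endpoint) for the first endpoint that accepts a TCP connection.

    endpoints is a list of (ip, port) tuples, tried in order.
    Raises OSError if none of them can be reached.
    """
    last_error = None
    for endpoint in endpoints:
        try:
            # create_connection raises OSError on refusal or timeout
            sock = socket.create_connection(endpoint, timeout=timeout)
            return sock, endpoint
        except OSError as exc:
            last_error = exc  # remember why this address failed, then try the next
    raise OSError("all %d addresses failed" % len(endpoints)) from last_error
```

Note that this only catches a server that refuses or never answers the connection. A server that accepts the connection and returns an HTTP error would count as "working" here, which matches the caveat above.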

    Thanked by 3raindog308 lbft mrm2005
  • lbft Member

    It's not HA but it helps (and it's the best you're likely to get with LEBs). It's not going to get you 100% site uptime but either of those reduces the impact of a failure a little bit:

    • Round robin DNS means that your requests will be roughly distributed across a number of IP addresses - so if, say, one of three servers goes down, two thirds of your visitors are still hitting the working servers (assuming the DNS cache lasts their entire browsing session; if it doesn't, then only 2/3 of their requests will succeed, although keepalive could improve that number).
    • Low-TTL round-robin DNS with automated changing - it reduces the outage window to the length of the caching of the response. Akamai does this plus geoip for their CDN, I think, so it must show some improvement.
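The arithmetic behind the first bullet can be written out as a short Python sketch. It assumes requests are spread evenly across the addresses, which round robin only approximates; the function name and its `client_retries` flag are made up for illustration.

```python
from fractions import Fraction


def failing_fraction(total_servers, down_servers, client_retries=False):
    """Fraction of requests that end up failing, assuming an even spread.

    client_retries=True models a client that moves on to the other
    addresses when one does not answer, so requests only fail when
    every server is down.
    """
    if total_servers <= 0 or not 0 <= down_servers <= total_servers:
        raise ValueError("need 0 <= down_servers <= total_servers, total_servers > 0")
    if client_retries and down_servers < total_servers:
        return Fraction(0)
    return Fraction(down_servers, total_servers)
```

For the example in the bullet, `failing_fraction(3, 1)` gives 1/3, so two thirds of visitors still reach a working server. The two-IP case from the opening post, `failing_fraction(2, 1)`, gives 1/2.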
    Thanked by 1raindog308
  • raindog308 Administrator, Veteran

    Thanks. I guess I wasn't aware that browsers get all the addresses...though now that I pause to read the getaddrinfo page, it's plain ("getaddrinfo() returns one or more addrinfo structures"). Probably the same on Windows/Mac-based libraries.
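Python exposes the same call as `socket.getaddrinfo`, so the "one or more addrinfo structures" point is easy to see for yourself. A minimal sketch; `resolve_all` is a made-up helper name, and the default port of 80 is just an assumption for a web server.

```python
import socket


def resolve_all(host, port=80):
    """Return every distinct IP address getaddrinfo reports for host, in order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for IPv4, (ip, port, flow, scope) for IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve_all` on a round-robin hostname should return several addresses rather than one, though the exact list and its order depend on the resolver you are using.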

  • MrAndroid Member
    edited April 2012

    @NickW said: All modern browsers do see the multiple IPs in the DNS response and will try the next one on the list if the first fails. Firefox (and probably others) caches which one eventually worked and uses it for future requests to that domain. Only the very first request to the down server will be slow while the browser figures it out; the rest will be as normal. This of course assumes that the down server is completely unresponsive. If it returns an error code, the browser will think it's fine.

    I thought it's the OS that runs the DNS lookup, and the browser simply gets the result from it?

    @raindog308 said: Thanks. I guess I wasn't aware that browsers get all the addresses...though now that I pause to read the getaddrinfo page, it's plain ("getaddrinfo() returns one or more addrinfo structures"). Probably the same on Windows/Mac-based libraries.

    getaddrinfo() is POSIX, but a similar version also exists on Windows.

  • WHT had this debate on Round Robin DNS at http://www.webhostingtalk.com/showthread.php?t=1117385. It's a nice read, as it gets more technical further into the thread, with posts by folks who have had hands-on experience with it, including how the browser deals with it.

    Thanked by 2yomero mrm2005
  • NickW Member

    "WHT had this debate on Round Robin DNS at http://www.webhostingtalk.com/showthread.php?t=1117385. It's a nice read, as it gets more technical further into the thread, with posts by folks who have had hands-on experience with it, including how the browser deals with it."

    "mugo" in that thread is wrong and pushing out-of-date crap on the subject. Maybe once upon a time, 10+ years ago, it was correct.

    "I thought it's the OS that runs the DNS lookup, and the browser simply gets the result from it?"

    It does not matter either way as the OS's lookup is perfectly capable of returning multiple A records. For example, on Windows:

    C:\Users\Nick>nslookup www.youtube.com
    Server:  home
    Address:  192.168.1.1
    
    Non-authoritative answer:
    Name:    youtube-ui.l.google.com
    Addresses:  173.194.41.99
              173.194.41.104
              173.194.41.98
              173.194.41.102
              173.194.41.96
              173.194.41.105
              173.194.41.103
              173.194.41.101
              173.194.41.110
              173.194.41.97
              173.194.41.100
    Aliases:  www.youtube.com
    
  • Please always bear in mind that a number of recursive DNS servers do not respect TTL, so DNS should never be used for HA.

  • Quick question, how do you do the green boxes?

  • NickW Member
    edited April 2012

    I do a greater than symbol > then a space and then enclose everything in double quotes "". For a quotation.

    For a green box, put four extra spaces in front of every line.

  • NickW Member
    edited April 2012

    If there are still any disbelievers, here's an ancient piece of Mozilla documentation: http://www-archive.mozilla.org/docs/netlib/dns.html Read the section under the round robin support heading.

    If they managed to implement it god knows how many years ago, I would assume all modern browsers have too.

    Thanked by 1mrm2005
  • othelloRob Member, Host Rep

    @raindog308 said:
    Round robin DNS is fine for load balancing, but not for HA

    It's not really any use for HA, as routers, browsers, ISPs etc. cache the DNS result and ignore your TTLs.

    Thanked by 1marrco
  • lbft Member
    edited April 2012

    Also, for those using Cloudflare, it will only try one IP per page load rather than using browser-like retry behaviour. The backend IP to use appears to be selected at random and, if it's down, it'll either serve a cached version or throw an error message.

  • CloudxtnyHost Member, Host Rep

    Rather than doing round-robin, you'd want a DNS server that has testing implemented within its core. For example, 4psa DNS allows you to set up tests for records (for example, ping this IP), and if the test fails the IP is removed from DNS.
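The "test each record and drop the ones that fail" idea can be sketched in Python as follows. This is not the 4psa or PowerDNS API; the function names are made up, and a plain TCP connect stands in for the ping test mentioned above (an assumption chosen because it needs no special privileges).

```python
import socket


def is_reachable(endpoint, timeout=2.0):
    """Return True if a TCP connection to (ip, port) succeeds within timeout."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        return False


def healthy_records(endpoints, timeout=2.0):
    """Keep only the endpoints that pass the check.

    The result is the set of addresses a health-checking DNS server
    would keep publishing; failed ones would be withdrawn.
    """
    return [endpoint for endpoint in endpoints if is_reachable(endpoint, timeout)]
```

A real setup would run this on a schedule and push the surviving addresses to the DNS server. Clients that already cached a withdrawn address would still be affected until their caches expire.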

    With PowerDNS you can do some very fancy custom coding to develop tests etc.

    The main problem with DNS is always going to be DNS Cache as @othellorob has pointed out.
