IPv6 Neighbor Discovery Responder for KVM VPS

yoursunny Member, IPv6 Advocate

This article was originally published on the yoursunny.com blog: https://yoursunny.com/t/2021/ndpresponder/

I Want IPv6 for Docker

I'm playing with Docker these days, and I want IPv6 in my Docker containers.
The best guide for enabling IPv6 in Docker is "How to enable IPv6 for Docker containers on Ubuntu 18.04".
The first method in that article assigns private IPv6 addresses to containers, and uses IPv6 NAT similar to how Docker handles IPv4 NAT.
I quickly got it working, but I noticed an undesirable behavior: Network Address Translation (NAT) changes the source port number of outgoing UDP datagrams, even if there's a port forwarding rule for inbound traffic; consequently, a UDP flow with the same source and destination ports is recognized as two separate flows.

$ docker exec nfd nfdc face show 262
    faceid=262
    remote=udp6://[2001:db8:f440:2:eb26:f0a9:4dc3:1]:6363
     local=udp6://[fd00:2001:db8:4d55:0:242:ac11:4]:6363
congestion={base-marking-interval=100ms default-threshold=65536B}
       mtu=1337
  counters={in={25i 4603d 2n 1179907B} out={11921i 14d 0n 1506905B}}
     flags={non-local permanent point-to-point congestion-marking}
$ docker exec nfd nfdc face show 270
    faceid=270
    remote=udp6://[2001:db8:f440:2:eb26:f0a9:4dc3:1]:1024
     local=udp6://[fd00:2001:db8:4d55:0:242:ac11:4]:6363
   expires=0s
congestion={base-marking-interval=100ms default-threshold=65536B}
       mtu=1337
  counters={in={11880i 0d 0n 1498032B} out={0i 4594d 0n 1175786B}}
     flags={non-local on-demand point-to-point congestion-marking}

The second method in that article allows every container to have a public IPv6 address.
It avoids NAT and the problems that come with it, but requires the host to have a routed IPv6 subnet.
However, routed IPv6 is hard to come by on KVM servers, because virtualization platforms such as Virtualizor do not support routed IPv6 subnets and can only provide on-link IPv6.

On-Link IPv6 vs Routed IPv6

So what's the difference between on-link IPv6 and routed IPv6, anyway?
The difference lies in how the router at the previous hop is configured to reach a destination IP address.

Let me explain in IPv4 terms first:

|--------| 192.0.2.1/24       |--------| 198.51.100.1/24    |-----------|
| router |--------------------| server |--------------------| container |
|--------|       192.0.2.2/24 |--------|    198.51.100.2/24 |-----------|
            (192.0.2.16-23/24)    |
                                  | 192.0.2.17/28           |-----------|
                                  \-------------------------| container |
                                              192.0.2.18/28 |-----------|
  • The server has on-link IP address 192.0.2.2.

    • The router knows this IP address is on-link because it is in the 192.0.2.0/24 subnet that is configured on the router interface.
    • To deliver a packet to 192.0.2.2, the router sends an ARP query for 192.0.2.2 to learn the server's MAC address, and the server answers this query.
  • The server has routed IP subnet 198.51.100.0/24.

    • The router must be configured to know: 198.51.100.0/24 is reachable via 192.0.2.2.
    • To deliver a packet to 198.51.100.2, the router first queries its routing table and finds the above entry, then sends an ARP query to learn the MAC address of 192.0.2.2 (answered by the server), and finally delivers the packet to the learned MAC address.
  • The main difference is which IP address is enclosed in the ARP query (a short sketch after this list makes the decision concrete):

    • If the destination IP address is an on-link IP address, the ARP query contains the destination IP address itself.
    • If the destination IP address is in a routed subnet, the ARP query contains the nexthop IP address, as determined by the routing table.
  • If I want to assign an on-link IPv4 address (e.g. 192.0.2.9/28) to a container, the server should be made to answer ARP queries for that IP address, so that the router delivers packets to the server, which then forwards them to the container.

    • This technique is called ARP proxy, in which the server responds to ARP queries on behalf of the container.
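
To make this concrete, here is a minimal Go sketch of the lookup rule described above; the route type and arpQueryTarget function are illustrative names, not real router code:

package main

import (
  "fmt"
  "net"
)

// route is a simplified routing table entry: destinations inside
// prefix are reachable via nextHop; a nil nextHop means on-link.
type route struct {
  prefix  *net.IPNet
  nextHop net.IP
}

// arpQueryTarget returns the address the router encloses in its ARP
// query when delivering to dst: the destination itself when on-link,
// or the nexthop when the destination is in a routed subnet.
func arpQueryTarget(table []route, dst net.IP) (net.IP, error) {
  for _, r := range table {
    if r.prefix.Contains(dst) {
      if r.nextHop == nil {
        return dst, nil // on-link: query the destination itself
      }
      return r.nextHop, nil // routed: query the nexthop
    }
  }
  return nil, fmt.Errorf("no route to %v", dst)
}

func main() {
  _, connected, _ := net.ParseCIDR("192.0.2.0/24")
  _, routed, _ := net.ParseCIDR("198.51.100.0/24")
  table := []route{
    {prefix: connected},                                 // on-link subnet
    {prefix: routed, nextHop: net.ParseIP("192.0.2.2")}, // via the server
  }
  a, _ := arpQueryTarget(table, net.ParseIP("192.0.2.2"))    // 192.0.2.2
  b, _ := arpQueryTarget(table, net.ParseIP("198.51.100.2")) // 192.0.2.2
  fmt.Println(a, b)
}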

The situation is a bit more complex in IPv6 because each network interface can have multiple IPv6 addresses, but the same concept applies.
Instead of Address Resolution Protocol (ARP), IPv6 uses Neighbor Discovery Protocol that is part of ICMPv6.
A few terms differ:

IPv4      | IPv6
----------|----------------------------------
ARP       | Neighbor Discovery Protocol (NDP)
ARP query | ICMPv6 Neighbor Solicitation
ARP reply | ICMPv6 Neighbor Advertisement
ARP proxy | NDP proxy

If I want to assign an on-link IPv6 address to a container, the server should respond to neighbor solicitations for that IP address, so that the router would deliver packets to the server.
After that, the server's Linux kernel can route the packet to the container's bridge, as if the destination IPv6 address were in a routed subnet.
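
For this to work, the server must also be allowed to forward packets. As a hedged illustration (the function name and the uplink interface are placeholders), a small Go helper that flips the relevant standard Linux sysctls:

package main

import (
  "fmt"
  "os"
)

// enableNDPProxying enables IPv6 forwarding so the kernel will route
// packets toward the container bridge, and proxy_ndp for the kernel's
// built-in NDP proxy (userspace responders such as ndppd answer
// neighbor solicitations themselves and do not need proxy_ndp).
func enableNDPProxying(iface string) error {
  sysctls := []struct{ path, value string }{
    {"/proc/sys/net/ipv6/conf/all/forwarding", "1"},
    {fmt.Sprintf("/proc/sys/net/ipv6/conf/%s/proxy_ndp", iface), "1"},
  }
  for _, s := range sysctls {
    if err := os.WriteFile(s.path, []byte(s.value), 0o644); err != nil {
      return fmt.Errorf("sysctl %s: %w", s.path, err)
    }
  }
  return nil
}

func main() {
  if err := enableNDPProxying("uplink"); err != nil {
    panic(err)
  }
}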

NDP Proxy Daemon to the Rescue, I Hope?

ndppd, or NDP Proxy Daemon, is a program that listens for neighbor solicitations on a network interface and responds with neighbor advertisements.
It is often recommended for the scenario in which the server has only on-link IPv6 but a routed IPv6 subnet is needed.

I installed ndppd on one of my servers, and it worked as expected with this configuration:

proxy uplink {
  rule 2001:db8:fbc0:2:646f:636b:6572::/112 {
    auto
  }
}

I can start up a Docker container with a public IPv6 address.
It can reach the IPv6 Internet and can be pinged from outside.

$ docker network create --ipv6 --subnet=172.26.0.0/16 \
  --subnet=2001:db8:fbc0:2:646f:636b:6572::/112 ipv6exposed
118c3a9e00595262e41b8cb839a55d1bc7bc54979a1ff76b5993273d82eea1f4

$ docker run -it --rm --network ipv6exposed \
  --ip6 2001:db8:fbc0:2:646f:636b:6572:d002 alpine

# wget -q -O- https://www.cloudflare.com/cdn-cgi/trace | grep ip
ip=2001:db8:fbc0:2:646f:636b:6572:d002

However, when I repeated the same setup on another KVM server, things didn't go well: the container could not reach the IPv6 Internet at all.

$ docker run -it --rm --network ipv6exposed \
  --ip6 2001:db8:f440:2:646f:636b:6572:d003 alpine

/ # ping -c 4 ipv6.google.com
PING ipv6.google.com (2607:f8b0:400a:809::200e): 56 data bytes

--- ipv6.google.com ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss

What's Wrong with ndppd?

Why does ndppd work on the first server but not on the second?
What's the difference?
We need to go deeper, so I turned to tcpdump.

On the first server, I see:

$ sudo tcpdump -pi uplink icmp6
19:13:17.958191 IP6 2001:db8:fbc0::1 > ff02::1:ff72:d002:
    ICMP6, neighbor solicitation, who has 2001:db8:fbc0:2:646f:636b:6572:d002, length 32
19:13:17.958472 IP6 2001:db8:fbc0:2::2 > 2001:db8:fbc0::1:
    ICMP6, neighbor advertisement, tgt is 2001:db8:fbc0:2:646f:636b:6572:d002, length 32
  • The neighbor solicitation from the router comes from a global IPv6 address.
  • The server responds with a neighbor advertisement from its global IPv6 address.
    Note that this address differs from the container's address.

  • IPv6 works in the container.

On the second server, I see:

$ sudo tcpdump -pi uplink icmp6
00:07:53.617438 IP6 fe80::669d:99ff:feb1:55b8 > ff02::1:ff72:d003:
    ICMP6, neighbor solicitation, who has 2001:db8:f440:2:646f:636b:6572:d003, length 32
00:07:53.617714 IP6 fe80::216:3eff:fedd:7c83 > fe80::669d:99ff:feb1:55b8:
    ICMP6, neighbor advertisement, tgt is 2001:db8:f440:2:646f:636b:6572:d003, length 32
  • The neighbor solicitation from the router comes from a link-local IPv6 address.
  • The server responds with a neighbor advertisement from its link-local IPv6 address.
  • IPv6 does not work in the container.

Since IPv6 has been working on the second server for IPv6 addresses assigned to the server itself, I added a new IPv6 address and captured its NDP exchange:

$ sudo tcpdump -pi uplink icmp6
00:29:39.378544 IP6 fe80::669d:99ff:feb1:55b8 > ff02::1:ff00:a006:
    ICMP6, neighbor solicitation, who has 2001:db8:f440:2::a006, length 32
00:29:39.378581 IP6 2001:db8:f440:2::a006 > fe80::669d:99ff:feb1:55b8:
    ICMP6, neighbor advertisement, tgt is 2001:db8:f440:2::a006, length 32
  • The neighbor solicitation from the router comes from a link-local IPv6 address, same as above.
  • The server responds with a neighbor advertisement from the target global IPv6 address.
  • IPv6 works on the server from this address.

In IPv6, each network interface can have multiple IPv6 addresses.
When the Linux kernel responds to a neighbor solicitation in which the target address is assigned to the same network interface, it uses that particular address as the source address.
On the other hand, ndppd transmits neighbor advertisements via an AF_INET6 socket and does not specify the source address.
In this case, some complicated rules for default address selection come into play.

One of these rules is to prefer a source address that has the same scope as the destination address (here, the router's address).
On my first server, the router uses a global address, and the server selects a global address as the source address on its neighbor advertisement.
On my second server, the router uses a link-local address, and the server selects a link-local address, too.
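
The effect can be sketched in a few lines of Go; pickSource is a crude simplification of RFC 6724 source address selection, not the kernel's full algorithm:

package main

import (
  "fmt"
  "net"
)

// pickSource prefers a candidate address in the same scope
// (link-local vs. global) as the destination.
func pickSource(candidates []net.IP, dst net.IP) net.IP {
  for _, c := range candidates {
    if c.IsLinkLocalUnicast() == dst.IsLinkLocalUnicast() {
      return c
    }
  }
  return candidates[0]
}

func main() {
  addrs := []net.IP{
    net.ParseIP("fe80::216:3eff:fedd:7c83"), // link-local
    net.ParseIP("2001:db8:f440:2::2"),       // global
  }
  // Second server: the router solicits from a link-local address,
  // so a link-local source is selected.
  fmt.Println(pickSource(addrs, net.ParseIP("fe80::669d:99ff:feb1:55b8")))
  // First server: the router solicits from a global address,
  // so a global source is selected.
  fmt.Println(pickSource(addrs, net.ParseIP("2001:db8:fbc0::1")))
}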

In an unfiltered network, the router wouldn't care where the neighbor advertisements come from.
However, when it comes to a KVM server on Virtualizor, the hypervisor would treat such packets as attempted IP spoofing attacks, and drop them via ebtables rules.
Consequently, the neighbor advertisement never reaches the router, and the router has no way to know how to reach the container's IPv6 address.

ndpresponder: NDP Responder for KVM VPS

I tried a few tricks such as deprecating the link-local address, but none of them worked.
Thus, I made my own NDP responder that sends neighbor advertisements from the target address.

ndpresponder is a Go program using the GoPacket library.

  1. The program opens an AF_PACKET socket, with a BPF filter for ICMPv6 neighbor solicitation messages.
  2. When a neighbor solicitation arrives, it checks the target address against a user-supplied IP range.
  3. If the target address is in the range used for Docker containers, the program constructs an ICMPv6 neighbor advertisement message and transmits it through the same AF_PACKET socket.

A major difference from ndppd is that the source IPv6 address on a neighbor advertisement message is always set to the same value as the target address of the neighbor solicitation, so that the message isn't dropped by the hypervisor.
This is made possible because I'm sending the message via an AF_PACKET socket, instead of the AF_INET6 socket used by ndppd.
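
To illustrate step 3 above, here is a minimal sketch of building such a neighbor advertisement with GoPacket; buildAdvert is a made-up helper rather than the actual ndpresponder source, and sending the bytes through the AF_PACKET socket is omitted:

package main

import (
  "net"

  "github.com/google/gopacket"
  "github.com/google/gopacket/layers"
)

// buildAdvert answers a neighbor solicitation received from
// routerMAC/routerIP asking about target. The key point: the IPv6
// source address is the solicited target itself, so the hypervisor's
// anti-spoofing filter accepts the packet.
func buildAdvert(ifaceMAC, routerMAC net.HardwareAddr, routerIP, target net.IP) ([]byte, error) {
  eth := layers.Ethernet{
    SrcMAC:       ifaceMAC,
    DstMAC:       routerMAC,
    EthernetType: layers.EthernetTypeIPv6,
  }
  ip6 := layers.IPv6{
    Version:    6,
    SrcIP:      target, // source address = solicited target address
    DstIP:      routerIP,
    NextHeader: layers.IPProtocolICMPv6,
    HopLimit:   255, // NDP messages must have hop limit 255
  }
  icmp6 := layers.ICMPv6{
    TypeCode: layers.CreateICMPv6TypeCode(layers.ICMPv6TypeNeighborAdvertisement, 0),
  }
  icmp6.SetNetworkLayerForChecksum(&ip6)
  adv := layers.ICMPv6NeighborAdvertisement{
    Flags:         0x60, // solicited | override
    TargetAddress: target,
    Options: layers.ICMPv6Options{
      {Type: layers.ICMPv6OptTargetAddress, Data: ifaceMAC}, // target link-layer address
    },
  }
  buf := gopacket.NewSerializeBuffer()
  err := gopacket.SerializeLayers(buf,
    gopacket.SerializeOptions{FixLengths: true, ComputeChecksums: true},
    &eth, &ip6, &icmp6, &adv)
  return buf.Bytes(), err
}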

ndpresponder operates similarly to ndppd in "static" mode.
It does not relay neighbor discovery into the destination subnet the way ndppd's "auto" mode does, but this feature isn't important on a KVM server.

If ndppd doesn't seem to work on your KVM VPS, give ndpresponder a try!
Head to my GitHub repository for installation and usage instructions:
https://github.com/yoursunny/ndpresponder

Comments

  • Nice write-up!

    Yet there are easier hacks to achieve that. First, if you need to proxy only a few addresses or a small subnet, you can use the Linux kernel's neighbour discovery proxy, e.g.,

    sysctl net.ipv6.conf.eth0.proxy_ndp=1
    ip -6 neigh add proxy 2001:db8::1234 dev eth0 nud permanent
    

    Second, to work around the Virtualizor ebtables rule, you can use ip6tables' SNPT, e.g.,

    ip6tables -t mangle -I POSTROUTING -o eth0 -p icmpv6 --icmpv6-type neighbour-advertisement -j SNPT --src-pfx fe80::/64 --dst-pfx 2001:db8::/64

    (This will make your neighbour advertisements originate from an address that is derived from your MAC address, but that's within your public subnet, and neither Virtualizor nor the router cares.)

    Finally, because the SNPT rule works on packets sent by the kernel ndp proxy, you can combine these commands, and achieve your goal without any userspace daemon or special tools.

  • yoursunny Member, IPv6 Advocate

    @psb777 said:
    Yet there are easier hacks to achieve that.

    I tend to do things the hard way ™️ .

    The article was written on April 10.
    Since then, I discovered that if I delete the IPv6 link-local address on the interface (not just set it as deprecated), ndppd works as well.
    In Netplan, this can be achieved with link-local: [].

    if you need to proxy only a few addresses or a small subnet, you can use the Linux kernel's neighbour discovery proxy

    Yes, this is described in the first reference.
    Each command can set one address only, not a subnet.

    This doesn't integrate well with Docker: docker run cannot execute those commands automatically.
    It's possible to run a separate container that monitors container creation events and sets proxy rules accordingly (a rough sketch follows).
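
    As an illustration of that idea (Docker Go SDK details vary across versions, and uplink is a placeholder interface name), a monitor could watch container start events and install kernel NDP proxy entries:

    package main

    import (
      "context"
      "log"
      "os/exec"

      "github.com/docker/docker/api/types"
      "github.com/docker/docker/api/types/events"
      "github.com/docker/docker/client"
    )

    func main() {
      cli, err := client.NewClientWithOpts(client.FromEnv)
      if err != nil {
        log.Fatal(err)
      }
      ctx := context.Background()
      msgs, errs := cli.Events(ctx, types.EventsOptions{})
      for {
        select {
        case m := <-msgs:
          // React to container start events only.
          if m.Type != events.ContainerEventType || m.Action != "start" {
            continue
          }
          info, err := cli.ContainerInspect(ctx, m.Actor.ID)
          if err != nil {
            continue
          }
          // Add a kernel NDP proxy entry for each global IPv6 address.
          for _, nw := range info.NetworkSettings.Networks {
            if nw.GlobalIPv6Address != "" {
              exec.Command("ip", "-6", "neigh", "add", "proxy",
                nw.GlobalIPv6Address, "dev", "uplink").Run()
            }
          }
        case err := <-errs:
          log.Fatal(err)
        }
      }
    }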

    you can use ip6tables' SNPT

    It's the second thing I tried, but I'm too dumb to come up with the command line.
    Or is it UFW interfering?

  • @yoursunny said: I tend to do things the hard way ™️ .

    Hey, wait a minute there! I thought that was my trademark :D

    Sometimes I both voluntarily and involuntarily build myself intricate mazes of hoops to jump through; too bad one can only see this in hindsight. On a positive note though, this oftentimes does have the nice side effect of one learning new interesting things.

  • mcgree Member
    edited October 2021

    I hope you can help me answer some questions I don't understand.

    Both packet captures use ndppd for the NDP proxy.

    ⭐ = same packets 😂 = other data 😊 = neighbor discovery

    I see from your blog that ndppd sends from its own link-local address, but in my packet capture it sends from its own globally routable address; yet even with this difference, the container is still unreachable.

    tcpdump during normal operation:

    ⭐ IP6 [Docker Prefix]::2 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has [Docker Prefix]::1, length 32
    ⭐ IP6 [Docker Prefix]::1 > [Docker Prefix]::2: ICMP6, neighbor advertisement, tgt is [Docker Prefix]::1, length 32
    😂 IP6 [Docker Prefix]::2 > dns.google: ICMP6, echo request, id 1, seq 0, length 64
    ⭐ IP6 :: > ff02::1:ff11:2: ICMP6, neighbor solicitation, who has fe80::42:acff:fe11:2, length 32
    😂 IP6 dns.google > [Docker Prefix]::2: ICMP6, echo reply, id 1, seq 0, length 64
    😂 IP6 [Docker Prefix]::2 > dns.google: ICMP6, echo request, id 1, seq 1, length 64
    😂 IP6 dns.google > [Docker Prefix]::2: ICMP6, echo reply, id 1, seq 1, length 64
    ⭐ IP6 fe80::42:acff:fe11:2 > ip6-allrouters: ICMP6, router solicitation, length 16
    😂 IP6 [Docker Prefix]::2 > dns.google: ICMP6, echo request, id 1, seq 2, length 64
    😂 IP6 dns.google > [Docker Prefix]::2: ICMP6, echo reply, id 1, seq 2, length 64
    😂 IP6 [Docker Prefix]::2 > dns.google: ICMP6, echo request, id 1, seq 3, length 64
    😂 IP6 dns.google > [Docker Prefix]::2: ICMP6, echo reply, id 1, seq 3, length 64

    ⭐ IP6 [Docker Prefix]::2 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has [Docker Prefix]::1, length 32
    ⭐ IP6 [Docker Prefix]::1 > [Docker Prefix]::2: ICMP6, neighbor advertisement, tgt is [Docker Prefix]::1, length 32
    😂 IP6 [Docker Prefix]::2 > dns.google: ICMP6, echo request, id 1, seq 0, length 64
    ⭐ IP6 :: > ff02::1:ff11:2: ICMP6, neighbor solicitation, who has fe80::42:acff:fe11:2, length 32
    😂 IP6 [Docker Prefix]::2 > dns.google: ICMP6, echo request, id 1, seq 1, length 64
    ⭐ IP6 fe80::42:acff:fe11:2 > ip6-allrouters: ICMP6, router solicitation, length 16
    😂 IP6 [Docker Prefix]::2 > dns.google: ICMP6, echo request, id 1, seq 2, length 64
    😂 IP6 [Docker Prefix]::2 > dns.google: ICMP6, echo request, id 1, seq 3, length 64
    😊 IP6 fe80::1 > [Docker Prefix]::2: ICMP6, neighbor solicitation, who has [Docker Prefix]::2, length 32
    😊 [Docker Prefix]::2 > fe80::1: ICMP6, neighbor advertisement, tgt is [Docker Prefix]::2, length 24
    😊 IP6 fe80::42:acff:fe11:2 > ip6-allrouters: ICMP6, router solicitation, length 16
    😊 IP6 fe80::42:acff:fe11:2 > fe80::1: ICMP6, neighbor solicitation, who has fe80::1, length 32
    😊 IP6 fe80::1 > fe80::42:acff:fe11:2: ICMP6, neighbor advertisement, tgt is fe80::1, length 24
