Wan-Load-Balance rules to allow local traffic

I’ve got an issue with pdns not forwarding queries for hosts to external DNS servers when load balancing is enabled; see the topic powerdns-recursor-returns-servfail-with-wan-load-balancer. I’m convinced it’s the rules, but I can’t figure it out.

As per the other topic linked above, it can forward DNS requests to another server on the LAN that doesn’t have balancing enabled (and instead goes directly through one of the 3 gateways), and it will return query results. But I’d rather not have another server just to forward requests through a single gateway.

Help to get this sorted is really appreciated, thanks.

Most likely WAN load-balancing doesn’t work well with locally generated traffic. I have been using WAN load-balancing for some time together with other VyOS (routing and firewall related) features on the same VyOS instance, and I always stumble into “corner cases”…

Here are some of my findings (I remember having issues with locally generated traffic and I decided against doing that):

Thanks for the info, that makes sense. I might play with the rules a bit and see what I can find out.

Thanks

This will most likely take some playing with on your part, but it is workable.

The most basic way I configure DNS with WAN load balancing is a /32 static route per upstream DNS provider out of a selected interface, typically the same DNS server I use for the interface health check. I also keep only one default route in the table for system traffic and for any traffic that is not load-balanced or overridden to an alternate WAN. I haven’t seen issues like the ones you describe in the referenced post when doing this.
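
To make the /32 approach concrete, here is a minimal sketch that renders one static host route per resolver. The resolver-to-next-hop pairing below is only an assumption borrowed from the lab addresses later in this thread, and static_route_commands is a hypothetical helper name. The generated lines use the usual "set protocols static route <prefix> next-hop <address>" form, so check them against your VyOS version before committing.

```python
import ipaddress


def static_route_commands(resolver_next_hops):
    """Render one /32 static route command per upstream resolver.

    resolver_next_hops maps a resolver IP to the gateway it should use.
    Both values are validated with the ipaddress module before rendering.
    """
    commands = []
    for resolver, next_hop in resolver_next_hops.items():
        host = ipaddress.ip_network(f"{resolver}/32")  # raises ValueError on bad input
        hop = ipaddress.ip_address(next_hop)
        commands.append(f"set protocols static route {host} next-hop {hop}")
    return commands


# Assumed pairing: one resolver per WAN gateway (lab addresses, not a recommendation).
PINNED = {
    "8.8.8.8": "10.10.20.1",
    "8.8.4.4": "10.10.20.5",
    "1.1.1.1": "10.10.20.9",
}

for line in static_route_commands(PINNED):
    print(line)
```

With routes like these the resolver traffic never depends on the balancer's decision, which is why this variant tends to be the least surprising one.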

The more advanced way I’ve solved this on other deployments was to enable local traffic on WAN load balancing and create rules for the system traffic. Typically I do this so I can source OVPN tunnels from multiple ISPs that use NAT (LTE providers and WISPs are guilty of this) to the same endpoint on different ports for redundancy, but it should also work for DNS. I don’t have any deployments like this active currently, but I know it does work for system traffic. Typically I still leave a single default route in the table and only create load-balancing rules for system traffic that needs to egress a secondary interface.

Quick example from my lab:
+enable-local-traffic
+rule 10 {
+    description "SYSTEM FOR DNS1"
+    destination {
+        address 8.8.8.8
+        port 53
+    }
+    inbound-interface lo
+    interface eth0 {
+        weight 10
+    }
+    interface eth1 {
+        weight 5
+    }
+}
+rule 20 {
+    description "SYSTEM FOR DNS2"
+    destination {
+        address 8.8.4.4
+        port 53
+    }
+    inbound-interface lo
+    interface eth0 {
+        weight 5
+    }
+    interface eth1 {
+        weight 10
+    }
+}

Are you sure that defining inbound-interface lo actually works?

(I mean, sure, it generates the required iptables rules, but those won’t work, as locally generated traffic goes out through the OUTPUT chain, not the FORWARD one where WAN-LB creates them.)

Could you double check the iptables -t filter -S output?

Confirmed working in my test environment. If I remember correctly from past testing you don’t actually have to use lo for system balancing, but it helps to isolate the system traffic from any traffic you are servicing via the system.

vyos@vyos# show load-balancing wan
 enable-local-traffic
 interface-health eth0 {
     nexthop 10.10.20.1
     test 10 {
         target 10.10.10.1
         type ping
     }
 }
 interface-health eth1 {
     nexthop 10.10.20.5
     test 10 {
         target 10.10.10.5
         type ping
     }
 }
 interface-health eth2 {
     nexthop 10.10.20.9
     test 10 {
         target 10.10.10.9
         type ping
     }
 }
 rule 10 {
     destination {
         address 8.8.8.8
     }
     inbound-interface lo
     interface eth0 {
         weight 10
     }
 }
 rule 20 {
     destination {
         address 8.8.4.4
     }
     inbound-interface lo
     interface eth1 {
         weight 10
     }
 }
 rule 30 {
     destination {
         address 1.1.1.1
     }
     inbound-interface lo
     interface eth2 {
         weight 10
     }
 }
 rule 40 {
     inbound-interface lo
     interface eth0 {
         weight 10
     }
     interface eth1 {
         weight 10
     }
     interface eth2 {
         weight 10
     }
 }


vyos@vyos# run show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S>* 0.0.0.0/0 [1/0] via 10.10.20.1, eth0, 00:02:59
  *                 via 10.10.20.5, eth1, 00:02:59
  *                 via 10.10.20.9, eth2, 00:02:59
S>* 10.10.10.1/32 [1/0] via 10.10.20.1, eth0, 00:04:57
S>* 10.10.10.5/32 [1/0] via 10.10.20.5, eth1, 00:10:49
S>* 10.10.10.9/32 [1/0] via 10.10.20.9, eth2, 00:10:49
C>* 10.10.20.0/30 is directly connected, eth0, 00:20:04
C>* 10.10.20.4/30 is directly connected, eth1, 00:20:04
C>* 10.10.20.8/30 is directly connected, eth2, 00:20:04

vyos@vyos# run show wan-load-balance status
Chain WANLOADBALANCE_PRE (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ISP_eth0   all  --  lo     *       0.0.0.0/0            8.8.8.8              state NEW
    0     0 CONNMARK   all  --  lo     *       0.0.0.0/0            8.8.8.8              CONNMARK restore
    0     0 ISP_eth1   all  --  lo     *       0.0.0.0/0            8.8.4.4              state NEW
    0     0 CONNMARK   all  --  lo     *       0.0.0.0/0            8.8.4.4              CONNMARK restore
    0     0 ISP_eth2   all  --  lo     *       0.0.0.0/0            1.1.1.1              state NEW
    0     0 CONNMARK   all  --  lo     *       0.0.0.0/0            1.1.1.1              CONNMARK restore
   10   699 ISP_eth0   all  --  lo     *       0.0.0.0/0            0.0.0.0/0            state NEW statistic mode random probability 0.33333300008
   17  1191 ISP_eth1   all  --  lo     *       0.0.0.0/0            0.0.0.0/0            state NEW statistic mode random probability 0.50000000000
   11   768 ISP_eth2   all  --  lo     *       0.0.0.0/0            0.0.0.0/0            state NEW
   38  3722 CONNMARK   all  --  lo     *       0.0.0.0/0            0.0.0.0/0            CONNMARK restore
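
A side note on the probabilities 0.333…, 0.5 and (implicitly) 1 in the last three ISP_ethX rules: they follow from the rules being evaluated one after another. Each rule only sees the new connections that earlier rules did not take, so equal weights of 10/10/10 turn into 1/3, then 1/2 of the remainder, then everything left. The sketch below reproduces that arithmetic; the function names are mine, and the model is an assumption inferred from the output above rather than taken from the VyOS source.

```python
from fractions import Fraction


def cascade_probabilities(weights):
    """Per-rule match probability for a chain of sequential random rules.

    Rule i only sees connections that rules 0..i-1 passed over, so to end up
    with an overall share of w_i / sum(weights) it has to match with
    probability w_i / (w_i + ... + w_n). Weights are assumed to be positive.
    """
    remaining = sum(weights)
    probs = []
    for w in weights:
        probs.append(Fraction(w, remaining))
        remaining -= w
    return probs


def overall_shares(probs):
    """Overall fraction of new connections that each rule ends up taking."""
    shares = []
    not_taken = Fraction(1)
    for p in probs:
        shares.append(not_taken * p)
        not_taken *= 1 - p
    return shares


equal = cascade_probabilities([10, 10, 10])   # rule 40 in the lab config
print([str(p) for p in equal])
print([str(s) for s in overall_shares(equal)])
```

The same arithmetic applied to weights 10 and 5 gives 2/3 for the first rule and 1 for the second, i.e. a two-to-one split.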


vyos@vyos# sudo iptables -t filter -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N VYATTA_POST_FW_FWD_HOOK
-N VYATTA_POST_FW_IN_HOOK
-N VYATTA_POST_FW_OUT_HOOK
-N VYATTA_PRE_FW_FWD_HOOK
-N VYATTA_PRE_FW_IN_HOOK
-N VYATTA_PRE_FW_OUT_HOOK
-A INPUT -j VYATTA_PRE_FW_IN_HOOK
-A INPUT -j VYATTA_POST_FW_IN_HOOK
-A FORWARD -j VYATTA_PRE_FW_FWD_HOOK
-A FORWARD -j VYATTA_POST_FW_FWD_HOOK
-A OUTPUT -j VYATTA_PRE_FW_OUT_HOOK
-A OUTPUT -j VYATTA_POST_FW_OUT_HOOK
-A VYATTA_POST_FW_FWD_HOOK -j ACCEPT
-A VYATTA_POST_FW_IN_HOOK -j ACCEPT
-A VYATTA_POST_FW_OUT_HOOK -j ACCEPT
-A VYATTA_PRE_FW_FWD_HOOK -j RETURN
-A VYATTA_PRE_FW_IN_HOOK -j RETURN
-A VYATTA_PRE_FW_OUT_HOOK -j RETURN


vyos@vyos# traceroute 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
 1  10.10.20.1 (10.10.20.1)  0.648 ms  0.685 ms  0.384 ms
 2  10.10.10.1 (10.10.10.1)  1.016 ms  0.902 ms  0.949 ms
 3  192.168.122.1 (192.168.122.1)  1.069 ms  1.651 ms  4.158 ms
 4  192.168.220.2 (192.168.220.2)  3.912 ms  4.106 ms  7.379 ms^C
[edit]
vyos@vyos# traceroute 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
 1  10.10.20.1 (10.10.20.1)  1.167 ms  0.387 ms  0.369 ms
 2  10.10.10.1 (10.10.10.1)  1.317 ms  0.919 ms  1.259 ms
 3  192.168.122.1 (192.168.122.1)  1.481 ms  2.360 ms  2.190 ms
 4  192.168.220.2 (192.168.220.2)  9.797 ms  9.557 ms  8.544 ms^C
[edit]
vyos@vyos# traceroute 8.8.4.4
traceroute to 8.8.4.4 (8.8.4.4), 30 hops max, 60 byte packets
 1  10.10.20.5 (10.10.20.5)  0.887 ms  0.494 ms  0.349 ms
 2  10.10.10.5 (10.10.10.5)  2.130 ms  1.601 ms  2.018 ms
 3  192.168.122.1 (192.168.122.1)  3.642 ms  3.583 ms  3.537 ms
 4  192.168.220.2 (192.168.220.2)  2.617 ms  2.857 ms  3.362 ms^C
[edit]
vyos@vyos# traceroute 8.8.4.4
traceroute to 8.8.4.4 (8.8.4.4), 30 hops max, 60 byte packets
 1  10.10.20.5 (10.10.20.5)  1.027 ms  0.389 ms  0.414 ms
 2  10.10.10.5 (10.10.10.5)  1.399 ms  0.965 ms  1.055 ms
 3  192.168.122.1 (192.168.122.1)  2.273 ms  1.280 ms  1.295 ms
 4  192.168.220.2 (192.168.220.2)  1.554 ms  1.305 ms  1.965 ms^C
[edit]
vyos@vyos# traceroute 1.1.1.1
traceroute to 1.1.1.1 (1.1.1.1), 30 hops max, 60 byte packets
 1  10.10.20.9 (10.10.20.9)  1.056 ms  0.508 ms  0.381 ms
 2  10.10.10.9 (10.10.10.9)  1.038 ms  0.831 ms  1.934 ms
 3  192.168.122.1 (192.168.122.1)  2.842 ms  2.738 ms  2.381 ms
 4  192.168.220.2 (192.168.220.2)  4.454 ms  3.648 ms  2.812 ms^C
[edit]
vyos@vyos# traceroute 1.1.1.1
traceroute to 1.1.1.1 (1.1.1.1), 30 hops max, 60 byte packets
 1  10.10.20.9 (10.10.20.9)  0.511 ms  0.412 ms  0.344 ms
 2  10.10.10.9 (10.10.10.9)  1.031 ms  3.389 ms  4.590 ms
 3  192.168.122.1 (192.168.122.1)  6.687 ms  6.524 ms^C
[edit]

vyos@vyos# traceroute 4.2.2.2
traceroute to 4.2.2.2 (4.2.2.2), 30 hops max, 60 byte packets
 1  10.10.20.9 (10.10.20.9)  0.675 ms 10.10.20.5 (10.10.20.5)  0.809 ms 10.10.20.9 (10.10.20.9)  2.383 ms
 2  10.10.10.9 (10.10.10.9)  3.460 ms 10.10.10.5 (10.10.10.5)  9.580 ms 10.10.10.9 (10.10.10.9)  3.304 ms
 3  192.168.122.1 (192.168.122.1)  9.217 ms  8.753 ms  9.066 ms
 4  192.168.220.2 (192.168.220.2)  8.587 ms  8.544 ms  8.895 ms^C
[edit]
vyos@vyos# traceroute 4.2.2.2
traceroute to 4.2.2.2 (4.2.2.2), 30 hops max, 60 byte packets
 1  10.10.20.1 (10.10.20.1)  0.629 ms 10.10.20.5 (10.10.20.5)  0.482 ms  1.295 ms
 2  10.10.10.5 (10.10.10.5)  6.748 ms  6.667 ms 10.10.10.9 (10.10.10.9)  6.432 ms
 3  192.168.122.1 (192.168.122.1)  6.300 ms  6.246 ms  5.833 ms
 4  192.168.220.2 (192.168.220.2)  5.736 ms  5.651 ms  5.936 ms^C
[edit]
vyos@vyos# traceroute 4.2.2.2
traceroute to 4.2.2.2 (4.2.2.2), 30 hops max, 60 byte packets
 1  10.10.20.5 (10.10.20.5)  0.891 ms  1.661 ms  1.407 ms
 2  10.10.10.9 (10.10.10.9)  1.555 ms  3.987 ms 10.10.10.1 (10.10.10.1)  8.570 ms
 3  192.168.122.1 (192.168.122.1)  7.953 ms  7.515 ms  7.228 ms
 4  192.168.220.2 (192.168.220.2)  7.041 ms  6.598 ms  6.303 ms^C
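
The traceroutes above are consistent with the rules being evaluated in order, first match wins: the three resolver addresses each hit their own rule and always leave through one gateway, while 4.2.2.2 falls through to rule 40 and is spread over all three. Here is a small sketch of that lookup, with the rule table copied from the lab config; the first-match model and the function name are my assumptions.

```python
import ipaddress

# (rule number, destination prefix, interfaces with weights), from the lab config.
# Rule 40 has no destination, modelled here as 0.0.0.0/0.
RULES = [
    (10, "8.8.8.8/32", {"eth0": 10}),
    (20, "8.8.4.4/32", {"eth1": 10}),
    (30, "1.1.1.1/32", {"eth2": 10}),
    (40, "0.0.0.0/0", {"eth0": 10, "eth1": 10, "eth2": 10}),
]


def candidate_interfaces(destination):
    """Return (rule number, interfaces) for the first rule matching destination."""
    addr = ipaddress.ip_address(destination)
    for number, prefix, interfaces in RULES:
        if addr in ipaddress.ip_network(prefix):
            return number, sorted(interfaces)
    return None, []


for target in ("8.8.8.8", "8.8.4.4", "1.1.1.1", "4.2.2.2"):
    print(target, candidate_interfaces(target))
```

Since each traceroute probe is a new flow, a destination that only matches rule 40 can show different gateways within a single run, which is what the 4.2.2.2 output looks like.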

Specifically for the root of this topic, DNS resolution also works without issues:

vyos@vyos# show service dns
 forwarding {
     dnssec process-no-validate
     listen-address 127.0.0.1
     name-server 8.8.8.8
     name-server 8.8.4.4
     name-server 1.1.1.1
 }
[edit]


vyos@vyos# dig @127.0.0.1 A google.com facebook.com yahoo.com gmail.com +short
172.217.6.142
31.13.93.35
72.30.35.9
98.137.246.7
98.138.219.232
72.30.35.10
98.138.219.231
98.137.246.8
216.58.194.133

Could you also check iptables -t mangle -S? (I forgot that WAN-LB actually creates mangle rules…)

However, in your case the failover is handled by using a different resolver IP if one of the ethX goes down. (I.e. your DNS queries aren’t actually “load-balanced” but each IP is tied to a different interface; although if one is using Google or CloudFlare resolvers, sticking one IP to one interface yields the same results.)

I know they’re not load balanced with the example configuration pinning them to a single interface; it was just a simple example. If the resolver in use was 4.2.2.2 (Level3), then it would be balanced, as the traceroute examples show. Anyhow, I haven’t destroyed the test scenario yet, so here’s the full iptables dump before I do, if you’re interested in what’s being created.

vyos@vyos# sudo iptables-save
# Generated by iptables-save v1.4.21 on Fri May  3 20:32:00 2019
*mangle
:PREROUTING ACCEPT [1591:114762]
:INPUT ACCEPT [1679:120906]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [101:7571]
:POSTROUTING ACCEPT [1881:120525]
:ISP_eth0 - [0:0]
:ISP_eth1 - [0:0]
:ISP_eth2 - [0:0]
:WANLOADBALANCE_OUT - [0:0]
:WANLOADBALANCE_PRE - [0:0]
-A PREROUTING -j WANLOADBALANCE_PRE
-A OUTPUT -j WANLOADBALANCE_OUT
-A ISP_eth0 -j CONNMARK --set-xmark 0xc9/0xffffffff
-A ISP_eth0 -j MARK --set-xmark 0xc9/0xffffffff
-A ISP_eth0 -j ACCEPT
-A ISP_eth1 -j CONNMARK --set-xmark 0xca/0xffffffff
-A ISP_eth1 -j MARK --set-xmark 0xca/0xffffffff
-A ISP_eth1 -j ACCEPT
-A ISP_eth2 -j CONNMARK --set-xmark 0xcb/0xffffffff
-A ISP_eth2 -j MARK --set-xmark 0xcb/0xffffffff
-A ISP_eth2 -j ACCEPT
-A WANLOADBALANCE_OUT -m mark ! --mark 0x0 -j ACCEPT
-A WANLOADBALANCE_OUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A WANLOADBALANCE_OUT -s 127.0.0.0/8 -d 127.0.0.0/8 -j ACCEPT
-A WANLOADBALANCE_OUT -d 8.8.8.8/32 ! -o lo -m state --state NEW -j ISP_eth0
-A WANLOADBALANCE_OUT -d 8.8.8.8/32 ! -o lo -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A WANLOADBALANCE_OUT -d 8.8.4.4/32 ! -o lo -m state --state NEW -j ISP_eth1
-A WANLOADBALANCE_OUT -d 8.8.4.4/32 ! -o lo -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A WANLOADBALANCE_OUT -d 1.1.1.1/32 ! -o lo -m state --state NEW -j ISP_eth2
-A WANLOADBALANCE_OUT -d 1.1.1.1/32 ! -o lo -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A WANLOADBALANCE_OUT ! -o lo -m state --state NEW -m statistic --mode random --probability 0.33333300008 -j ISP_eth0
-A WANLOADBALANCE_OUT ! -o lo -m state --state NEW -m statistic --mode random --probability 0.50000000000 -j ISP_eth1
-A WANLOADBALANCE_OUT ! -o lo -m state --state NEW -j ISP_eth2
-A WANLOADBALANCE_OUT ! -o lo -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A WANLOADBALANCE_PRE -d 8.8.8.8/32 -i lo -m state --state NEW -j ISP_eth0
-A WANLOADBALANCE_PRE -d 8.8.8.8/32 -i lo -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A WANLOADBALANCE_PRE -d 8.8.4.4/32 -i lo -m state --state NEW -j ISP_eth1
-A WANLOADBALANCE_PRE -d 8.8.4.4/32 -i lo -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A WANLOADBALANCE_PRE -d 1.1.1.1/32 -i lo -m state --state NEW -j ISP_eth2
-A WANLOADBALANCE_PRE -d 1.1.1.1/32 -i lo -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A WANLOADBALANCE_PRE -i lo -m state --state NEW -m statistic --mode random --probability 0.33333300008 -j ISP_eth0
-A WANLOADBALANCE_PRE -i lo -m state --state NEW -m statistic --mode random --probability 0.50000000000 -j ISP_eth1
-A WANLOADBALANCE_PRE -i lo -m state --state NEW -j ISP_eth2
-A WANLOADBALANCE_PRE -i lo -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
COMMIT
# Completed on Fri May  3 20:32:00 2019
# Generated by iptables-save v1.4.21 on Fri May  3 20:32:00 2019
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [331:21607]
:POSTROUTING ACCEPT [91:6376]
:VYATTA_PRE_DNAT_HOOK - [0:0]
:VYATTA_PRE_SNAT_HOOK - [0:0]
:WANLOADBALANCE - [0:0]
-A PREROUTING -j VYATTA_PRE_DNAT_HOOK
-A POSTROUTING -j VYATTA_PRE_SNAT_HOOK
-A VYATTA_PRE_DNAT_HOOK -j RETURN
-A VYATTA_PRE_SNAT_HOOK -j WANLOADBALANCE
-A VYATTA_PRE_SNAT_HOOK -j RETURN
-A WANLOADBALANCE -m connmark --mark 0xc9 -j SNAT --to-source 10.10.20.2
-A WANLOADBALANCE -m connmark --mark 0xca -j SNAT --to-source 10.10.20.6
-A WANLOADBALANCE -m connmark --mark 0xcb -j SNAT --to-source 10.10.20.10
COMMIT
# Completed on Fri May  3 20:32:00 2019
# Generated by iptables-save v1.4.21 on Fri May  3 20:32:00 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:VYATTA_POST_FW_FWD_HOOK - [0:0]
:VYATTA_POST_FW_IN_HOOK - [0:0]
:VYATTA_POST_FW_OUT_HOOK - [0:0]
:VYATTA_PRE_FW_FWD_HOOK - [0:0]
:VYATTA_PRE_FW_IN_HOOK - [0:0]
:VYATTA_PRE_FW_OUT_HOOK - [0:0]
-A INPUT -j VYATTA_PRE_FW_IN_HOOK
-A INPUT -j VYATTA_POST_FW_IN_HOOK
-A FORWARD -j VYATTA_PRE_FW_FWD_HOOK
-A FORWARD -j VYATTA_POST_FW_FWD_HOOK
-A OUTPUT -j VYATTA_PRE_FW_OUT_HOOK
-A OUTPUT -j VYATTA_POST_FW_OUT_HOOK
-A VYATTA_POST_FW_FWD_HOOK -j ACCEPT
-A VYATTA_POST_FW_IN_HOOK -j ACCEPT
-A VYATTA_POST_FW_OUT_HOOK -j ACCEPT
-A VYATTA_PRE_FW_FWD_HOOK -j RETURN
-A VYATTA_PRE_FW_IN_HOOK -j RETURN
-A VYATTA_PRE_FW_OUT_HOOK -j RETURN
COMMIT
# Completed on Fri May  3 20:32:00 2019
# Generated by iptables-save v1.4.21 on Fri May  3 20:32:00 2019
*raw
:PREROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:VYATTA_CT_HELPER - [0:0]
:VYATTA_CT_IGNORE - [0:0]
:VYATTA_CT_OUTPUT_HOOK - [0:0]
:VYATTA_CT_PREROUTING_HOOK - [0:0]
:VYATTA_CT_TIMEOUT - [0:0]
:WLB_CONNTRACK - [0:0]
-A PREROUTING -j VYATTA_CT_IGNORE
-A PREROUTING -j VYATTA_CT_TIMEOUT
-A PREROUTING -j VYATTA_CT_PREROUTING_HOOK
-A PREROUTING -j WLB_CONNTRACK
-A PREROUTING -j NOTRACK
-A OUTPUT -j VYATTA_CT_IGNORE
-A OUTPUT -j VYATTA_CT_TIMEOUT
-A OUTPUT -j VYATTA_CT_OUTPUT_HOOK
-A OUTPUT -j WLB_CONNTRACK
-A OUTPUT -j NOTRACK
-A VYATTA_CT_HELPER -p tcp -m tcp --dport 1536 -j CT --helper tns
-A VYATTA_CT_HELPER -p tcp -m tcp --dport 1525 -j CT --helper tns
-A VYATTA_CT_HELPER -p tcp -m tcp --dport 1521 -j CT --helper tns
-A VYATTA_CT_HELPER -p udp -m udp --dport 111 -j CT --helper rpc
-A VYATTA_CT_HELPER -p tcp -m tcp --dport 111 -j CT --helper rpc
-A VYATTA_CT_HELPER -j RETURN
-A VYATTA_CT_IGNORE -j RETURN
-A VYATTA_CT_OUTPUT_HOOK -j RETURN
-A VYATTA_CT_PREROUTING_HOOK -j RETURN
-A VYATTA_CT_TIMEOUT -j RETURN
-A WLB_CONNTRACK -j ACCEPT
COMMIT
# Completed on Fri May  3 20:32:01 2019
[edit]
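
For anyone reading the dump: the hex marks are easier to follow in decimal. 0xc9, 0xca and 0xcb are 201, 202 and 203, one per WAN interface, and the nat table ties each mark to the SNAT source address of that interface. The sketch below pulls that mapping out of the relevant lines (copied verbatim from the dump above). It only covers what the dump shows; the policy-routing side that would act on the marks is not part of iptables-save, so it is left out here.

```python
import re

# Lines copied from the mangle and nat sections of the dump.
DUMP = """\
-A ISP_eth0 -j MARK --set-xmark 0xc9/0xffffffff
-A ISP_eth1 -j MARK --set-xmark 0xca/0xffffffff
-A ISP_eth2 -j MARK --set-xmark 0xcb/0xffffffff
-A WANLOADBALANCE -m connmark --mark 0xc9 -j SNAT --to-source 10.10.20.2
-A WANLOADBALANCE -m connmark --mark 0xca -j SNAT --to-source 10.10.20.6
-A WANLOADBALANCE -m connmark --mark 0xcb -j SNAT --to-source 10.10.20.10
"""

MARK_RE = re.compile(r"^-A ISP_(\S+) -j MARK --set-xmark (0x[0-9a-f]+)/")
SNAT_RE = re.compile(
    r"^-A WANLOADBALANCE -m connmark --mark (0x[0-9a-f]+) -j SNAT --to-source (\S+)"
)


def mark_table(dump):
    """Map each decimal mark to (interface, SNAT source address)."""
    iface_by_mark = {}
    snat_by_mark = {}
    for line in dump.splitlines():
        m = MARK_RE.match(line)
        if m:
            iface_by_mark[int(m.group(2), 16)] = m.group(1)
            continue
        m = SNAT_RE.match(line)
        if m:
            snat_by_mark[int(m.group(1), 16)] = m.group(2)
    return {
        mark: (iface, snat_by_mark.get(mark))
        for mark, iface in sorted(iface_by_mark.items())
    }


for mark, (iface, source) in mark_table(DUMP).items():
    print(f"mark {mark} ({mark:#x}) -> {iface}, SNAT to {source}")
```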

The resulting rules are quite strange… The WANLOADBALANCE_OUT rules use ! -o lo (i.e. “not lo” as the outbound interface), while your configuration states inbound-interface lo. But indeed they should do the trick.

Thanks for the full iptables dump.