Policy Routing Not Working with WireGuard

Hi,

I have an issue where I have set policy routing rules to route traffic based on source IP or destination IP over a WireGuard VPN.

The issue is that after a reboot the routing doesn’t work. After much troubleshooting, it seems the only way I can get it to work is to run these commands:

delete protocols static table 20 interface-route
commit
set protocols static table 20 interface-route 0.0.0.0/0 next-hop-interface wg2
commit

(where wg2 is my WireGuard interface)

This works 100% until the next reboot.
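
For anyone else hitting this, here is a rough sketch of the same workaround for more than one table. It only prints the config-mode commands so they can be reviewed and pasted after `configure`; it does not apply anything. The function name `pbr_reapply_commands` is made up for illustration, and the full `protocols static` path is an assumption based on the `[edit protocols static table 20]` path shown by `compare` in this thread.

```shell
#!/bin/sh
# Print the config-mode commands that delete and re-create the default
# interface route in one policy routing table. Output is plain text only;
# nothing is committed by this script.
# Arguments: $1 = routing table number, $2 = next-hop interface.
# pbr_reapply_commands is a hypothetical helper name, not a VyOS command.
pbr_reapply_commands() {
    table=$1
    iface=$2
    printf 'delete protocols static table %s interface-route\n' "$table"
    printf 'commit\n'
    printf 'set protocols static table %s interface-route 0.0.0.0/0 next-hop-interface %s\n' "$table" "$iface"
    printf 'commit\n'
}

# Table and interface pairs as used in this thread.
pbr_reapply_commands 20 wg2
pbr_reapply_commands 30 wg1
```

This is a stopgap until the root cause is found, not a fix.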

Any ideas?

Log a ticket, it sounds like you’ve found a reproducible bug!

Hi @phasma, could you please share the VyOS version on which you’re facing this issue, as well as the output of:

show configuration commands | strip-private

cat /var/log/messages | strip-private (after the router bootup process)

Thanks.

@phasma as far as I can see, both wg interfaces have the same configuration. Does this issue apply to both, or does the policy route for wg1 work fine after the reboot? Also, please provide the requested logs whenever possible; they might help us understand why the issue happens. Thanks.

Looking for alternative solutions to your situation:
Could you define table 20 and table 30 using a route with a next-hop address instead of an interface route?

set protocols static table 20 route xxx.xxx.0.0/0 next-hop remote_wireguard_IP_WG2
set protocols static table 30 route xxx.xxx.0.0/0 next-hop remote_wireguard_IP_WG1

The remote IPs often change, so I don’t think this would work.

@phasma

We tried to reproduce your issue, but after reboot, everything is working as expected in our lab.
Some things to consider:

  1. How did you get to version 1.3-rc6? Did you upgrade from a previous version, or was it a fresh install?
  2. Could you reboot again and, before committing the change that restores functionality, type “compare” and share the result? In the logs provided, we couldn’t find when policy POL-ROUTE-ETH1 was loaded.

Just to add to the conversation, I’m running 1.3 RC 6 with policy based routing with multiple WireGuard instances and interface defined routes and have not experienced any issues after rebooting. I am running a site to site, Mullvad client, and server (road warrior) and everything has come back up as expected after multiple reboots.

Hi,

I upgraded to 1.3 rc 6 via add system image.

Here is the process.

tracert before the fix on a fresh reboot

Tracing route to www.pandora.com [208.85.40.158]
over a maximum of 3 hops:

  1    <1 ms    <1 ms    <1 ms  vyos.xxxxxxxxxxx[192.168.83.254]
  2     4 ms     4 ms     4 ms  vt1.cor2.lond1.ptn.zen.net.uk [51.148.72.22]
  3     4 ms     4 ms     4 ms  lag-9.p1.thn-lon.zen.net.uk [51.148.73.160]

Deleting the interface routes, followed by a traceroute showing no change:

vyos@vyos# compare
[edit protocols static table 20]
-interface-route 0.0.0.0/0 {
-    next-hop-interface wg2 {
-    }
-}
[edit protocols static table 30]
-interface-route 0.0.0.0/0 {
-    next-hop-interface wg1 {
-    }
-}
[edit protocols static]

Tracing route to www.pandora.com [208.85.40.158]
over a maximum of 3 hops:

  1    <1 ms    <1 ms    <1 ms  vyos.xxxxxxxxxxxxxxxx [192.168.83.254]
  2     4 ms     4 ms     4 ms  vt1.cor2.lond1.ptn.zen.net.uk [51.148.72.22]
  3     4 ms     4 ms     4 ms  lag-9.p1.thn-lon.zen.net.uk [51.148.73.160]

Re-adding the WireGuard interface routes, followed by a traceroute showing it working:


vyos@vyos# compare
[edit protocols static table 20]
+interface-route 0.0.0.0/0 {
+    next-hop-interface wg2 {
+    }
+}
[edit protocols static table 30]
+interface-route 0.0.0.0/0 {
+    next-hop-interface wg1 {
+    }
+}
[edit protocols static]


Tracing route to www.pandora.com [208.85.40.158]
over a maximum of 3 hops:

  1    <1 ms    <1 ms    <1 ms  vyos.xxxxxxxxx [192.168.83.254]
  2    77 ms    76 ms    76 ms  10.13.0.1
  3    77 ms    77 ms    77 ms  te0-7-0-19.rcr22.b001362-2.jfk01.atlas.cogentco.com [38.142.116.241]

I also just tried 1.3 rc 5, as I still had that image: same issue.

What does the actual Linux route table look like in the error condition?
sudo ip route show table 20

Hello, @phasma!

To collect the necessary debug information, please share the output of the following commands, both when PBR works and when it does not:

sudo ip rule show
sudo ip r show table 20
sudo ip r show table 30
sudo ip r get [DST_ADDR] mark [PBR_MARK]
sudo nft list table ip mangle

where:
[DST_ADDR] - a destination address whose traffic should be routed via the wg interfaces
[PBR_MARK] - a mark from the sudo ip rule show output. There should be two of them.

This will give us a starting point for the investigation.
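
If typing the per-mark commands by hand is tedious, the mark and table pairs can be pulled out of the `ip rule show` output with a little awk. This is only a sketch: `pbr_marks` is a made-up helper name, and it assumes each rule line has the `fwmark <mark> lookup <table>` shape that `ip rule show` normally prints.

```shell
#!/bin/sh
# Read `ip rule show` output on stdin and print one "<mark> <table>" pair
# per rule that matches on a firewall mark. Rules without fwmark are skipped.
# pbr_marks is a hypothetical helper name used only for this example.
pbr_marks() {
    awk '{
        mark = ""; table = ""
        for (i = 1; i < NF; i++) {
            if ($i == "fwmark") mark = $(i + 1)
            if ($i == "lookup") table = $(i + 1)
        }
        if (mark != "") print mark, table
    }'
}

# Possible usage on the router (needs root; DST is the address to test):
#   sudo ip rule show | pbr_marks | while read -r mark table; do
#       sudo ip route show table "$table"
#       sudo ip route get "$DST" mark "$mark"
#   done
```

Run it once in the broken state after a reboot and once after the workaround, so the two outputs can be compared side by side.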

Rebooted fresh, and it is not working…

vyos:[~] $ sudo ip rule show
0:      from all lookup local
20:     from all fwmark 0x80000013 lookup 20
30:     from all fwmark 0x8000001d lookup 30
32766:  from all lookup main
32767:  from all lookup default

sudo ip r show table 20
[Nothing returned]

sudo ip r show table 30
[Nothing returned]

vyos:[~] $ sudo ip r get 208.85.40.158 mark 0x80000013
208.85.40.158 dev pppoe0 src 82.69.85.101 mark 0x80000013 uid 0
    cache

vyos:[~] $ sudo ip r get 95.211.189.152 mark 0x8000001d
95.211.189.152 dev pppoe0 src 82.69.85.101 mark 0x8000001d uid 0
    cache

table ip mangle {
        chain PREROUTING {
                type filter hook prerouting priority mangle; policy accept;
                counter packets 102454 bytes 55985143 jump VYATTA_FW_IN_HOOK
        }

        chain INPUT {
                type filter hook input priority mangle; policy accept;
        }

        chain FORWARD {
                type filter hook forward priority mangle; policy accept;
        }

        chain OUTPUT {
                type route hook output priority mangle; policy accept;
                counter packets 1730 bytes 187385 jump VYATTA_FW_LOCALOUT_HOOK
        }

        chain POSTROUTING {
                type filter hook postrouting priority mangle; policy accept;
                counter packets 100328 bytes 55611817 jump VYATTA_FW_OUT_HOOK
        }

        chain VYATTA_FW_OUT_HOOK {
        }

        chain VYATTA_FW_IN_HOOK {
                iifname "eth1" counter packets 55554 bytes 48513250 jump POL-ROUTE-ETH1
        }

        chain VYATTA_FW_LOCALOUT_HOOK {
        }

        chain POL-ROUTE-ETH1 {
                # match-set NET-STREAMING-DISNEY dst counter packets 0 bytes 0 jump VYATTA_PBR_20 comment "POL-ROUTE-ETH1-10"
                # match-set NET-STREAMING-PANDORA dst counter packets 86 bytes 5160 jump VYATTA_PBR_20 comment "POL-ROUTE-ETH1-11"
                # match-set ADR-ROUTE-TG-USA-DEDI-NY src counter packets 0 bytes 0 jump VYATTA_PBR_20 comment "POL-ROUTE-ETH1-12"
                # match-set NET-ROUTE-DEDI-USA-NY dst counter packets 0 bytes 0 jump VYATTA_PBR_20 comment "POL-ROUTE-ETH1-13"
                meta l4proto tcp # match-set ADR-ORYX src tcp dport { 119,563} counter packets 0 bytes 0 jump VYATTA_PBR_30 comment "POL-ROUTE-ETH1-20"
                # match-set ADR-ROUTE-TG-EU-UK src counter packets 0 bytes 0 jump VYATTA_PBR_30 comment "POL-ROUTE-ETH1-21"
                meta l4proto tcp # match-set ADR-ORYX src tcp dport { 119,563} counter packets 0 bytes 0 log prefix "[POL-ROUTE-ETH1-22-D] " comment "POL-ROUTE-ETH1-22"
                meta l4proto tcp # match-set ADR-ORYX src tcp dport { 119,563} counter packets 0 bytes 0 drop comment "POL-ROUTE-ETH1-22"
                counter packets 55468 bytes 48508090 return comment "POL-ROUTE-ETH1-10000 default-action accept"
        }

        chain VYATTA_PBR_20 {
                counter packets 86 bytes 5160 meta mark set 0x80000013
                counter packets 86 bytes 5160 accept
        }

        chain VYATTA_PBR_30 {
                counter packets 0 bytes 0 meta mark set 0x8000001d
                counter packets 0 bytes 0 accept
        }
}


And then once the fix is applied:

vyos:[~] $ sudo ip rule show
0:      from all lookup local
20:     from all fwmark 0x80000013 lookup 20
30:     from all fwmark 0x8000001d lookup 30
32766:  from all lookup main
32767:  from all lookup default


vyos:[~] $ sudo ip r show table 20
default nhid 26 dev wg2 proto static metric 20

vyos:[~] $ sudo ip r show table 30
default nhid 28 dev wg1 proto static metric 20

vyos:[~] $ sudo ip r get 208.85.40.158 mark 0x80000013
208.85.40.158 dev wg2 table 20 src 10.13.65.97 mark 0x80000013 uid 0
    cache

vyos:[~] $ sudo ip r get 95.211.189.152 mark 0x8000001d
95.211.189.152 dev wg1 table 30 src 10.13.108.177 mark 0x8000001d uid 0
    cache

table ip mangle {
        chain PREROUTING {
                type filter hook prerouting priority mangle; policy accept;
                counter packets 194554 bytes 106482687 jump VYATTA_FW_IN_HOOK
        }

        chain INPUT {
                type filter hook input priority mangle; policy accept;
        }

        chain FORWARD {
                type filter hook forward priority mangle; policy accept;
        }

        chain OUTPUT {
                type route hook output priority mangle; policy accept;
                counter packets 3265 bytes 330833 jump VYATTA_FW_LOCALOUT_HOOK
        }

        chain POSTROUTING {
                type filter hook postrouting priority mangle; policy accept;
                counter packets 191178 bytes 105852454 jump VYATTA_FW_OUT_HOOK
        }

        chain VYATTA_FW_OUT_HOOK {
        }

        chain VYATTA_FW_IN_HOOK {
                iifname "eth1" counter packets 103195 bytes 87881467 jump POL-ROUTE-ETH1
        }

        chain VYATTA_FW_LOCALOUT_HOOK {
        }

        chain POL-ROUTE-ETH1 {
                # match-set NET-STREAMING-DISNEY dst counter packets 0 bytes 0 jump VYATTA_PBR_20 comment "POL-ROUTE-ETH1-10"
                # match-set NET-STREAMING-PANDORA dst counter packets 150 bytes 9000 jump VYATTA_PBR_20 comment "POL-ROUTE-ETH1-11"
                # match-set ADR-ROUTE-TG-USA-DEDI-NY src counter packets 0 bytes 0 jump VYATTA_PBR_20 comment "POL-ROUTE-ETH1-12"
                # match-set NET-ROUTE-DEDI-USA-NY dst counter packets 0 bytes 0 jump VYATTA_PBR_20 comment "POL-ROUTE-ETH1-13"
                meta l4proto tcp # match-set ADR-ORYX src tcp dport { 119,563} counter packets 0 bytes 0 jump VYATTA_PBR_30 comment "POL-ROUTE-ETH1-20"
                # match-set ADR-ROUTE-TG-EU-UK src counter packets 0 bytes 0 jump VYATTA_PBR_30 comment "POL-ROUTE-ETH1-21"
                meta l4proto tcp # match-set ADR-ORYX src tcp dport { 119,563} counter packets 0 bytes 0 log prefix "[POL-ROUTE-ETH1-22-D] " comment "POL-ROUTE-ETH1-22"
                meta l4proto tcp # match-set ADR-ORYX src tcp dport { 119,563} counter packets 0 bytes 0 drop comment "POL-ROUTE-ETH1-22"
                counter packets 103045 bytes 87872467 return comment "POL-ROUTE-ETH1-10000 default-action accept"
        }

        chain VYATTA_PBR_20 {
                counter packets 150 bytes 9000 meta mark set 0x80000013
                counter packets 150 bytes 9000 accept
        }

        chain VYATTA_PBR_30 {
                counter packets 0 bytes 0 meta mark set 0x8000001d
                counter packets 0 bytes 0 accept
        }
}

Thanks a lot!
Now we know that the problem is that the routing entries are not being created during boot for some reason.
Could you also show the output of:

sudo vtysh -c 'show running-config' | tee
sudo journalctl -b /usr/lib/frr/staticd | tee

Here is the output after a fresh reboot

vyos:[~] $ sudo vtysh -c 'show running-config' | tee
Building configuration...

Current configuration:
!
frr version 7.5.1-20210801-00-g8bed329e4
frr defaults traditional
hostname vyos
log syslog
log facility local7
service integrated-vtysh-config
!
ip route 0.0.0.0/0 pppoe0
!
line vty
!
end

vyos:[~] $ sudo journalctl -b /usr/lib/frr/staticd | tee
-- Logs begin at Thu 2021-08-26 17:02:42 UTC, end at Thu 2021-08-26 17:03:31 UTC. --
-- No entries --

Here is the output once the fix has been applied

Building configuration...

Current configuration:
!
frr version 7.5.1-20210801-00-g8bed329e4
frr defaults traditional
hostname vyos
log syslog
log facility local7
service integrated-vtysh-config
!
ip route 0.0.0.0/0 wg1 table 30
ip route 0.0.0.0/0 wg2 table 20
ip route 0.0.0.0/0 pppoe0
!
line vty
!
end

vyos:[~] $ sudo journalctl -b /usr/lib/frr/staticd | tee
-- Logs begin at Thu 2021-08-26 15:14:48 UTC, end at Thu 2021-08-26 17:02:06 UTC. --
-- No entries --

Does some script run when the wg interface goes up? It might only handle the main route table.
Create a dummy route on the wg interface (for example, to 1.1.1.1/32) and see if this dummy route is present after start-up.

The problem is clear: the routes for tables 20 and 30 are not present in FRR after boot. The question is: why?
I would expect to see at least something in the logs that shows the reason, but they are empty.
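
To make that check repeatable after each reboot, the FRR running config can be scanned for `ip route ... table N` lines. A minimal sketch, assuming the line format shown in the outputs above; `frr_static_tables` and `missing_frr_tables` are made-up helper names, not FRR or VyOS commands.

```shell
#!/bin/sh
# Read an FRR running config on stdin and print the table numbers that
# have at least one "ip route ... table N" static route, sorted, unique.
frr_static_tables() {
    awk '$1 == "ip" && $2 == "route" {
        for (i = 3; i < NF; i++)
            if ($i == "table") print $(i + 1)
    }' | sort -n -u
}

# Read an FRR running config on stdin and report every table number given
# as an argument that has no static route in that config.
missing_frr_tables() {
    present=$(frr_static_tables)
    for t in "$@"; do
        printf '%s\n' "$present" | grep -qx "$t" ||
            printf 'table %s missing from FRR\n' "$t"
    done
}

# Possible usage on the router:
#   sudo vtysh -c 'show running-config' | missing_frr_tables 20 30
```

No output means every listed table has a static route in FRR; in the broken state above it would report both tables.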

I tested again with OpenVPN and got the same result. It doesn’t matter whether it’s WireGuard or not: the tables aren’t created at reboot. I also tried with just one table.

1.2.7 seems to handle it fine:

set protocols static interface-route 0.0.0.1/32 next-hop-interface wg0
set protocols static table 2 interface-route 0.0.0.2/32 next-hop-interface wg0

After reboot:

vyos@vyos:~$ sudo ip r s t 2
0.0.0.2 dev wg0 proto static metric 20
vyos@vyos:~$ sudo ip r s
default via 10.31.76.249 dev eth0 proto ospf metric 20
0.0.0.1 dev wg0 proto static metric 20

Seems obvious, but a simple reminder… Are you committing and saving changes?