My policy route works in 1.3.0 but not in 1.3.2

I’m labbing a scenario and have run into what looks like a bug, but I’m not clear on how best to narrow it down or resolve it. I have a policy route that next-hops specific traffic over to our site-to-site VPN. The policy route works perfectly in 1.3.0 but doesn’t in 1.3.2. I’ve tried on a system that was running 1.3.0 and then updated to 1.3.2, and also on a fresh system installed with 1.3.2, and both fail.

Here’s all the players:
Test system: eth0: 10.6.0.10 untagged

L3 Switch1: eth0.2030 172.16.6.70/28 # used for internal links between Switch, Router and VPN
L3 Switch1: eth1: 10.6.0.1 vlan untagged # gateway for 10.6.0.0/24

VPN Server: eth0.2030 172.16.6.72/28

Router1: eth0.2010 4.4.4.2/24 # WAN
Router1: eth0.2030 172.16.6.65/28

set interfaces ethernet eth0 vif 2010 address '4.4.4.3/24'
set interfaces ethernet eth0 vif 2030 address '172.16.6.66/28'
set interfaces ethernet eth0 vif 2030 policy route 'VPN'
set policy route VPN rule 1004 destination address '10.3.0.0/16'
set policy route VPN rule 1004 set table '10'
set policy route VPN rule 1004 source address '10.6.0.0/16'
set protocols static route 0.0.0.0/0 next-hop 4.4.4.1
set protocols static route 10.6.0.0/16 next-hop 172.16.6.70
set protocols static table 10 route 0.0.0.0/0 next-hop 172.16.6.72

Testing is done via a simple ping from a container with IP 10.6.0.10 to 10.3.0.10. The ping fails, and 10.6.0.10 receives a “net unreachable” from my WAN simulator’s IP, 4.4.4.1.
Expected path: 10.6.0.10 > 10.6.0.1 (Switch) > default route to 172.16.6.66 (Router) > PBR next-hops to VPN at 172.16.6.72
The problem is that my ping from 10.6.0.10 is going out the Router’s default route to 4.4.4.1, which obviously doesn’t know what to do with it.
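
The forwarding decision this config is expected to produce can be sketched in a few lines of Python. This is only a model of the intended behaviour, not of how VyOS implements PBR internally; the function name `expected_next_hop` and the simplification to one default route per table are assumptions made for illustration.

```python
import ipaddress

# Policy route VPN, rule 1004 from the config above: (source, destination, table).
POLICY_RULES = [
    (ipaddress.ip_network("10.6.0.0/16"), ipaddress.ip_network("10.3.0.0/16"), "10"),
]

# Default route of each table, taken from the static route config above.
DEFAULT_NEXT_HOP = {
    "main": "4.4.4.1",    # WAN simulator
    "10": "172.16.6.72",  # VPN server
}

def expected_next_hop(src: str, dst: str) -> str:
    """Return the next hop the router is expected to pick for a forwarded packet."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for src_net, dst_net, table in POLICY_RULES:
        if s in src_net and d in dst_net:
            return DEFAULT_NEXT_HOP[table]
    # No policy rule matched: fall back to the main table's default route.
    return DEFAULT_NEXT_HOP["main"]

print(expected_next_hop("10.6.0.10", "10.3.0.10"))  # 172.16.6.72
print(expected_next_hop("10.6.0.10", "8.8.8.8"))    # 4.4.4.1
```

The failing ping behaves like the second case even though it should match the first, which is why the symptom points at the policy rule not being applied.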

Any help or insight would be appreciated!

Will it work if you delete destination address from the policy route?

Just tested removing the destination address from my policy route config and it had no effect.

Additional tests done, none of which worked:
Removed destination address from my policy route
Re-added destination address to my policy route
Added additional rules to match any local IP to local IP traffic
Added additional rule to match all ICMP traffic

Enabled logging on each and on the default drop and I never see the ping attempts show up in the logs. I’m stumped here.

This requires more testing.
As a workaround, could you remove the policy route from the vif 2030 interface
and use:

set policy local-route rule 100 set table '10'
set policy local-route rule 100 source '10.6.0.0/16'

Unfortunately I’ve also tried that, having the VPN policy route applied to that interface as well with no effect.

Something is wrong with your config or topology, such as the interface connections (maybe DNAT or something else).
policy local-route should always work, and you don’t need to attach any other policy.
All traffic with source 10.6.0.0/16 must go via table 10.
Try to dump traffic on the interfaces.

sudo ip rule show
sudo ip route show table 10

Assign a 10.6.x.x address to the router itself and ping using that source address.

My apologies, I misread your last comment. I did not try a local-route option before. I just did and that’s allowing traffic to pass through so that’s progress.
I don’t see an option to specify a destination on the policy local-route, however, so right now everything sourced from 10.6.X.X is going out that table, which next-hops to our VPN server. Is there another option to filter this?
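
The difference between a source-only selector and a source-plus-destination selector can be illustrated with a small Python sketch. It only models prefix matching; `rule_matches` is a hypothetical helper and not part of VyOS or iproute2.

```python
import ipaddress

def rule_matches(src, dst, rule_src, rule_dst=None):
    """True if the packet matches the rule; rule_dst=None means any destination."""
    if ipaddress.ip_address(src) not in ipaddress.ip_network(rule_src):
        return False
    return rule_dst is None or ipaddress.ip_address(dst) in ipaddress.ip_network(rule_dst)

# One packet that should use the VPN and one that should not.
packets = [("10.6.0.10", "10.3.0.10"), ("10.6.0.10", "8.8.8.8")]
for src, dst in packets:
    source_only = rule_matches(src, dst, "10.6.0.0/16")
    with_dest = rule_matches(src, dst, "10.6.0.0/16", "10.3.0.0/16")
    print(f"{src} -> {dst}: source-only={source_only}, with-destination={with_dest}")
```

With the source-only rule both packets are sent to table 10, which matches the behaviour described above; only a rule that also matches on destination would leave Internet-bound traffic on the main table.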

Match destination will be available in the next LTS release 1.3.3

Are you sure no sNAT is done in the middle?
While pinging from host/container 10.6.0.10 to 10.3.0.10, run tcpdump on the router to check that the router is receiving the ICMP packets with the proper source address.

sudo tcpdump -i eth0.2030 icmp

Sorry for the hiatus, and also for not mentioning or including info on sNAT. I do have sNAT configured and my traffic is hitting it. I tried exempting that traffic and I see the rule hit, but the traffic still doesn’t pass; it’s still being sent out eth0.2010, just un-NAT’d as 10.6.0.10, so it’s still ignoring my PBR.

set nat source rule 900 destination address '10.3.0.0/16'
set nat source rule 900 exclude
set nat source rule 900 log 'enable'
set nat source rule 900 outbound-interface 'any'
set nat source rule 900 source address '10.6.0.0/16'
set nat source rule 999 log 'enable'
set nat source rule 999 outbound-interface 'eth0.2010'
set nat source rule 999 source address '10.6.0.0/16'
set nat source rule 999 translation address '4.4.4.3'

Logs: first line is before adding the rule 900 to exempt, and 2nd line is after adding the exemption.


kernel: [4161512.180401] [NAT-SRC-999] IN= OUT=eth0.2010 SRC=10.6.0.10 DST=10.3.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=56441 DF PROTO=ICMP TYPE=8 CODE=0 ID=134 SEQ=1

kernel: [4161759.009979] [NAT-SRC-900-EXCL] IN= OUT=eth0.2010 SRC=10.6.0.10 DST=10.3.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=26815 DF PROTO=ICMP TYPE=8 CODE=0 ID=139 SEQ=1
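
To compare the two log lines field by field, a short Python sketch like the following can pull out the rule tag and the interface and address fields. The helper name `parse_nat_log` is made up for this example; the log lines are copied from above.

```python
import re

LOG_LINES = [
    "kernel: [4161512.180401] [NAT-SRC-999] IN= OUT=eth0.2010 SRC=10.6.0.10 DST=10.3.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=56441 DF PROTO=ICMP TYPE=8 CODE=0 ID=134 SEQ=1",
    "kernel: [4161759.009979] [NAT-SRC-900-EXCL] IN= OUT=eth0.2010 SRC=10.6.0.10 DST=10.3.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=26815 DF PROTO=ICMP TYPE=8 CODE=0 ID=139 SEQ=1",
]

def parse_nat_log(line):
    """Extract the NAT rule tag and the OUT/SRC/DST fields from one log line."""
    tag = re.search(r"\[(NAT-[^\]]+)\]", line).group(1)
    fields = dict(re.findall(r"\b(OUT|SRC|DST)=(\S*)", line))
    return {"rule": tag, **fields}

for line in LOG_LINES:
    print(parse_nat_log(line))
```

Both entries report OUT=eth0.2010. Source NAT is evaluated after the routing decision, so the WAN interface had already been chosen by the time either rule was hit; the exclude rule changes only whether the packet is translated, not where it is sent.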

Bump.
Any advice out there for how I can resolve this?

Can you send the output of show ip route 10.3.0.10?

Here you go

Routing entry for 0.0.0.0/0
  Known via "static", distance 1, metric 0, best
  Last update 02w0d09h ago
  * 4.4.4.1, via eth0.2010, weight 1

Also ran sh ip route and that output is below:

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup

S>* 0.0.0.0/0 [1/0] via 4.4.4.1, eth0.2010, weight 1, 02w0d09h
C * 4.4.4.0/24 is directly connected, eth0.2010v20, 08w6d16h
C>* 4.4.4.0/24 is directly connected, eth0.2010, 08w6d16h
S>* 10.6.0.0/16 [1/0] via 172.16.6.70, eth0.2030, weight 1, 08w6d16h
C * 172.16.6.64/28 is directly connected, eth0.2030v22, 08w6d16h
C>* 172.16.6.64/28 is directly connected, eth0.2030, 08w6d16h
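
The reason `show ip route 10.3.0.10` resolves to the default route can be reproduced with a longest-prefix-match sketch in Python over the selected routes listed above. This models the main table only; `longest_prefix_match` is an illustrative helper, not a VyOS or FRR function.

```python
import ipaddress

# Selected (">") routes from the output above: (prefix, how it is reached).
MAIN_TABLE = [
    ("0.0.0.0/0", "via 4.4.4.1, eth0.2010"),
    ("4.4.4.0/24", "connected, eth0.2010"),
    ("10.6.0.0/16", "via 172.16.6.70, eth0.2030"),
    ("172.16.6.64/28", "connected, eth0.2030"),
]

def longest_prefix_match(dst, table):
    """Return (network, description) of the most specific route covering dst."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(prefix), desc)
        for prefix, desc in table
        if addr in ipaddress.ip_network(prefix)
    ]
    return max(candidates, key=lambda c: c[0].prefixlen)

net, desc = longest_prefix_match("10.3.0.10", MAIN_TABLE)
print(net, desc)  # 0.0.0.0/0 via 4.4.4.1, eth0.2010
```

Nothing in the main table is more specific than 0.0.0.0/0 for 10.3.0.10, so unless the policy rule steers the packet to table 10 before this lookup, it leaves via 4.4.4.1.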

Seems you also have VRRP configuration with RFC compatibility:

C * 172.16.6.64/28 is directly connected, eth0.2030v22, 08w6d16h

Are you using the virtual IP assigned to that interface for routing? If possible, try removing the rfc-compatibility option in VRRP. There are known issues with VRRP + RFC-Compatibility + Firewall/PBR.

Correct. I just removed rfc compatibility and retested, and I’m still not seeing traffic pass through.

Another note, when removing the rfc compatibility, routing breaks entirely. Before removing I could successfully ping from 10.6.0.10 to 4.4.4.1 but once removed, pings fail.
Do you have any details on the known issue so I can get eyes on it and see if there’s a workaround?

Please share your full configuration so we can see the whole picture, check what is going on, and determine whether there is a bug or not.

I omitted firewall groups/rules and obscured (XX) any public IPs. Please let me know if you need anything else.

set firewall all-ping 'enable'
set firewall broadcast-ping 'disable'
set firewall config-trap 'disable'
set firewall ipv6-receive-redirects 'disable'
set firewall ipv6-src-route 'disable'
set firewall ip-src-route 'disable'
set firewall log-martians 'enable'
set firewall receive-redirects 'disable'
set firewall send-redirects 'enable'
set firewall source-validation 'disable'
set firewall syn-cookies 'enable'
set firewall twa-hazards-protection 'disable'
set high-availability vrrp group FW2SWVRRP hello-source-address '172.16.6.66'
set high-availability vrrp group FW2SWVRRP interface 'eth0.2030'
set high-availability vrrp group FW2SWVRRP peer-address '172.16.6.67'
set high-availability vrrp group FW2SWVRRP priority '130'
set high-availability vrrp group FW2SWVRRP rfc3768-compatibility
set high-availability vrrp group FW2SWVRRP virtual-address 172.16.6.65/28
set high-availability vrrp group FW2SWVRRP vrid '22'
set high-availability vrrp group FWINTVRRP hello-source-address '4.4.4.3'
set high-availability vrrp group FWINTVRRP interface 'eth0.2010'
set high-availability vrrp group FWINTVRRP peer-address '4.4.4.4'
set high-availability vrrp group FWINTVRRP priority '130'
set high-availability vrrp group FWINTVRRP rfc3768-compatibility
set high-availability vrrp group FWINTVRRP virtual-address 4.4.4.2/24
set high-availability vrrp group FWINTVRRP vrid '20'
set high-availability vrrp group FWNEWPUBVRRP hello-source-address 'XX.XX.98.251'
set high-availability vrrp group FWNEWPUBVRRP interface 'eth0.1000'
set high-availability vrrp group FWNEWPUBVRRP peer-address 'XX.XX.98.252'
set high-availability vrrp group FWNEWPUBVRRP priority '130'
set high-availability vrrp group FWNEWPUBVRRP rfc3768-compatibility
set high-availability vrrp group FWNEWPUBVRRP virtual-address XX.XX.98.1/24
set high-availability vrrp group FWNEWPUBVRRP vrid '23'
set high-availability vrrp sync-group syncgrp member 'FW2SWVRRP'
set high-availability vrrp sync-group syncgrp member 'FWINTVRRP'
set high-availability vrrp sync-group syncgrp member 'FWNEWPUBVRRP'
set interfaces ethernet eth0 vif 1000 address 'xx.xx.98.251/24'
set interfaces ethernet eth0 vif 1000 ip proxy-arp-pvlan
set interfaces ethernet eth0 vif 1000 policy route 'VPN'
set interfaces ethernet eth0 vif 2010 address '4.4.4.3/24'
set interfaces ethernet eth0 vif 2010 firewall in name 'WAN_IN'
set interfaces ethernet eth0 vif 2010 firewall local name 'VyOS_MANAGEMENT'
set interfaces ethernet eth0 vif 2010 policy route 'VPN'
set interfaces ethernet eth0 vif 2030 address '172.16.6.66/28'
set interfaces ethernet eth0 vif 2030 policy route 'VPN'
set interfaces loopback lo
set nat destination rule 100 destination address 'xx.xx.98.24'
set nat destination rule 100 destination port '53'
set nat destination rule 100 inbound-interface 'any'
set nat destination rule 100 protocol 'tcp_udp'
set nat destination rule 100 translation address '10.6.xx.xx'
set nat destination rule 100 translation port '54'
set nat destination rule 110 destination address 'xx.xx.98.70'
set nat destination rule 110 inbound-interface 'any'
set nat destination rule 110 translation address '10.6.xx.xx'
set nat destination rule 115 destination address 'xx.xx.98.71'
set nat destination rule 115 inbound-interface 'any'
set nat destination rule 115 translation address '10.6.xx.xx'
set nat destination rule 120 destination address 'xx.xx.98.72'
set nat destination rule 120 inbound-interface 'any'
set nat destination rule 120 translation address '10.6.xx.xx'
set nat destination rule 130 destination address 'xx.xx.98.100'
set nat destination rule 130 destination port '17323'
set nat destination rule 130 inbound-interface 'any'
set nat destination rule 130 protocol 'udp'
set nat destination rule 130 translation address '10.6.xx.xx'
set nat destination rule 140 destination address 'xx.xx.98.80'
set nat destination rule 140 inbound-interface 'any'
set nat destination rule 140 translation address '10.6.xx.xx'
set nat destination rule 150 destination address 'xx.xx.98.100'
set nat destination rule 150 destination port '8080,443,8443,8843,6789'
set nat destination rule 150 inbound-interface 'any'
set nat destination rule 150 protocol 'tcp'
set nat destination rule 150 translation address '10.6.xx.xx'
set nat destination rule 160 destination address 'xx.xx.98.100'
set nat destination rule 160 destination port '11195'
set nat destination rule 160 inbound-interface 'any'
set nat destination rule 160 protocol 'udp'
set nat destination rule 160 translation address '10.6.xx.xx'
set nat source rule 100 outbound-interface 'any'
set nat source rule 100 source address '10.6.xx.xx/24'
set nat source rule 100 translation address 'xx.xx.98.24'
set nat source rule 110 destination address '!10.0.0.0/8'
set nat source rule 110 outbound-interface 'any'
set nat source rule 110 source address '10.6.xx.xx'
set nat source rule 110 translation address 'xx.xx.98.70'
set nat source rule 115 destination address '!10.0.0.0/8'
set nat source rule 115 outbound-interface 'any'
set nat source rule 115 source address '10.6.xx.xx'
set nat source rule 115 translation address 'xx.xx.98.71'
set nat source rule 120 destination address '!10.0.0.0/8'
set nat source rule 120 outbound-interface 'any'
set nat source rule 120 source address '10.6.xx.xx'
set nat source rule 120 translation address 'xx.xx.98.72'
set nat source rule 130 outbound-interface 'any'
set nat source rule 130 source address '10.6.xx.xx'
set nat source rule 130 translation address 'xx.xx.98.80'
set nat source rule 140 outbound-interface 'any'
set nat source rule 140 source address '10.6.xx.xx'
set nat source rule 140 translation address 'xx.xx.98.100'
set nat source rule 150 outbound-interface 'any'
set nat source rule 150 source address '10.6.xx.xx'
set nat source rule 150 translation address 'xx.xx.98.100'
set nat source rule 900 destination address '10.3.0.0/16'
set nat source rule 900 exclude
set nat source rule 900 outbound-interface 'any'
set nat source rule 900 source address '10.6.0.0/16'
set nat source rule 999 outbound-interface 'eth0.2010'
set nat source rule 999 source address '10.6.0.0/16'
set nat source rule 999 translation address 'xx.xx.98.1'
set policy route VPN rule 600 destination address 'xx.xx.103.221'
set policy route VPN rule 600 set table 'main'
set policy route VPN rule 600 source address 'xx.xx.98.221'
set policy route VPN rule 605 destination address 'xx.xx.103.222'
set policy route VPN rule 605 set table 'main'
set policy route VPN rule 605 source address 'xx.xx.98.222'
set policy route VPN rule 610 destination address 'xx.xx.99.221'
set policy route VPN rule 610 set table 'main'
set policy route VPN rule 610 source address 'xx.xx.98.221'
set policy route VPN rule 615 destination address 'xx.xx.99.222'
set policy route VPN rule 615 set table 'main'
set policy route VPN rule 615 source address 'xx.xx.98.222'
set policy route VPN rule 620 destination address 'xx.xx.100.221'
set policy route VPN rule 620 set table 'main'
set policy route VPN rule 620 source address 'xx.xx.98.221'
set policy route VPN rule 625 destination address 'xx.xx.100.222'
set policy route VPN rule 625 set table 'main'
set policy route VPN rule 625 source address 'xx.xx.98.222'
set policy route VPN rule 1000 destination address 'xx.xx.99.0/24'
set policy route VPN rule 1000 set table '10'
set policy route VPN rule 1000 source address '10.6.0.0/16'
set policy route VPN rule 1001 destination address '10.4.0.0/16'
set policy route VPN rule 1001 set table '10'
set policy route VPN rule 1001 source address '10.6.0.0/16'
set policy route VPN rule 1004 destination address '10.3.0.0/16'
set policy route VPN rule 1004 set table '10'
set policy route VPN rule 1004 source address '10.6.0.0/16'
set policy route VPN rule 1005 destination address 'xx.xx.100.0/24'
set policy route VPN rule 1005 set table '10'
set policy route VPN rule 1005 source address '10.6.0.0/16'
set policy route VPN rule 1010 destination address 'xx.xx.102.0/23'
set policy route VPN rule 1010 set table '10'
set policy route VPN rule 1010 source address '10.6.0.0/16'
set policy route VPN rule 1011 destination address '10.5.0.0/16'
set policy route VPN rule 1011 set table '10'
set policy route VPN rule 1011 source address '10.6.0.0/16'
set policy route VPN rule 1103 destination address '10.4.0.0/16'
set policy route VPN rule 1103 set table '10'
set policy route VPN rule 1103 source address 'xx.xx.98.0/24'
set policy route VPN rule 1106 destination address '10.3.0.0/16'
set policy route VPN rule 1106 set table '10'
set policy route VPN rule 1106 source address 'xx.xx.98.0/24'
set policy route VPN rule 1113 destination address '10.5.0.0/16'
set policy route VPN rule 1113 set table '10'
set policy route VPN rule 1113 source address 'xx.xx.98.0/24'
set policy route VPN rule 1120 destination address 'xx.xx.102.0/23'
set policy route VPN rule 1120 set table '10'
set policy route VPN rule 1120 source address 'xx.xx.98.0/24'
set policy route VPN rule 1125 destination address 'xx.xx.99.0/24'
set policy route VPN rule 1125 set table '10'
set policy route VPN rule 1125 source address 'xx.xx.98.0/24'
set policy route VPN rule 1130 destination address 'xx.xx.100.0/24'
set policy route VPN rule 1130 set table '10'
set policy route VPN rule 1130 source address 'xx.xx.98.0/24'
set protocols static route 0.0.0.0/0 next-hop 4.4.4.1
set protocols static route 10.6.0.0/16 next-hop 172.16.6.70
set protocols static table 10 route 0.0.0.0/0 next-hop 172.16.6.72
set service conntrack-sync accept-protocol 'tcp'
set service conntrack-sync accept-protocol 'udp'
set service conntrack-sync accept-protocol 'icmp'
set service conntrack-sync event-listen-queue-size '8'
set service conntrack-sync failover-mechanism vrrp sync-group 'syncgrp'
set service conntrack-sync interface eth0.2030 peer '172.16.6.67'
set service conntrack-sync sync-queue-size '8'
set service snmp community public authorization 'ro'
set service snmp community public client 'xx.xx.xx.xx'
set service snmp community public client 'xx.xx.xx.xx'
set service snmp contact 'xxx@xxx.xxx'
set service snmp location 'xxx, xx'
set system config-management commit-revisions '100'
set system name-server '8.8.8.8'
set system ntp server time1.vyos.net
set system ntp server time2.vyos.net
set system ntp server time3.vyos.net
set system syslog global facility all level 'info'
set system syslog global facility protocols level 'debug'

I did not replicate the entire config, but with everything I have replicated so far, I have connectivity from host 10.6.0.10 to 10.3.0.10…
With the NAT and policy rules not clear (several IPs are omitted), it’s difficult to predict whether something is going on there.

To check whether the policy route is OK, you can verify it with the following commands:

sudo ip rule
sudo nft list table ip mangle

Check the counters on the VPN table and the related ones which are generated by the PBR commands.
Some tcpdump captures on the involved interfaces will also help in troubleshooting.