I think I’ve narrowed this down to an issue with container networking. I’ve got this container config:
set container name nginx image 'docker.io/library/nginx:mainline-alpine'
set container name nginx network nginx address '172.17.1.2'
set container name nginx restart 'always'
set container network nginx prefix '172.17.1.0/24'
If I remove the container config, PBR works correctly. That is, the policy-routed traffic is no longer NATed.
Here’s the nftables ip nat table:
table ip nat {
    chain VYOS_PRE_SNAT_HOOK {
        type nat hook postrouting priority srcnat - 1; policy accept;
        return
    }
    chain NETAVARK-5BD504A99B1D3 {
        ip daddr 172.17.1.0/24 counter packets 0 bytes 0 accept
        ip daddr != 224.0.0.0/4 counter packets 0 bytes 0 masquerade
    }
    chain POSTROUTING {
        type nat hook postrouting priority srcnat; policy accept;
        counter packets 161 bytes 11326 jump NETAVARK-HOSTPORT-MASQ
        ip saddr 172.17.1.0/24 counter packets 1 bytes 64 jump NETAVARK-5BD504A99B1D3
    }
    chain NETAVARK-HOSTPORT-SETMARK {
        counter packets 0 bytes 0 meta mark set mark or 0x2000
    }
    chain NETAVARK-HOSTPORT-MASQ {
        meta mark & 0x00002000 == 0x00002000 counter packets 31 bytes 2141 masquerade
    }
    chain NETAVARK-HOSTPORT-DNAT {
    }
    chain PREROUTING {
        type nat hook prerouting priority dstnat; policy accept;
        fib daddr type local counter packets 87 bytes 5156 jump NETAVARK-HOSTPORT-DNAT
    }
    chain OUTPUT {
        type nat hook output priority dstnat; policy accept;
        fib daddr type local counter packets 0 bytes 0 jump NETAVARK-HOSTPORT-DNAT
    }
}
See how the NETAVARK-HOSTPORT-MASQ chain is turning on masquerade if mark bit 0x2000 is set? And from above, PBR is using 0x7fffff01, which includes that bit. And every time I start a new TCP connection that’s routed by PBR, the packet counter for that masquerade rule increments by 1.
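To double-check the bit overlap, here is a quick sketch in Python that mimics the match in the NETAVARK-HOSTPORT-MASQ rule. The two constants are the values quoted in this post; the function name is just something I made up for illustration, and this only models the bitwise test, not how nftables actually evaluates the rule.

```python
# Values taken from this post.
PBR_MARK = 0x7FFFFF01       # mark PBR is using
HOSTPORT_MASQ_BIT = 0x2000  # bit tested by NETAVARK-HOSTPORT-MASQ


def matches_masq_rule(mark: int) -> bool:
    """Mimic `meta mark & 0x00002000 == 0x00002000` (hypothetical helper)."""
    return (mark & HOSTPORT_MASQ_BIT) == HOSTPORT_MASQ_BIT


if __name__ == "__main__":
    # The PBR mark has bit 0x2000 set, so it would hit the masquerade rule.
    print(hex(PBR_MARK), matches_masq_rule(PBR_MARK))
    # A mark without that bit (for example 0x1) would not match.
    print(hex(0x1), matches_masq_rule(0x1))
```

So any PBR mark that leaves bit 0x2000 clear should, at least by this test, skip that masquerade rule.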