WAN Load Balancing leaking packets without NAT masquerade

So this is a weird one. Some cellular carriers enforce source IP violation rules (see bottom of page 5 for discussion), which means that if any packet makes it onto their network with a source IP other than that assigned to the modem, they disconnect the modem.

I’m having an issue with WAN load balancing where it appears that a packet is occasionally leaked onto the wwan0 interface without NAT. The weird part is that NAT seems to be working fine for packets preceding the leaked packet. Here’s a screenshot of the pcap from the wwan0 interface (.143 is the IP assigned to the WWAN interface):

The client is 10.254.254.11, and you can see that prior to the leaked packet, the client traffic was being NAT’ed. As soon as the leaked packet hits the interface, the cellular provider drops the connection, and the modem needs to be rebooted to come back online.

To try to mitigate this, I added a redundant NAT masquerade rule (WAN load balancing should already do the masquerade):

 nat {
     source {
         rule 200 {
             outbound-interface wwan0
             source {
                 address 10.254.254.0/24
             }
             translation {
                 address masquerade
             }
         }
     }
 }

This did not help. I also tried adding an outbound firewall ruleset that only accepts traffic sourced from the interface IP:

 firewall {
     name wwan-out {
         rule 1 {
             action accept
             source {
                 address 167.XXX.XXX.143/32
             }
         }
         rule 2 {
             action drop
         }
     }
 }

When this is active, no client-outbound traffic works at all.

I have confirmed that forwarded client traffic is the trigger: when I leave the client PC off and generate traffic to and from the internet over wwan0 on VyOS itself, there is no issue.

Any ideas?

This is documented here:

https://docs.vyos.io/en/crux/configuration/nat/index.html

To quote it:

Linux netfilter will not NAT traffic marked as INVALID. This often confuses people into thinking that Linux (or specifically VyOS) has a broken NAT implementation because non-NATed traffic is seen leaving an external interface. This is actually working as intended, and a packet capture of the “leaky” traffic should reveal that the traffic is either an additional TCP “RST”, “FIN,ACK”, or “RST,ACK” sent by client systems after Linux netfilter considers the connection closed. The most common is the additional TCP RST some host implementations send after terminating a connection (which is implementation-specific).

In other words, connection tracking has already observed the connection being closed and has transitioned the flow to INVALID to prevent attacks from attempting to reuse the connection.

You can avoid the “leaky” behavior by using a firewall policy that drops “invalid” state packets.

Having control over the matching of INVALID state traffic, e.g. the ability to selectively log, is an important troubleshooting tool for observing broken protocol behavior. For this reason, VyOS does not globally drop invalid state traffic, instead allowing the operator to make the determination on how the traffic is handled.
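
As a minimal sketch of such a policy, assuming the VyOS 1.2 (crux) firewall syntax and reusing the wwan-out ruleset name from the original post. The rule number and description are arbitrary, and default-action is set to accept explicitly because a ruleset without one drops by default. The ruleset still has to be applied in the out direction on wwan0, and the exact interface path for that depends on the VyOS version:

 firewall {
     name wwan-out {
         default-action accept
         rule 10 {
             action drop
             description "Sketch: drop INVALID state packets before they leave"
             state {
                 invalid enable
             }
         }
     }
 }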


Thanks for this! Oddly enough, it seems it may not have been the problem. I added a firewall rule to drop invalid packets and it made no difference. Out of frustration, I removed the WAN load balancing configuration altogether and this is still happening. I cannot figure it out! I will make a new thread.

In any case, I appreciate the response, as this is good information for preventing cellular source IP violations.

You’re dropping invalid packets on the way out, right, not on the way in?
Are you able to capture the details of the packets that are not being NAT’ed?

Yes! But I think the problem is unrelated, as it happens even without invalid packets. I will make a new thread - sorry for the confusion.

All good. It would pay to show the flow of the packets, including the packet that’s not being NAT’ed, and also the version of VyOS that you’re using.
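
Something along these lines should gather both, as a rough sketch that assumes a VyOS release where the op-mode monitor traffic command accepts a pcap filter expression. The filter uses the client subnet from the original post, so only packets that left wwan0 without translation should match:

 show version
 monitor traffic interface wwan0 filter "src net 10.254.254.0/24"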

Cheers!
