Forwarding / PMTUD / fragmentation issue

Hi, I have a networking issue that’s driving me crazier than usual. Thanks in advance for any help anyone can give me.

I have a VPS running a recent VyOS 1.4 rolling release at a well-known cloud provider. This VPS has a virtio network interface (eth0), and I have VyOS installed and configured to forward accepted traffic from the internet over a wireguard tunnel to another server at a different location. I do not have access to the host on which my VPS runs. For most traffic things work fine, but certain TCP connections to my downstream servers hang.

After taking packet captures of traffic that works and traffic that doesn’t, this is what appears to be happening:

  • Packets coming into eth0 and reaching VyOS’ network stack often seem to be very (impossibly) large. Captures taken on the server show many packets of 10k or more, which is obviously larger than anything that could have come over the wire (and much larger than the 1500 MTU on the interface). Perhaps this is because of some sort of packet receive offloading that the virtio driver may be doing?

  • Nevertheless, things generally seem to work. What I assume is happening is that VyOS is fragmenting these weird large packets to fit the MTU of the wireguard interface before forwarding. Packet captures on the server on the other side of the wireguard tunnel (a different VyOS) show that it is receiving normal-sized packets that fit into the MTU of the tunnel.

  • However, there are some of these large packets / connections where things don’t work, and the server on the other side of the wireguard tunnel never receives the data from the large packets. I believe the problem happens only for packets which have the “Don’t Fragment” bit set.

  • In packet captures I notice that, for the packets which don’t get forwarded properly, VyOS sends a “Destination Unreachable / Fragmentation Needed” ICMP message to the source system. However, this ICMP message references the received packets with the impossibly huge size. In essence, VyOS seems to be telling the source system that it sent an impossibly huge packet and should fragment. But of course the source system could not really have sent the referenced packet, so I’d guess it is simply ignoring these ICMP messages. I do notice that the data is being retransmitted in smaller packets (1500 bytes) matching the MTU of the external interface, but the retransmitted packets are still bigger than the next-hop MTU specified in the ICMP message (1420 bytes), so they still fail.

  • Eventually, after many retransmits by the source system and many more ICMP messages from VyOS telling it to fragment, the source system gives up.
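
The sequence in the list above can be sketched as a simplified model of a router's forwarding decision. This only illustrates the behavior described here and is not how the Linux kernel actually implements forwarding; the `Packet` class, the `forward` function and the packet sizes are made up for the example, and 1420 is the next-hop MTU from the ICMP message.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size: int            # total IP packet length in bytes
    dont_fragment: bool  # state of the DF bit in the IP header


def forward(packet: Packet, next_hop_mtu: int) -> str:
    """Simplified model: what happens to a packet that has to leave
    through an interface with the given MTU."""
    if packet.size <= next_hop_mtu:
        return "forwarded"
    if packet.dont_fragment:
        # ICMP type 3 (Destination Unreachable), code 4 (Fragmentation Needed)
        return f"dropped, ICMP frag-needed (next-hop MTU {next_hop_mtu})"
    return "fragmented and forwarded"


WG_MTU = 1420  # next-hop MTU reported in the ICMP message

print(forward(Packet(10_000, dont_fragment=False), WG_MTU))  # oversized, DF clear
print(forward(Packet(10_000, dont_fragment=True), WG_MTU))   # oversized, DF set
print(forward(Packet(1_500, dont_fragment=True), WG_MTU))    # 1500-byte retransmit
print(forward(Packet(1_400, dont_fragment=True), WG_MTU))    # fits the tunnel
```

In this model only the last case gets through with DF set, which matches the captures: the 1500-byte retransmits are still over the 1420-byte limit.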

The first thing I don’t understand is why my VyOS server appears to be receiving these impossibly large packets. I thought it could be because of some sort of receive offloading, but according to VyOS and ethtool no offloading features at all were enabled for the interface. (I tried enabling these features and it made no difference: the behavior was exactly the same.)
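
For anyone wanting to repeat the offload check, here is a minimal sketch that lists the features `ethtool -k` reports as "on". The helper name `enabled_features` is hypothetical, and the sample text is a made-up illustration in the format `ethtool -k` prints, not output from my VPS.

```python
def enabled_features(ethtool_output: str) -> set[str]:
    """Return the feature names reported as 'on' in `ethtool -k` output."""
    enabled = set()
    for line in ethtool_output.splitlines():
        name, sep, state = line.partition(":")
        if not sep:
            continue  # not a "name: state" line
        fields = state.split()
        # The state looks like "on", "off", or "off [fixed]";
        # the "Features for eth0:" header has no state and is skipped.
        if fields and fields[0] == "on":
            enabled.add(name.strip())
    return enabled


# Illustrative sample only (assumed values, not a real capture).
SAMPLE = """Features for eth0:
rx-checksumming: off [fixed]
tcp-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off [fixed]
"""

print(enabled_features(SAMPLE))
```

On a real system the input would be the output of `ethtool -k eth0`; an empty result corresponds to what I see, with nothing reported as enabled.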

Judging from the logs on my downstream servers I’ve been having this problem for a long time… Since Sept 12. Coincidentally (or not?) that is 4 or 5 days after I switched from OPNsense to VyOS (both running on the same VPS).

This screenshot shows VyOS’ Destination Unreachable ICMP response referring to one of the problematic large packets:

I suppose a workaround might be to forcibly clear the “Don’t Fragment” flag on all received packets. Is that possible to do on VyOS? (But I still want to know exactly why I seem to be receiving these impossibly large packets in the first place…)

Anyone have any ideas why this is happening or what else I can try?

I’d start by setting MSS clamping on the wireguard interface. Even if you do still get such a “jumbo” packet, subsequent packets should then fit the wg tunnel.
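
As a rough guide to the clamp value, here is a small sketch of the arithmetic, assuming the 1420-byte tunnel MTU from the ICMP message and TCP/IP headers without options. The function name `clamp_mss` is made up for the example. On VyOS 1.4 I believe the setting lives under the interface as `ip adjust-mss`, but treat that as an assumption and check the documentation for your build.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header
TCP_HEADER = 20   # bytes, TCP header without options


def clamp_mss(tunnel_mtu: int, ipv6: bool = False) -> int:
    """Largest TCP payload per segment that still fits the tunnel MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return tunnel_mtu - ip_header - TCP_HEADER


print(clamp_mss(1420))             # 1380 for IPv4
print(clamp_mss(1420, ipv6=True))  # 1360 for IPv6
```

With the MSS clamped, both ends agree on segments small enough for the tunnel during the TCP handshake, so the connection no longer depends on the ICMP messages being honored.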