GRE Tunnel fragmentation needed despite fitting MTU size

Hello,

I’m running two routers that are connected via IPSec with a GRE tunnel, like in this documentation.
I had a problem where some connections from a device behind one router to the internet, routed through the other router, such as ping or (sometimes) HTTP, would work, but others such as HTTPS would fail.
I tracked this down to an MTU issue along the path: specifically, the Server Hello of the TLS negotiation would not reach the client on the way back.
I then looked at the tun0 interface with Wireshark on the router connected to the internet and could see ICMP messages like this (192.168.250.2 is the loopback address):

23	450.734457	192.168.250.2	192.168.250.2	ICMP	590	Destination unreachable (Fragmentation needed)

Even though the original packet did not have the DF bit set, the GRE packet does have it set.
The GRE packet is 1476 bytes long, which is the same as the MTU setting on the interface. The behaviour is no different whether this value is set by hand or the default is used.
In reality the biggest packet I could send was 1414 bytes:

dracotomes@client:~$ ping google.de -s 1385
PING google.de (142.251.209.131) 1385(1413) bytes of data.
76 bytes from ham11s07-in-f3.1e100.net (142.251.209.131): icmp_seq=1 ttl=116 (truncated)
76 bytes from ham11s07-in-f3.1e100.net (142.251.209.131): icmp_seq=2 ttl=116 (truncated)
^C
--- google.de ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 21.324/21.503/21.683/0.179 ms

dracotomes@client:~$ ping google.de -s 1386
PING google.de (142.251.209.131) 1386(1414) bytes of data.
76 bytes from ham11s07-in-f3.1e100.net (142.251.209.131): icmp_seq=1 ttl=116 (truncated)
76 bytes from ham11s07-in-f3.1e100.net (142.251.209.131): icmp_seq=2 ttl=116 (truncated)
^C
--- google.de ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 21.340/21.372/21.404/0.032 ms

dracotomes@client:~$ ping google.de -s 1387
PING google.de (142.251.209.131) 1387(1415) bytes of data.
^C
--- google.de ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5098ms

The version on both routers is:

Version:          VyOS 1.5-rolling-202310240118
Release train:    current

Built by:         autobuild@vyos.net
Built on:         Tue 24 Oct 2023 02:41 UTC
Build UUID:       66471dfc-8348-4dbf-953e-133854c539ff
Build commit ID:  142e8770563716

Architecture:     x86_64
Boot via:         installed image
System type:      KVM guest

Hardware vendor:  QEMU
Hardware model:   Standard PC (i440FX + PIIX, 1996)
Hardware S/N:
Hardware UUID:    df3bd66b-a30a-42e2-a824-57439af15abb

Copyright:        VyOS maintainers and contributors

I have set the MTU size on the tun0 interfaces to 8024 and everything seems to work right now.

I feel like this should work out of the box, or did I make a mistake in the configuration?

This issue seems to be related to the TCP MSS clamping configuration, which can be set in our CLI. Try the following:

VyOS 1.3
set firewall options interface tun0 adjust-mss '1369'

I do not seem to have this option in 1.5.

Yes, the syntax is different in 1.4/1.5:

set interfaces tunnel <interface> ip adjust-mss <mss | clamp-mss-to-pmtu>

https://docs.vyos.io/en/latest/configuration/interfaces/tunnel.html#cfgcmd-set-interfaces-tunnel-interface-ip-adjust-mss-mss-clamp-mss-to-pmtu

Playing around with the values did not yield any results for me. Still getting Fragmentation needed.

Each additional protocol adds overhead to the IP packet. If we use GRE as the transport, more overhead is added, and that is when these "fragmentation needed" messages show up. Protocols like ICMP can be fragmented, which is why you don’t see any issues with them. Here is an explanation of this behavior:

I’m aware of the GRE overhead. This is why the MTU size on the tun0 interface is set to 1476 by default.
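For reference, the 1476-byte default follows from simple encapsulation arithmetic. The sketch below assumes a 1500-byte underlying link, a 20-byte outer IPv4 header without options, and a basic 4-byte GRE header (no key, checksum or sequence number). It does not account for the additional IPSec overhead, which depends on the cipher and mode in use. The constant and function names are made up for illustration.

```python
# Assumed values: plain Ethernet MTU, outer IPv4 header without options,
# and a GRE header without the optional key/checksum/sequence fields.
LINK_MTU = 1500
OUTER_IPV4_HEADER = 20
GRE_BASE_HEADER = 4


def gre_tunnel_mtu(link_mtu: int = LINK_MTU) -> int:
    """Largest inner packet that fits into one GRE-over-IPv4 packet."""
    return link_mtu - OUTER_IPV4_HEADER - GRE_BASE_HEADER


print(gre_tunnel_mtu())  # 1476
```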
It seems that in my case the packet doesn’t get delivered even though the packet is the correct size:


This is the captured output of the loopback Interface on the internet side when running

curl https://google.de

from a device behind the other router.

Also, sometimes these curls will suddenly start working for a short time and then stop working again.

So, you should set TCP-MSS Clamping to 1436 on both tunnel ends.

Which I did but it did not improve the behaviour, as pointed out in my previous post.

Yes, I know, but you didn’t write what values you used.

You use IPSec, so the MSS will be much lower than 1436.
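As a rough sketch of what "much lower" could mean here, the ping results from the first post can be turned into an MSS value. This assumes that the 1386-byte payload limit measured there is imposed by the tunnel path, and that TCP runs over IPv4 with 20-byte IP and 20-byte TCP headers without options. The function names are hypothetical.

```python
ICMP_ECHO_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header
TCP_IPV4_OVERHEAD = 40   # 20-byte IPv4 header + 20-byte TCP header, no options


def packet_size_from_ping(payload: int) -> int:
    """IP packet size produced by `ping -s <payload>` over IPv4."""
    return payload + ICMP_ECHO_OVERHEAD


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP segment payload that fits into a packet of size `mtu`."""
    return mtu - TCP_IPV4_OVERHEAD


# In the first post, -s 1386 got replies and -s 1387 did not.
largest_working_payload = 1386
usable_mtu = packet_size_from_ping(largest_working_payload)
print(usable_mtu, mss_for_mtu(usable_mtu))  # 1414 1374
```

Under these assumptions, any clamp value above 1374 would still let oversized segments into the tunnel.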

https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000ClW3CAK
https://www.cisco.com/c/en/us/support/docs/ip/generic-routing-encapsulation-gre/25885-pmtud-ipfrag.html

Personally, I use GRE over IPSec and do not set any MTU or TCP-MSS anywhere. And everything works (on 1.3).

As a debugging step, try MTU 1280 or MSS 1240 for the tunnel at both ends (1280 bytes is the lowest allowable MTU for IPv6; the comparable figure for IPv4 is 576 bytes, the datagram size every host must be able to accept).

If that works, then it’s an MTU/MSS issue. If it still doesn’t work, then the error is elsewhere.

I’d use a ping sweep with variable size and DF set to determine the maximum MTU.
Take into account that the ping headers take 28 bytes (20 for IP, 8 for ICMP), so the maximum payload is 1472 for a 1500-byte MTU.
Then set the proper MTU on the GRE tunnel, and set the MSS clamp 40 lower than the MTU.
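The sweep described above could be automated along these lines. This is only a sketch: it assumes Linux iputils `ping` (where `-M do` sets the DF bit), assumes that if a given size passes then all smaller sizes pass too, and uses hypothetical helper names `ping_with_df` and `largest_payload`.

```python
import subprocess
from typing import Callable


def ping_with_df(host: str, payload: int) -> bool:
    """Send one echo request with DF set; True if a reply came back.

    Assumes Linux iputils ping: -M do forbids fragmentation,
    -c 1 sends one packet, -W 1 waits at most one second.
    """
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "1", "-W", "1", "-s", str(payload), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_payload(probe: Callable[[int], bool], low: int = 0, high: int = 1472) -> int:
    """Binary search for the largest payload size for which probe() succeeds.

    Returns -1 if even the smallest size fails.
    """
    if not probe(low):
        return -1
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid        # mid passes, look for something larger
        else:
            high = mid - 1   # mid fails, the limit is below it
    return low
```

Calling `largest_payload(lambda size: ping_with_df(host, size))` with `host` set to an address reached through the tunnel returns the largest payload that passes. Adding 28 gives the MTU, and subtracting 40 from that gives the MSS clamp value.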

MSS will not “fix” ICMP packets.

MSS is TCP specific.

Yes, but the ping sweep with the DF bit set will tell you where the limit is.

That is, a ping that succeeds through the tunnel at 1472 bytes means that, with headers, a 1500-byte packet can pass through.

Then, from the firewall’s point of view (unless it supports virtual reassembly), the only thing you can do is set a proper MTU and a proper MSS.