VyOS discard TTL Exceeded

Dear Community,

I am not sure whether this is a bug or normal behavior, but I noticed that VyOS (1.3) does not always reply with “time to live exceeded”. This shows up as huge packet loss at the first hop in MTRs, which sometimes confuses other people and has led to a lot of questions about performance issues and packet loss.

Here is what I see:

 Host                                                             Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 10.10.1.3 (10.10.1.3)                                         90.2%   296    0.1   0.1   0.1   0.2   0.0
 2. ash-b1-link.telia.net (62.115.191.177)                         0.0%   296    0.9   0.7   0.6   2.9   0.3
 3. ash-bb2-link.ip.twelve99.net (62.115.143.120)                  0.0%   296    0.8   0.9   0.8   3.4   0.2
 4. ash-b2-link.ip.twelve99.net (62.115.123.125)                   0.0%   296    1.2   3.3   1.0  59.7   8.8
 5. google-ic373139-ash-b2.ip.twelve99-cust.net (62.115.145.225)   1.4%   296    3.3   3.8   2.1 220.7  12.9
 6. 209.85.250.193 (209.85.250.193)                                0.0%   295    0.6   1.1   0.6 132.2   7.8
 7. 209.85.246.81 (209.85.246.81)                                  0.0%   295    1.6   3.5   1.1 177.2  14.2
 8. dns.google (8.8.8.8)                                           0.0%   295    0.6   2.4   0.5 272.7  17.6

At the same time, I can ping the gateway just fine:
--- 10.10.1.3 ping statistics ---
332 packets transmitted, 332 received, 0% packet loss, time 338947ms
rtt min/avg/max/mdev = 0.050/0.101/0.147/0.017 ms

I don’t have any firewall config on the VyOS at all:

vyos1:~$ show firewall

-----------------------------
Rulesets Information
-----------------------------

acops@iad1-p1-rtr02:~$ show firewall statistics

-----------------------------
Rulesets Information
-----------------------------

vyos1:~$ show firewall summary

------------------------
Firewall Global Settings
------------------------

------------------------
Firewall Rulesets
------------------------

------------------------
Firewall Groups
------------------------

    bonding bond1 {
        hash-policy layer2
        lacp-rate fast
        member {
            interface eth4
            interface eth5
        }
        mode 802.3ad
        mtu 9216
        vif 100 {
            address 10.10.1.3/23
        }
    }

Am I missing anything?

" * " means the probe timed out
Check the router resources
Check dmesg “sudo dmesg -T”

Those were added while I was pasting the traceroute details and trying to mark the first line as bold.
There are no stars in the traceroute at all.
I edited my first post to avoid confusion.

dmesg shows nothing relevant, only martian source messages:

[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00
[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00
[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00
[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00
[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00
[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00
[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00
[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00
[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00
[Fri Feb 10 10:32:36 2023] IPv4: martian source 0.0.0.255 from 127.10.10.23, on dev bond0.3500
[Fri Feb 10 10:32:36 2023] ll header: 00000000: ff ff ff ff ff ff 00 e0 0c 02 00 fd 08 00

VyOS is running under a very light load, practically idle.

Here is another MTR that I ran a minute ago:

 Host                       Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 10.10.1.3               87.6%   210    0.1   0.2   0.1   0.2   0.0
 2. 15169-dc10.equinix.com   0.0%   210    0.4   1.6   0.3 122.1   9.3
 3. 108.170.246.1            0.0%   210    0.5   0.5   0.4   0.8   0.1
 4. 142.251.70.85            0.0%   210    0.5   0.5   0.4   0.6   0.1
 5. dns.google               0.0%   210    0.4   0.4   0.3   0.5   0.0

Does anybody have any idea why this is happening and how it could be solved? I even ran some pcaps that unfortunately I can’t share, but I clearly see that it simply doesn’t reply to all the packets with “ICMP ttl exceeded”…

I believe Linux rate limits TTL Exceeded by default: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

Search for “Time Exceeded *”

I figure with a bit more research you can figure out how to disable/turn off that rate limit using sysctl tuning.
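
To see what the router is currently set to before changing anything, something like the following Python sketch should work. It reads the two ICMP rate-limit sysctls straight from /proc. The helper name read_icmp_sysctl is made up for illustration, and the path assumes the standard Linux /proc/sys layout; if a file cannot be read, the helper returns None instead of raising.

```python
from pathlib import Path


def read_icmp_sysctl(name, base="/proc/sys/net/ipv4"):
    """Return the integer value of net.ipv4.<name>, or None if it cannot be read.

    `base` is an assumption: the usual location of IPv4 sysctls on Linux.
    """
    path = Path(base) / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected content.
        return None


if __name__ == "__main__":
    # icmp_ratelimit: rate limit applied to the ICMP types selected by the mask
    # icmp_ratemask: bitmask of ICMP types that are subject to the limit
    for key in ("icmp_ratelimit", "icmp_ratemask"):
        print(f"net.ipv4.{key} = {read_icmp_sysctl(key)}")
```

Running `sysctl net.ipv4.icmp_ratelimit net.ipv4.icmp_ratemask` on the router gives the same information without Python.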

@tjh thank you very much. After changing net.ipv4.icmp_ratemask to 4120, the problem went away.
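
For anyone wondering where 4120 comes from: icmp_ratemask is a bitmask in which bit N stands for ICMP type N, and the kernel default is 6168. The sketch below decodes both values. The type table follows the kernel's ip-sysctl documentation; the function name rate_limited_types is just illustrative.

```python
# ICMP type numbers as listed for icmp_ratemask in the kernel ip-sysctl docs.
ICMP_TYPES = {
    0: "Echo Reply",
    3: "Destination Unreachable",
    4: "Source Quench",
    5: "Redirect",
    8: "Echo Request",
    11: "Time Exceeded",
    12: "Parameter Problem",
    13: "Timestamp Request",
    14: "Timestamp Reply",
    15: "Info Request",
    16: "Info Reply",
    17: "Address Mask Request",
    18: "Address Mask Reply",
}

TIME_EXCEEDED = 11
DEFAULT_MASK = 6168  # kernel default for net.ipv4.icmp_ratemask
NEW_MASK = 4120      # value used in this thread


def rate_limited_types(mask):
    """Return the names of the ICMP types whose bit is set in the mask."""
    return [name for bit, name in sorted(ICMP_TYPES.items()) if mask & (1 << bit)]


if __name__ == "__main__":
    print("default:", rate_limited_types(DEFAULT_MASK))
    print("new:    ", rate_limited_types(NEW_MASK))
    # Clearing bit 11 in the default mask yields the new value.
    print(DEFAULT_MASK & ~(1 << TIME_EXCEEDED))
```

So 4120 is the default with only the Time Exceeded bit cleared; Destination Unreachable, Source Quench and Parameter Problem replies stay rate limited.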

Excellent! Glad to hear you got your issue sorted.