Failed to install Nexthop (216[10.0.0.34 if 12 vrfid 0]) into the kernel

Hello!

I have a redundant VPN connection between

  • my on-prem OPNsense, and
  • my datacenter VyOS.

There are two tunnels between them:

  • 1x IPsec (VTI)
  • 1x WireGuard

Both are controlled by OSPF. The IPsec tunnel is intended to be the primary tunnel, with the WireGuard tunnel as backup.

Roughly every two to three days, the IPsec tunnel runs into a problem. It’s not the tunnel itself that fails; instead, the connected route on the VTI interface disappears:

Oct 15 04:57:56 zebra[1531]: [VYKYC-709DP] default(0:254):10.0.0.32/30: Route install failed
Oct 15 04:57:56 zebra[1531]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (216[10.0.0.34 if 12 vrfid 0]) into the kernel

Since the tunnel itself remains active, IPsec’s DPD doesn’t engage. The VTI interface gets stuck in a down state, causing OSPF to switch over to the secondary tunnel.

manuel@mvr02# run sho int | match vti
vti0         -                        n/a                default   1500  u/u    ocloud-net2
vti1         10.0.0.33/30             n/a                default   1400  A/D    IPsec fw1int

I can easily resolve this by typing:

manuel@mvr02:~$ restart ipsec

…but I suspect the route will disappear again soon.
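
Until the root cause is found, the manual workaround above could be automated. The following is only a minimal sketch, not a fix: it reads the kernel's interface flags from sysfs and triggers the same restart when the interface is administratively down. The helper names are made up for illustration, and the op-mode wrapper path is an assumption about this VyOS image.

```python
import subprocess
from pathlib import Path

IFF_UP = 0x1  # administrative "up" bit, as defined in linux/if.h


def is_admin_up(flags_text: str) -> bool:
    """Return True if the hex flags string from sysfs has IFF_UP set."""
    return bool(int(flags_text.strip(), 16) & IFF_UP)


def read_flags(ifname: str) -> str:
    """Read the interface flags that the Linux kernel exposes in sysfs."""
    return Path(f"/sys/class/net/{ifname}/flags").read_text()


def restart_ipsec() -> None:
    """Run the op-mode command `restart ipsec` from a script."""
    # Assumption: this wrapper path exists on the running VyOS image.
    subprocess.run(
        ["/opt/vyatta/bin/vyatta-op-cmd-wrapper", "restart", "ipsec"],
        check=True,
    )


def check_and_heal(ifname, read=read_flags, heal=restart_ipsec) -> bool:
    """Restart IPsec if the interface is administratively down.

    Returns True when a restart was triggered, False otherwise.
    """
    if is_admin_up(read(ifname)):
        return False
    heal()
    return True
```

Calling `check_and_heal("vti1")` at a modest interval (for example from the task scheduler) would mimic typing `restart ipsec` by hand. It would also restart IPsec whenever the tunnel is legitimately down, so it only hides the symptom.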


Running system image: 1.5-rolling-202409250007

Relevant config parts:

set interfaces vti vti1 address '10.0.0.33/30'
set interfaces vti vti1 mtu '1400'

set interfaces wireguard wg2 address '10.0.0.25/30'
set interfaces wireguard wg2 mtu '1420'
<the usual working wireguard-stuff ...>

set protocols bfd peer 10.0.0.34 profile 'home-ipsec'
set protocols bfd profile home-ipsec

set protocols ospf area 0 network '10.0.0.24/30'
set protocols ospf area 0 network '10.0.0.32/30'
set protocols ospf interface vti1 bfd profile 'home-ipsec'
set protocols ospf interface vti1 cost '60'
set protocols ospf interface wg2 cost '1080'

set vpn ipsec ike-group generic-28800-v2 close-action 'trap'
set vpn ipsec ike-group generic-28800-v2 dead-peer-detection action 'trap'
set vpn ipsec ike-group generic-28800-v2 dead-peer-detection interval '30'
set vpn ipsec ike-group generic-28800-v2 dead-peer-detection timeout '120'
set vpn ipsec ike-group generic-28800-v2 disable-mobike
set vpn ipsec ike-group generic-28800-v2 key-exchange 'ikev2'
set vpn ipsec ike-group generic-28800-v2 lifetime '28800'
<some fitting ike-proposals and esp-groups>

set vpn ipsec site-to-site peer fw1int authentication local-id 'mvr02'
set vpn ipsec site-to-site peer fw1int authentication mode 'pre-shared-secret'
set vpn ipsec site-to-site peer fw1int authentication remote-id 'fw1int'
set vpn ipsec site-to-site peer fw1int connection-type 'initiate'
set vpn ipsec site-to-site peer fw1int force-udp-encapsulation
set vpn ipsec site-to-site peer fw1int ike-group 'generic-28800-v2'
set vpn ipsec site-to-site peer fw1int ikev2-reauth 'inherit'
set vpn ipsec site-to-site peer fw1int local-address '87.106.230.131'
set vpn ipsec site-to-site peer fw1int remote-address 'any'
set vpn ipsec site-to-site peer fw1int vti bind 'vti1'
set vpn ipsec site-to-site peer fw1int vti esp-group 'generic-3600-dh14'
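
The intended failover behaviour of the two OSPF costs above can be illustrated with a small sketch. It is a deliberate simplification that assumes a single hop per tunnel; real OSPF runs SPF over the whole topology. The function name is hypothetical.

```python
def preferred_interface(costs, up):
    """Pick the lowest-cost interface among those that are currently up."""
    candidates = {name: cost for name, cost in costs.items() if up.get(name)}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Costs taken from the config above.
costs = {"vti1": 60, "wg2": 1080}

print(preferred_interface(costs, {"vti1": True, "wg2": True}))   # vti1
print(preferred_interface(costs, {"vti1": False, "wg2": True}))  # wg2
```

As long as vti1 is up, its lower cost wins. Once it is down, only wg2 remains as a candidate, which matches the switchover described in this post.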

Does sudo dmesg give any other details/information as to what might be causing the VTI interface to go down?

No, unfortunately nothing useful. Only journal rotation messages:

[Tue Oct 15 01:43:20 2024] systemd-journald[613]: Data hash table of /var/log/journal/558920eae9c342f8a9ed395a1765fca8/system.journal has a fill level at 75.0 (165964 of 221283 items, 58720256 file size, 353 bytes per hash table item), suggesting rotation.
[Tue Oct 15 01:43:20 2024] systemd-journald[613]: /var/log/journal/558920eae9c342f8a9ed395a1765fca8/system.journal: Journal header limits reached or header out-of-date, rotating.
[Tue Oct 15 14:41:41 2024] systemd-journald[613]: Data hash table of /var/log/journal/558920eae9c342f8a9ed395a1765fca8/system.journal has a fill level at 75.0 (165963 of 221283 items, 58720256 file size, 353 bytes per hash table item), suggesting rotation.
[Tue Oct 15 14:41:41 2024] systemd-journald[613]: /var/log/journal/558920eae9c342f8a9ed395a1765fca8/system.journal: Journal header limits reached or header out-of-date, rotating.

Hi,

Try changing these parameters:

set vpn ipsec ike-group generic-28800-v2 close-action 'none'
set vpn ipsec ike-group generic-28800-v2 dead-peer-detection action 'restart'

Also confirm that this command is already configured: set vpn ipsec options disable-route-autoinstall


Hi fernando! OK, I’ll give it a try.
Thanks! Now I have to wait a couple of days to see whether it is stable.

set vpn ipsec options disable-route-autoinstall is already set. I just forgot to include it in my “relevant config”.


Unfortunately, changing the IKEv2 actions did not resolve the issue. This morning the tunnel failed again, after a DDNS IP change on the other side. The parallel WireGuard tunnel keeps working fine, and OSPF switches over.

But now I found a bit more evidence in the log file:

Nov 04 04:58:15 zebra[1534]: [HSYZM-HV7HF] Extended Error: Nexthop device is not up
Nov 04 04:58:15 zebra[1534]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is down, type=RTM_NEWNEXTHOP(104), seq=102, pid=3371087001
Nov 04 04:58:15 zebra[1534]: [HSYZM-HV7HF] Extended Error: Nexthop id does not exist
Nov 04 04:58:15 zebra[1534]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWROUTE(24), seq=103, pid=3371087001
Nov 04 04:58:15 zebra[1534]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (50[if 12 vrfid 0]) into the kernel
Nov 04 04:58:15 zebra[1534]: [VYKYC-709DP] default(0:254):10.0.0.32/30: Route install failed
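
To compare failures across days, the zebra lines can be pulled out of the journal with a small parser. This is a rough sketch: the regular expression is modelled only on the two "Failed to install Nexthop" lines quoted in this thread, and the function name is made up for illustration.

```python
import re

# Matches e.g. "(216[10.0.0.34 if 12 vrfid 0])" and "(50[if 12 vrfid 0])".
NEXTHOP_FAIL = re.compile(
    r"Failed to install Nexthop \("
    r"(?P<nhid>\d+)\["
    r"(?:(?P<gateway>\S+) )?"          # the gateway address is optional
    r"if (?P<ifindex>\d+) vrfid (?P<vrfid>\d+)\]\)"
)


def parse_nexthop_failures(lines):
    """Extract nexthop id, gateway, ifindex and VRF id from zebra log lines."""
    failures = []
    for line in lines:
        match = NEXTHOP_FAIL.search(line)
        if match is None:
            continue
        failures.append({
            "timestamp": line[:15],      # e.g. "Oct 15 04:57:56"
            "nhid": int(match["nhid"]),
            "gateway": match["gateway"],  # None when no gateway is logged
            "ifindex": int(match["ifindex"]),
            "vrfid": int(match["vrfid"]),
        })
    return failures
```

Both failures quoted in this thread refer to interface index 12. Running `socket.if_indextoname(12)` on the router would show which interface that index currently belongs to.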

Indeed, VyOS tells me that the interface is admin-down:

manuel@mvr02# run show interfaces | match vti0
vti0         -                        n/a                default   1500  u/u    ocloud-net2

… but that doesn’t match the configuration, which has no disable statement on the interface:

manuel@mvr02# show interfaces vti vti1
 address 10.0.0.33/30
 description "IPsec fw1int"
 mtu 1400
[edit]
manuel@mvr02#

And once again, restart ipsec healed both the VTI interface and the IPsec tunnel. I’m now running version 1.5-rolling-202411030007.

Next, I’ll try removing the OSPF component and using only static routing.
