IPSec drops, then never recovers

Hi all,

Long-time EdgeOS/VyOS user, currently struggling with intermittent IPSec drops on VyOS 1.3.0-rc5 (I’ve had issues across a few versions; I’m testing RC5 because it’s the latest and might include fixes for my issues).

Scenario
I’ve got a couple VPNs up, each to a Ubiquiti EdgeRouter on the other end. Periodically (takes 1 day or more at times), the VPN will drop, the IPSec SA will disappear, and the VyOS router will never try to re-initiate the VPN until I explicitly ask it to (via reset vpn ipsec-peer or similar).

Environment

ryanb@ubnt01-sec:~$ show version

Version:          VyOS 1.3.0-rc5
Release Train:    equuleus

Built by:         Sentrium S.L.
Built on:         Tue 29 Jun 2021 08:26 UTC
Build UUID:       36f7c218-6ebb-497f-9ec5-676241e5c13a
Build Commit ID:  892e8689b3234e

Architecture:     x86_64
Boot via:         installed image
System type:      VMware guest

Hardware vendor:  VMware, Inc.
Hardware model:   VMware Virtual Platform
Hardware S/N:     VMware-42 39 17 9d b5 1f a4 1b-94 7f a3 b1 00 c7 51 5c
Hardware UUID:    9d173942-1fb5-1ba4-947f-a3b100c7515c

Copyright:        VyOS maintainers and contributors

VyOS IPSec configuration:

set vpn ipsec esp-group esp-azure compression 'disable'
set vpn ipsec esp-group esp-azure lifetime '3600'
set vpn ipsec esp-group esp-azure mode 'tunnel'
set vpn ipsec esp-group esp-azure pfs 'disable'
set vpn ipsec esp-group esp-azure proposal 1 encryption 'aes256'
set vpn ipsec esp-group esp-azure proposal 1 hash 'sha1'
set vpn ipsec esp-group esp-azure proposal 2 encryption 'aes256'
set vpn ipsec esp-group esp-azure proposal 2 hash 'sha256'
set vpn ipsec esp-group esp-azure proposal 3 encryption 'aes128'
set vpn ipsec esp-group esp-azure proposal 3 hash 'sha1'
set vpn ipsec ike-group ike-azure ikev2-reauth 'no'
set vpn ipsec ike-group ike-azure key-exchange 'ikev2'
set vpn ipsec ike-group ike-azure lifetime '28800'
set vpn ipsec ike-group ike-azure proposal 1 dh-group '2'
set vpn ipsec ike-group ike-azure proposal 1 encryption 'aes256'
set vpn ipsec ike-group ike-azure proposal 1 hash 'sha1'
set vpn ipsec ike-group ike-azure proposal 2 dh-group '2'
set vpn ipsec ike-group ike-azure proposal 2 encryption 'aes256'
set vpn ipsec ike-group ike-azure proposal 2 hash 'sha256'
set vpn ipsec ike-group ike-azure proposal 3 dh-group '2'
set vpn ipsec ike-group ike-azure proposal 3 encryption 'aes128'
set vpn ipsec ike-group ike-azure proposal 3 hash 'sha1'
set vpn ipsec ike-group ike-azure proposal 4 dh-group '2'
set vpn ipsec ike-group ike-azure proposal 4 encryption 'aes128'
set vpn ipsec ike-group ike-azure proposal 4 hash 'sha256'
set vpn ipsec ipsec-interfaces interface 'eth0'
set vpn ipsec logging log-level '1'
set vpn ipsec logging log-modes 'any'
set vpn ipsec logging log-modes 'ike'
set vpn ipsec logging log-modes 'esp'
set vpn ipsec logging log-modes 'net'
set vpn ipsec nat-traversal 'disable'
set vpn ipsec options disable-route-autoinstall
set vpn ipsec site-to-site peer 2.2.2.2 authentication mode 'pre-shared-secret'
set vpn ipsec site-to-site peer 2.2.2.2 authentication pre-shared-secret 'secret'
set vpn ipsec site-to-site peer 2.2.2.2 connection-type 'initiate'
set vpn ipsec site-to-site peer 2.2.2.2 default-esp-group 'esp-azure'
set vpn ipsec site-to-site peer 2.2.2.2 ike-group 'ike-azure'
set vpn ipsec site-to-site peer 2.2.2.2 ikev2-reauth 'inherit'
set vpn ipsec site-to-site peer 2.2.2.2 local-address '1.1.1.1'
set vpn ipsec site-to-site peer 2.2.2.2 vti bind 'vti10'

EdgeOS peer config:

set vpn ipsec allow-access-to-local-interface disable
set vpn ipsec auto-firewall-nat-exclude enable
set vpn ipsec disable-uniqreqids
set vpn ipsec esp-group FOO0 compression disable
set vpn ipsec esp-group FOO0 lifetime 3600
set vpn ipsec esp-group FOO0 mode tunnel
set vpn ipsec esp-group FOO0 pfs disable
set vpn ipsec esp-group FOO0 proposal 1 encryption aes256
set vpn ipsec esp-group FOO0 proposal 1 hash sha256
set vpn ipsec esp-group FOO2 compression disable
set vpn ipsec esp-group FOO2 lifetime 27000
set vpn ipsec esp-group FOO2 mode tunnel
set vpn ipsec esp-group FOO2 pfs disable
set vpn ipsec esp-group FOO2 proposal 1 encryption aes256
set vpn ipsec esp-group FOO2 proposal 1 hash sha256
set vpn ipsec ike-group FOO0 ikev2-reauth no
set vpn ipsec ike-group FOO0 key-exchange ikev2
set vpn ipsec ike-group FOO0 lifetime 28800
set vpn ipsec ike-group FOO0 proposal 1 dh-group 2
set vpn ipsec ike-group FOO0 proposal 1 encryption aes256
set vpn ipsec ike-group FOO0 proposal 1 hash sha256
set vpn ipsec ike-group FOO2 ikev2-reauth no
set vpn ipsec ike-group FOO2 key-exchange ikev2
set vpn ipsec ike-group FOO2 lifetime 28800
set vpn ipsec ike-group FOO2 proposal 1 dh-group 2
set vpn ipsec ike-group FOO2 proposal 1 encryption aes256
set vpn ipsec ike-group FOO2 proposal 1 hash sha256
set vpn ipsec site-to-site peer 1.1.1.1 authentication mode pre-shared-secret
set vpn ipsec site-to-site peer 1.1.1.1 authentication pre-shared-secret secret
set vpn ipsec site-to-site peer 1.1.1.1 connection-type respond
set vpn ipsec site-to-site peer 1.1.1.1 default-esp-group FOO2
set vpn ipsec site-to-site peer 1.1.1.1 description ipsec
set vpn ipsec site-to-site peer 1.1.1.1 ike-group FOO2
set vpn ipsec site-to-site peer 1.1.1.1 ikev2-reauth inherit
set vpn ipsec site-to-site peer 1.1.1.1 local-address 2.2.2.2
set vpn ipsec site-to-site peer 1.1.1.1 vti bind vti3
set vpn ipsec site-to-site peer 1.1.1.1 vti esp-group FOO2
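
One quick sanity check on the two configs above is to diff the group settings mechanically rather than by eye. The following is only a minimal Python sketch, not a VyOS or EdgeOS tool: `parse_groups` and `diff_settings` are hypothetical helper names, and the two strings are just the non-proposal `esp-group` lines copied from the configs above. Proposals are skipped because they are unordered candidate lists rather than single values.

```python
import shlex


def parse_groups(config_text, kind):
    """Collect `set vpn ipsec <kind> <name> <key...> <value>` lines
    into {name: {key: value}}. shlex strips the optional quotes."""
    groups = {}
    prefix = ["set", "vpn", "ipsec", kind]
    for line in config_text.splitlines():
        tokens = shlex.split(line)
        if tokens[:4] != prefix or len(tokens) < 7:
            continue
        name, value = tokens[4], tokens[-1]
        key = " ".join(tokens[5:-1])
        groups.setdefault(name, {})[key] = value
    return groups


def diff_settings(local, remote):
    """Return {key: (local, remote)} for non-proposal keys that differ."""
    keys = sorted(set(local) | set(remote))
    return {
        k: (local.get(k), remote.get(k))
        for k in keys
        if not k.startswith("proposal") and local.get(k) != remote.get(k)
    }


# Lines copied from the two configs in this post.
VYOS = """
set vpn ipsec esp-group esp-azure compression 'disable'
set vpn ipsec esp-group esp-azure lifetime '3600'
set vpn ipsec esp-group esp-azure mode 'tunnel'
set vpn ipsec esp-group esp-azure pfs 'disable'
"""
EDGEOS = """
set vpn ipsec esp-group FOO2 compression disable
set vpn ipsec esp-group FOO2 lifetime 27000
set vpn ipsec esp-group FOO2 mode tunnel
set vpn ipsec esp-group FOO2 pfs disable
"""

vyos_esp = parse_groups(VYOS, "esp-group")["esp-azure"]
edge_esp = parse_groups(EDGEOS, "esp-group")["FOO2"]
print(diff_settings(vyos_esp, edge_esp))
# {'lifetime': ('3600', '27000')}
```

The only scalar difference it reports is the ESP lifetime (3600 on VyOS versus 27000 on EdgeOS). IKEv2 does not negotiate lifetimes, so this is not necessarily a misconfiguration, but it does mean the VyOS side would normally be the one to start each CHILD_SA rekey.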

Logs

7/19/2021 00:20,Info,ubnt01-sec.lan,06[IKE] <peer-2.2.2.2-tunnel-vti|13> giving up after 5 retransmits
7/19/2021 00:19,Info,ubnt01-sec.lan,%ADJCHANGE: neighbor 192.168.1.1(Unknown) in vrf default Down BGP Notification send
7/19/2021 00:19,Info,ubnt01-sec.lan,%NOTIFICATION: sent to neighbor 192.168.1.1 4/0 (Hold Timer Expired) 0 bytes 
7/19/2021 00:19,Info,ubnt01-sec.lan,05[NET] <peer-2.2.2.2-tunnel-vti|13> sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (464 bytes)
7/19/2021 00:19,Info,ubnt01-sec.lan,05[IKE] <peer-2.2.2.2-tunnel-vti|13> retransmit 5 of request with message ID 22
7/19/2021 00:18,Info,ubnt01-sec.lan,12[NET] <peer-2.2.2.2-tunnel-vti|13> sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (464 bytes)
7/19/2021 00:18,Info,ubnt01-sec.lan,12[IKE] <peer-2.2.2.2-tunnel-vti|13> retransmit 4 of request with message ID 22
7/19/2021 00:18,Info,ubnt01-sec.lan,08[NET] <peer-2.2.2.2-tunnel-vti|13> sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (464 bytes)
7/19/2021 00:18,Info,ubnt01-sec.lan,08[IKE] <peer-2.2.2.2-tunnel-vti|13> retransmit 3 of request with message ID 22
7/19/2021 00:17,Info,ubnt01-sec.lan,02[NET] <peer-2.2.2.2-tunnel-vti|13> sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (464 bytes)
7/19/2021 00:17,Info,ubnt01-sec.lan,02[IKE] <peer-2.2.2.2-tunnel-vti|13> retransmit 2 of request with message ID 22
7/19/2021 00:17,Info,ubnt01-sec.lan,11[NET] <peer-2.2.2.2-tunnel-vti|13> sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (464 bytes)
7/19/2021 00:17,Info,ubnt01-sec.lan,11[IKE] <peer-2.2.2.2-tunnel-vti|13> retransmit 1 of request with message ID 22
7/19/2021 00:17,Info,ubnt01-sec.lan,14[NET] <peer-2.2.2.2-tunnel-vti|13> sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (464 bytes)
7/19/2021 00:17,Info,ubnt01-sec.lan,14[ENC] <peer-2.2.2.2-tunnel-vti|13> generating CREATE_CHILD_SA request 22 [ SA No KE ]
7/19/2021 00:17,Info,ubnt01-sec.lan,14[IKE] <peer-2.2.2.2-tunnel-vti|13> initiating IKE_SA peer-2.2.2.2-tunnel-vti[15] to 2.2.2.2
7/19/2021 00:05,Info,ubnt01-sec.lan,06[IKE] <peer-2.2.2.2-tunnel-vti|13> CHILD_SA closed
7/19/2021 00:05,Info,ubnt01-sec.lan,06[IKE] <peer-2.2.2.2-tunnel-vti|13> received DELETE for ESP CHILD_SA with SPI c999eefe
7/19/2021 00:05,Info,ubnt01-sec.lan,06[ENC] <peer-2.2.2.2-tunnel-vti|13> parsed INFORMATIONAL response 21 [ D ]
7/19/2021 00:05,Info,ubnt01-sec.lan,06[NET] <peer-2.2.2.2-tunnel-vti|13> received packet: from 2.2.2.2[4500] to 1.1.1.1[4500] (80 bytes)
7/19/2021 00:05,Info,ubnt01-sec.lan,07[NET] <peer-2.2.2.2-tunnel-vti|13> sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (80 bytes)
7/19/2021 00:05,Info,ubnt01-sec.lan,07[ENC] <peer-2.2.2.2-tunnel-vti|13> generating INFORMATIONAL request 21 [ D ]
7/19/2021 00:05,Info,ubnt01-sec.lan,07[IKE] <peer-2.2.2.2-tunnel-vti|13> sending DELETE for ESP CHILD_SA with SPI c12191de
7/19/2021 00:05,Info,ubnt01-sec.lan,07[IKE] <peer-2.2.2.2-tunnel-vti|13> closing CHILD_SA peer-2.2.2.2-tunnel-vti{134} with SPIs c12191de_i (5289 bytes) c999eefe_o (5289 bytes) and TS 0.0.0.0/0 === 0.0.0.0/0
7/19/2021 00:05,Info,ubnt01-sec.lan,07[IKE] <peer-2.2.2.2-tunnel-vti|13> outbound CHILD_SA peer-2.2.2.2-tunnel-vti{136} established with SPIs cd9504ec_i c4734dd7_o and TS 0.0.0.0/0 === 0.0.0.0/0
7/19/2021 00:05,Info,ubnt01-sec.lan,07[IKE] <peer-2.2.2.2-tunnel-vti|13> inbound CHILD_SA peer-2.2.2.2-tunnel-vti{136} established with SPIs cd9504ec_i c4734dd7_o and TS 0.0.0.0/0 === 0.0.0.0/0
7/19/2021 00:05,Info,ubnt01-sec.lan,07[CFG] <peer-2.2.2.2-tunnel-vti|13> selected proposal: ESP:AES_CBC_256/HMAC_SHA2_256_128/NO_EXT_SEQ
7/19/2021 00:05,Info,ubnt01-sec.lan,07[ENC] <peer-2.2.2.2-tunnel-vti|13> parsed CREATE_CHILD_SA response 20 [ SA No TSi TSr ]
7/19/2021 00:05,Info,ubnt01-sec.lan,07[NET] <peer-2.2.2.2-tunnel-vti|13> received packet: from 2.2.2.2[4500] to 1.1.1.1[4500] (208 bytes)
7/19/2021 00:05,Info,ubnt01-sec.lan,15[NET] <peer-2.2.2.2-tunnel-vti|13> sending packet: from 1.1.1.1[4500] to 2.2.2.2[4500] (288 bytes)
7/19/2021 00:05,Info,ubnt01-sec.lan,15[ENC] <peer-2.2.2.2-tunnel-vti|13> generating CREATE_CHILD_SA request 20 [ N(REKEY_SA) SA No TSi TSr ]
7/19/2021 00:05,Info,ubnt01-sec.lan,15[IKE] <peer-2.2.2.2-tunnel-vti|13> establishing CHILD_SA peer-2.2.2.2-tunnel-vti{136} reqid 2
7/19/2021 00:05,Info,ubnt01-sec.lan,15[KNL] creating rekey job for CHILD_SA ESP/0xc12191de/1.1.1.1
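
Until the root cause is found, the failure in the log above could at least be detected automatically. This is a minimal Python sketch under stated assumptions: `stuck_connections` is a hypothetical helper, the regular expressions are written only against the log lines shown in this post, and it only reports the problem. Running the reset (for example `reset vpn ipsec-peer`, as mentioned earlier) is left to the operator.

```python
import re

# Patterns modelled on the log lines quoted in this post only.
GAVE_UP = re.compile(
    r"<(?P<conn>[^|>]+)\|\d+> giving up after (?P<tries>\d+) retransmits"
)
ESTABLISHED = re.compile(
    r"<(?P<conn>[^|>]+)\|\d+> .*CHILD_SA .* established"
)


def stuck_connections(lines_oldest_first):
    """Return connections whose most recent event is a give-up,
    i.e. no CHILD_SA was established after the retransmits ran out."""
    stuck = set()
    for line in lines_oldest_first:
        match = GAVE_UP.search(line)
        if match:
            stuck.add(match.group("conn"))
            continue
        match = ESTABLISHED.search(line)
        if match:
            stuck.discard(match.group("conn"))
    return stuck


# Excerpt from the log above, which is listed newest first.
excerpt_newest_first = [
    "7/19/2021 00:20,Info,ubnt01-sec.lan,06[IKE] <peer-2.2.2.2-tunnel-vti|13> giving up after 5 retransmits",
    "7/19/2021 00:05,Info,ubnt01-sec.lan,07[IKE] <peer-2.2.2.2-tunnel-vti|13> outbound CHILD_SA peer-2.2.2.2-tunnel-vti{136} established with SPIs cd9504ec_i c4734dd7_o and TS 0.0.0.0/0 === 0.0.0.0/0",
]

print(stuck_connections(reversed(excerpt_newest_first)))
# {'peer-2.2.2.2-tunnel-vti'}
```

Note that the input has to be oldest first, so the excerpt is reversed before it is passed in.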

EdgeOS logs (peer)

Jul 19 00:00:39 08[KNL] creating acquire job for policy 192.168.1.1/32[tcp/34697] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:04:28 14[KNL] creating acquire job for policy 192.168.1.1/32[tcp/33693] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:05:42 05[IKE] <peer-1.1.1.1-tunnel-vti|11930> inbound CHILD_SA peer-1.1.1.1-tunnel-vti{1521} established with SPIs c4734dd7_i cd9504ec_o and TS 0.0.0.0/0 === 0.0.0.0/0
Jul 19 00:05:43 14[IKE] <peer-1.1.1.1-tunnel-vti|11930> closing CHILD_SA peer-1.1.1.1-tunnel-vti{1519} with SPIs c999eefe_i (5289 bytes) c12191de_o (5289 bytes) and TS 0.0.0.0/0 === 0.0.0.0/0
Jul 19 00:05:43 14[IKE] <peer-1.1.1.1-tunnel-vti|11930> outbound CHILD_SA peer-1.1.1.1-tunnel-vti{1521} established with SPIs c4734dd7_i cd9504ec_o and TS 0.0.0.0/0 === 0.0.0.0/0
Jul 19 00:08:01 08[KNL] creating acquire job for policy 192.168.1.1/32[tcp/40371] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:11:25 12[KNL] creating acquire job for policy 192.168.1.1/32[tcp/40541] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:14:24 09[IKE] <peer-1.1.1.1-tunnel-vti|11930> initiating IKE_SA peer-1.1.1.1-tunnel-vti[12071] to 1.1.1.1
Jul 19 00:14:53 06[KNL] creating acquire job for policy 192.168.1.1/32[tcp/45073] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:17:50 15[KNL] creating acquire job for policy 192.168.1.1/32[tcp/bgp] === 10.1.0.13/32[tcp/42177] with reqid {4}
Jul 19 00:17:50 10[IKE] <peer-1.1.1.1-tunnel-vti|12073> initiating IKE_SA peer-1.1.1.1-tunnel-vti[12073] to 1.1.1.1
Jul 19 00:18:34 06[KNL] creating acquire job for policy 192.168.1.1/32[tcp/42995] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:20:35 06[KNL] creating delete job for CHILD_SA ESP/0x00000000/1.1.1.1
Jul 19 00:21:54 05[KNL] creating acquire job for policy 192.168.1.1/32[tcp/36997] === 10.1.0.13/32[tcp/bgp] with reqid {4}
Jul 19 00:21:54 05[IKE] <peer-1.1.1.1-tunnel-vti|12075> initiating IKE_SA peer-1.1.1.1-tunnel-vti[12075] to 1.1.1.1
Jul 19 00:22:23 15[KNL] creating acquire job for policy 192.168.1.1/32[tcp/37755] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:24:39 12[KNL] creating delete job for CHILD_SA ESP/0x00000000/1.1.1.1
Jul 19 00:25:37 04[KNL] creating acquire job for policy 192.168.1.1/32[tcp/44323] === 10.1.0.13/32[tcp/bgp] with reqid {4}
Jul 19 00:25:37 04[IKE] <peer-1.1.1.1-tunnel-vti|12077> initiating IKE_SA peer-1.1.1.1-tunnel-vti[12077] to 1.1.1.1
Jul 19 00:25:50 12[KNL] creating acquire job for policy 192.168.1.1/32[tcp/38955] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:28:22 09[KNL] creating delete job for CHILD_SA ESP/0x00000000/1.1.1.1
Jul 19 00:29:28 13[KNL] creating acquire job for policy 192.168.1.1/32[tcp/44937] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:29:31 09[KNL] creating acquire job for policy 192.168.1.1/32[tcp/44945] === 10.1.0.13/32[tcp/bgp] with reqid {4}
Jul 19 00:29:31 07[IKE] <peer-1.1.1.1-tunnel-vti|12079> initiating IKE_SA peer-1.1.1.1-tunnel-vti[12079] to 1.1.1.1
Jul 19 00:32:16 07[KNL] creating delete job for CHILD_SA ESP/0x00000000/1.1.1.1
Jul 19 00:32:16 07[KNL] creating delete job for CHILD_SA ESP/0x00000000/1.1.1.1
Jul 19 00:32:41 05[KNL] creating acquire job for policy 192.168.1.1/32[tcp/41633] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:33:04 09[KNL] creating acquire job for policy 192.168.1.1/32[tcp/33455] === 10.1.0.13/32[tcp/bgp] with reqid {4}
Jul 19 00:33:04 07[IKE] <peer-1.1.1.1-tunnel-vti|12082> initiating IKE_SA peer-1.1.1.1-tunnel-vti[12082] to 1.1.1.1
Jul 19 00:35:49 13[KNL] creating delete job for CHILD_SA ESP/0x00000000/1.1.1.1
Jul 19 00:36:11 07[KNL] creating acquire job for policy 192.168.1.1/32[tcp/39795] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:36:29 10[KNL] creating acquire job for policy 192.168.1.1/32[tcp/35383] === 10.1.0.13/32[tcp/bgp] with reqid {4}
Jul 19 00:36:29 06[IKE] <peer-1.1.1.1-tunnel-vti|12083> initiating IKE_SA peer-1.1.1.1-tunnel-vti[12083] to 1.1.1.1
Jul 19 00:39:14 07[KNL] creating delete job for CHILD_SA ESP/0x00000000/1.1.1.1
Jul 19 00:39:36 06[KNL] creating acquire job for policy 192.168.1.1/32[tcp/42913] === 10.10.0.1/32[tcp/bgp] with reqid {2}
Jul 19 00:40:07 14[KNL] creating acquire job for policy 192.168.1.1/32[tcp/33337] === 10.1.0.13/32[tcp/bgp] with reqid {4}
Jul 19 00:40:07 05[IKE] <peer-1.1.1.1-tunnel-vti|12085> initiating IKE_SA peer-1.1.1.1-tunnel-vti[12085] to 1.1.1.1
Jul 19 00:42:52 06[KNL] creating delete job for CHILD_SA ESP/0x00000000/1.1.1.1

Notes

  • I’ve got monitoring set up against 2.2.2.2 and have never received a down notification from the endpoint, so I don’t think it’s actually down or not listening.
  • EdgeOS logs show it continually trying to create an acquire job for the VPN. I can confirm that even now, many hours after the VPN went down, I can see IKE packets coming in on my WAN interface, and the VyOS device effectively ignores them:
    ryanb@ubnt01-sec:~$ sudo tcpdump -vvv -ni eth0 host 2.2.2.2
    tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
    08:34:50.609267 IP (tos 0x0, ttl 53, id 50840, offset 0, flags [DF], proto UDP (17), length 364)
        2.2.2.2.500 > 1.1.1.1.500: [udp sum ok] isakmp 2.0 msgid 00000000 cookie 459e4349059a92c1->0000000000000000: parent_sa ikev2_init[I]:
        (sa: len=44
            (p: #1 protoid=isakmp transform=4 len=44
                (t: #1 type=encr id=aes (type=keylen value=0100))
                (t: #2 type=integ id=#12 )
                (t: #3 type=prf id=#5 )
                (t: #4 type=dh id=modp1024 )))
        (v2ke: len=128 group=modp1024 681c74c49c11cc9ee42a67b91cd8b3f4fcbdfddfe530bba3276bdc0023556a06fc687659c5bf0922c681c24e15389062b54b597257382eeb26308631b38a56b670d95872ae84d005ae21b266fab6cb7bab685702255a636851d6e60d23ab44f577021ccb803dc272e12c4713e0c7cd7a6beb7ec5cb2129b7bb6f5655da79c3ed)
        (nonce: len=32 nonce=(0f68cde8bd8dbf317fb897f60261b33fb7bd77c80b9080141f7f2eb133f6f4f0) )
        (n: prot_id=#0 type=16388(nat_detection_source_ip))
        (n: prot_id=#0 type=16389(nat_detection_destination_ip))
        (n: prot_id=#0 type=16430(status))
        (n: prot_id=#0 type=16431(status))
        (n: prot_id=#0 type=16406(status))
    08:35:32.599670 IP (tos 0x0, ttl 53, id 52672, offset 0, flags [DF], proto UDP (17), length 364)
        2.2.2.2.500 > 1.1.1.1.500: [udp sum ok] isakmp 2.0 msgid 00000000 cookie 459e4349059a92c1->0000000000000000: parent_sa ikev2_init[I]:
        (sa: len=44
            (p: #1 protoid=isakmp transform=4 len=44
                (t: #1 type=encr id=aes (type=keylen value=0100))
                (t: #2 type=integ id=#12 )
                (t: #3 type=prf id=#5 )
                (t: #4 type=dh id=modp1024 )))
        (v2ke: len=128 group=modp1024 681c74c49c11cc9ee42a67b91cd8b3f4fcbdfddfe530bba3276bdc0023556a06fc687659c5bf0922c681c24e15389062b54b597257382eeb26308631b38a56b670d95872ae84d005ae21b266fab6cb7bab685702255a636851d6e60d23ab44f577021ccb803dc272e12c4713e0c7cd7a6beb7ec5cb2129b7bb6f5655da79c3ed)
        (nonce: len=32 nonce=(0f68cde8bd8dbf317fb897f60261b33fb7bd77c80b9080141f7f2eb133f6f4f0) )
        (n: prot_id=#0 type=16388(nat_detection_source_ip))
        (n: prot_id=#0 type=16389(nat_detection_destination_ip))
        (n: prot_id=#0 type=16430(status))
        (n: prot_id=#0 type=16431(status))
        (n: prot_id=#0 type=16406(status))
    
  • Even using clear vpn ipsec-peer on the EdgeOS side does not ever get anything logged in VyOS. It just ignores the INIT request entirely.

Thoughts on how I can proceed here?

Try setting the following on the VyOS side:

set vpn ipsec ike-group ike-azure dead-peer-detection action 'restart'
set vpn ipsec ike-group ike-azure dead-peer-detection interval '30'
set vpn ipsec ike-group ike-azure dead-peer-detection timeout '120'

Thanks for that. Will try enabling DPD to see the behavior. Will report back in a few days with results.

Any chance there’s an option that doesn’t require DPD? Even a 30-second outage (if I set my DPD to 10/30) is a lot in some prod cases, so I’m trying to minimize that as much as possible.

Thanks!
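
On the DPD timing question, a rough estimate may help. As far as I know, strongSwan (the IKE daemon behind these logs) applies the DPD timeout only to IKEv1; with IKEv2 a peer is declared dead once the normal retransmission schedule runs out. The sketch below assumes strongSwan's documented defaults (initial timeout 4.0 s, base 1.8, 5 tries); VyOS may ship different values, so treat the numbers as an assumption. `retransmit_schedule` and `worst_case_detection` are hypothetical names for illustration.

```python
def retransmit_schedule(timeout=4.0, base=1.8, tries=5):
    """Return (send_times, give_up_time) in seconds after the original request.

    Assumed formula: the wait before attempt n is timeout * base ** n.
    """
    elapsed = 0.0
    send_times = []
    for n in range(tries + 1):
        elapsed += timeout * base ** n
        if n < tries:
            send_times.append(elapsed)  # retransmit n + 1 goes out here
    return send_times, elapsed          # after the last wait the peer is given up


def worst_case_detection(dpd_interval, **retransmit_args):
    """Idle time before a DPD probe is sent plus the full retransmit window."""
    _, give_up = retransmit_schedule(**retransmit_args)
    return dpd_interval + give_up


sends, give_up = retransmit_schedule()
print([round(t) for t in sends], round(give_up))
# [4, 11, 24, 47, 89] 165
print(round(worst_case_detection(30)))
# 195
```

Under those assumptions the give-up point is about 165 seconds after the request, which is consistent with the log in the first post (request 22 sent at 00:17, "giving up after 5 retransmits" at 00:20). A shorter DPD interval alone would therefore not bring detection down to 30 seconds; the retransmission settings would have to change as well.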

How about using WireGuard?

I use WireGuard for many point-to-site things, but most platforms don’t support it natively (yet, I’m sure). EdgeOS has an installation script that’s mildly clunky but generally works, but things like Azure VPN Gateways don’t support it, and likely never will.