I set up DMVPN following the documentation and it worked the first time. However, after the hub restarts, the DMVPN network stops working, and I have to restart the spokes to restore it. From what I found, adding IKE dead peer detection (DPD) should solve this, but the problem persists after I tested it.
I tested this in EVE-NG. The VyOS version is as follows:
vyos@vyos# run show version
Version: VyOS 1.3-rolling-202009030118
Release Train: equuleus
Built by: autobuild@vyos.net
Built on: Thu 03 Sep 2020 01:18 UTC
Build UUID: 83c8515b-213b-4ac6-9b9a-2a0f24bbbda7
Build Commit ID: 221fd153830307
Architecture: x86_64
Boot via: installed image
System type: KVM guest
Hardware vendor: QEMU
Hardware model: Standard PC (i440FX + PIIX, 1996)
Hardware S/N: Unknown
Hardware UUID: Unknown
Copyright: VyOS maintainers and contributors
The DPD commands I added are as follows:
set vpn ipsec ike-group IKE-SPOKE dead-peer-detection action restart
set vpn ipsec ike-group IKE-SPOKE dead-peer-detection interval 3
set vpn ipsec ike-group IKE-SPOKE dead-peer-detection timeout 3
Hi @Dmitry,
This parameter does not solve the underlying problem. After the hub is restarted, or disconnected and then restored, DMVPN still does not recover.
Restarting the service does not help either, but after I rebooted the VyOS system it worked normally again.
HUB
vyos@vyos:~$ show nhrp tunnel
Status: ok
Interface: tun0
Type: local
Protocol-Address: 9.9.9.255/32
Alias-Address: 9.9.9.1
Flags: up
Interface: tun0
Type: local
Protocol-Address: 9.9.9.1/32
Flags: up
Interface: tun0
Type: dynamic
Protocol-Address: 9.9.9.3/32
NBMA-Address: 10.10.3.2
Flags: up
Expires-In: 0:28
Interface: tun0
Type: dynamic
Protocol-Address: 9.9.9.2/32
NBMA-Address: 10.10.2.2
Flags: up
Expires-In: 0:22
vyos@vyos:~$ ip neigh
9.9.9.3 dev tun0 lladdr 10.10.3.2 REACHABLE
10.10.1.1 dev eth0 lladdr 50:00:00:01:00:00 DELAY
9.9.9.2 dev tun0 lladdr 10.10.2.2 REACHABLE
vyos@vyos:~$
vyos@vyos:~$ show vpn ipsec sa
Connection State Uptime Bytes In/Out Packets In/Out Remote address Remote ID Proposal
------------------ ------- -------- -------------- ---------------- ---------------- ----------- ----------------------------------
dmvpn-NHRPVPN-tun0 up 46m20s 2K/2K 23/22 10.10.2.2 N/A AES_CBC_256/HMAC_SHA1_96/MODP_1024
dmvpn-NHRPVPN-tun0 up 52m19s 26K/29K 236/237 10.10.3.2 N/A AES_CBC_256/HMAC_SHA1_96/MODP_1024
Spoke1
vyos@vyos:~$ show nhrp tunnel
Status: ok
Interface: tun0
Type: local
Protocol-Address: 9.9.9.255/32
Alias-Address: 9.9.9.2
Flags: up
Interface: tun0
Type: local
Protocol-Address: 9.9.9.2/32
Flags: up
Interface: tun0
Type: static
Protocol-Address: 9.9.9.1/24
NBMA-Address: 10.10.1.2
Flags: up
vyos@vyos:~$
vyos@vyos:~$ ip neigh show
192.168.1.1 dev tun0 FAILED
10.10.2.1 dev eth0 lladdr 50:00:00:02:00:00 REACHABLE
9.9.9.1 dev tun0 lladdr 10.10.1.2 STALE
9.9.9.3 dev tun0 FAILED
vyos@vyos:~$
vyos@vyos:~$ show vpn ipsec sa
Connection State Uptime Bytes In/Out Packets In/Out Remote address Remote ID Proposal
------------------ ------- -------- -------------- ---------------- ---------------- ----------- ----------------------------------
dmvpn-NHRPVPN-tun0 up 2s 12K/11K 104/102 10.10.1.2 N/A AES_CBC_256/HMAC_SHA1_96/MODP_1024
dmvpn-NHRPVPN-tun0 up N/A N/A N/A N/A N/A N/A
vyos@vyos:~$
vyos@vyos:~$ sudo /etc/init.d/opennhrp.init restart
Restarting Next Hop Resolution Protocol: opennhrp.
vyos@vyos:~$
vyos@vyos:~$ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
From 2.2.2.2 icmp_seq=1 Time to live exceeded
From 2.2.2.2 icmp_seq=2 Time to live exceeded
The log is as follows:
vyos@vyos# sudo /etc/init.d/opennhrp.init restart
Restarting Next Hop Resolution Protocol: opennhrp.
[edit]
vyos@vyos#
[edit]
vyos@vyos# run show log tail 300
Sep 24 07:10:06 vyos pdns_recursor[1928]: Failed to update . records, RCODE=-1
Sep 24 07:11:36 vyos systemd[1]: Starting Cleanup of Temporary Directories...
Sep 24 07:11:36 vyos systemd-tmpfiles[2439]: [/usr/lib/tmpfiles.d/heartbeat.conf:3] Line references path below legacy directory /var/run/, updating /var/run/heartbeat → /run/heartbeat; please update the tmpfiles.d/ drop-in file accordingly.
Sep 24 07:11:36 vyos systemd-tmpfiles[2439]: [/usr/lib/tmpfiles.d/heartbeat.conf:4] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/ccm → /run/heartbeat/ccm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 24 07:11:36 vyos systemd-tmpfiles[2439]: [/usr/lib/tmpfiles.d/heartbeat.conf:5] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/crm → /run/heartbeat/crm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 24 07:11:36 vyos systemd-tmpfiles[2439]: [/usr/lib/tmpfiles.d/heartbeat.conf:6] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/dopd → /run/heartbeat/dopd; please update the tmpfiles.d/ drop-in file accordingly.
Sep 24 07:11:36 vyos systemd-tmpfiles[2439]: [/usr/lib/tmpfiles.d/resource-agents.conf:1] Duplicate line for path "/run/resource-agents", ignoring.
Sep 24 07:11:36 vyos systemd[1]: systemd-tmpfiles-clean.service: Succeeded.
Sep 24 07:11:36 vyos systemd[1]: Started Cleanup of Temporary Directories.
Sep 24 07:12:09 vyos pdns_recursor[1928]: Failed to update . records, got an exception: Server Failure while retrieving DS records for net
Sep 24 07:12:09 vyos pdns_recursor[1928]: Failed to update . records, RCODE=-1
This is an old thread, but I believe the issue here may have been that DPD was missing on the hub. I was also having the same problem on 1.3.1-S1, where I had DPD set on the spoke, would reboot the hub, and the spoke would never re-register once the hub was back up. However, after adding DPD to the hub too, and rebooting the hub, this is working as expected. Now when I reboot the hub, within about 5 minutes the spoke has re-registered with the hub and is reachable again.
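For anyone following along, hub-side DPD could be added with something like the lines below. This is only a sketch: the group name IKE-HUB and the action, interval, and timeout values are assumptions, copied from the spoke commands earlier in the thread with the group name changed. The post above does not say which settings were actually used, so substitute the IKE group your hub's DMVPN profile references.

```
# Assumption: the hub's IKE group is named IKE-HUB
# Assumption: values mirror the spoke-side DPD lines from earlier in this thread
set vpn ipsec ike-group IKE-HUB dead-peer-detection action restart
set vpn ipsec ike-group IKE-HUB dead-peer-detection interval 3
set vpn ipsec ike-group IKE-HUB dead-peer-detection timeout 3
```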
I’m on 1.3.0-rc6, and configuring DPD on my hub and spokes does not result in spokes coming back online after a hub reset. I left them overnight last night to see if they would eventually come back online, and they didn’t.
@aohanian Can you share your hub and spoke config so I can see if there are any differences that may result in this behavior?
I am only able to get a spoke reconnected to a rebooted hub by calling sudo /etc/init.d/opennhrp.init restart (or rebooting the spoke). I could write a script to ping the hub and restart NHRP after a set number of failed pings, but this feels hacky and I would feel better if I could get the router to work correctly in the first place.
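For reference, the ping-and-restart workaround described above could look roughly like this. It is a minimal sketch, not a tested fix: the hub tunnel address 9.9.9.1 is taken from the outputs earlier in the thread, while the failure threshold, check interval, function names, and the `--run` switch are all made up for illustration. It restarts NHRP with the same init script quoted above.

```shell
#!/bin/bash
# Hypothetical watchdog sketch for a spoke: restart opennhrp after several
# consecutive failed pings to the hub's tunnel address.
# All defaults below are assumptions; adjust them for your own setup.
HUB_TUNNEL_IP="${HUB_TUNNEL_IP:-9.9.9.1}"   # hub tunnel address (from the thread's outputs)
MAX_FAILURES="${MAX_FAILURES:-3}"           # consecutive failures before restarting
CHECK_INTERVAL="${CHECK_INTERVAL:-10}"      # seconds between checks

# Return 0 if the hub answers a single ping within 2 seconds.
hub_reachable() {
    ping -c 1 -W 2 "$HUB_TUNNEL_IP" >/dev/null 2>&1
}

# Restart NHRP using the init script quoted in this thread.
restart_nhrp() {
    sudo /etc/init.d/opennhrp.init restart
}

# Run one check. Takes the current failure count as $1 and prints the new
# count on stdout. Output of the restart goes to stderr so it does not
# get mixed into the count.
check_once() {
    local failures="$1"
    if hub_reachable; then
        echo 0
        return
    fi
    failures=$((failures + 1))
    if [ "$failures" -ge "$MAX_FAILURES" ]; then
        restart_nhrp >&2
        failures=0
    fi
    echo "$failures"
}

# Loop forever only when started with --run, so the functions can also be
# sourced and exercised on their own.
if [ "${1:-}" = "--run" ]; then
    failures=0
    while true; do
        failures="$(check_once "$failures")"
        sleep "$CHECK_INTERVAL"
    done
fi
```

Saved as, say, nhrp-watchdog.sh (a hypothetical name), it would be started with `bash nhrp-watchdog.sh --run`. This only papers over the problem; it does not explain why the spoke fails to re-register on its own.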
Here is an example hub and spoke config from my GNS3 virtual lab. Everything seems to reconnect fine for me a few minutes after rebooting the hub. I’m using a newer self build of VyOS 1.3.
Thank you for sharing your configs! Unfortunately, when my spokes try to reconnect, the traffic is only GRE, and no attempt is made to establish an IPsec tunnel. Does your configuration still reconnect with your iptables rules that block GRE traffic in place?