Here are a few notes about things that I’ve run into, which may help other people. I run an assortment of cloud labs and a couple of physical labs. The cloud labs are in Azure, AWS, GC, Dream, etc. The physical labs run a lot of VMware and KVM machines. All of these locations are linked together using VyOS tunnels - mostly OpenVPN, but a few WireGuard as well. A while back, every single one of these VyOS routers was running 1.1.8. Now a few of them are on 1.2.x and one or two are on 1.3-rolling. Here’s what I’ve seen.
DHCP SERVER - I had a lot of difficulty getting the DHCP server config to survive an update from 1.1.8 to 1.2.x, and usually had to redo the dhcp-server part of the config from scratch. I also found that it was better to do a fresh install of 1.2 than to try to migrate from 1.1.8 if a DHCP server was part of the config. Thankfully I don’t run a lot of DHCP servers on VyOS routers, so this wasn’t too much of a hassle.
OPENVPN - Sometime in the 1.2 generation, a small item sneaked into the OpenVPN setup in VyOS that gave me a LOT of grief until I figured it out and just made it part of my standard config. The little command “persistent-tunnel” needs to be present in any site-to-site OpenVPN tunnel or your life will turn into a living hell. It wasn’t necessary before, but it sure is critical now.
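For anyone wondering where that command goes, here is a minimal sketch of a site-to-site tunnel with it set. The interface name vtun1, the addresses, and the key file path are placeholders for illustration, not values from a real router; the last line is the only part that matters here.

```
# Placeholder values throughout; substitute your own tunnel details.
set interfaces openvpn vtun1 mode site-to-site
set interfaces openvpn vtun1 protocol udp
set interfaces openvpn vtun1 local-address 10.255.1.1
set interfaces openvpn vtun1 remote-address 10.255.1.2
set interfaces openvpn vtun1 remote-host 203.0.113.11
set interfaces openvpn vtun1 shared-secret-key-file /config/auth/site.key
# The setting that became critical in the 1.2 generation:
set interfaces openvpn vtun1 persistent-tunnel
```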
WIREGUARD - As you’d expect with WireGuard being a lot newer than OpenVPN, you really have to pay attention to the versions at each end of a tunnel. I maintain a 1.2.5 LTS VyOS router at the hub of my VPNs, and also a 1.3-rolling router. I try what I can on the rolling version, and if there are any reliability issues, I just move the tunnel over to 1.2.5 and try again later. One of the things I noticed: I had a 1.2.5 VyOS router in Azure, connected via WireGuard to the 1.2.5 router at the hub. Very nice and fast. When I moved it over to the 1.3-rolling VyOS router at the head-end (keeping 1.2.5 at the other end), I started to have MTU problems. Large streams of traffic (such as a big rsync) going through the VPN would freeze up and fail after a short run, although ping worked fine the whole time. This was with a vyos-1.3-rolling build from around Feb 2020. When I upgraded the rolling version to 1.3-rolling-May2020, WireGuard stopped working altogether. It looks like successive updates to the rolling version work worse and worse with older versions as the two ends of the tunnel diverge, until it stops working completely. With 1.2.5-lts on one end of a connection and 1.3-rolling-May2020 on the other, the tunnel comes up and handshakes work, but no traffic traverses the tunnel. Revert to 1.2.5 on both ends, and it communicates just fine. As a comparison, OpenVPN interoperated between a wide range of VyOS versions - including some older than 1.1.8 - as long as you remember that pesky “persistent-tunnel” on new versions.
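For the stall symptom only (large transfers freezing while ping still works), one thing that could be tried is lowering the MTU on the WireGuard interface. This is an untested sketch, assuming the stalls really are an MTU problem; it was not verified in the setup described above. The interface name wg01 and the value 1380 are placeholders. WireGuard adds 60 bytes of overhead per packet over IPv4 and 80 over IPv6, so the right value depends on the path MTU underneath the tunnel.

```
# Hypothetical interface name and a deliberately conservative MTU value.
set interfaces wireguard wg01 mtu 1380
```

This would not be expected to help with the later case where handshakes succeed but no traffic passes at all.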
OPENVPN Redux - Another little surprise cropped up on 1.3-rolling-May2020 that I hadn’t seen back in the February build. I run a lot of UDP tunnels, and my tunnels started to fail, with the syslog complaining that OpenVPN didn’t know whether to use IPv4 or IPv6. I did some research, and OpenVPN now expects “proto udp4” or “proto udp6” instead of “proto udp”. These options don’t yet exist in VyOS as far as I can tell, but I managed to get my tunnels working again by adding ‘openvpn-option “proto udp4”’ to my config for the tunnel interfaces, alongside the VyOS-standard ‘protocol udp’.
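Spelled out as config commands, the workaround looks roughly like this. The interface name vtun1 is a placeholder, and the option string is written exactly as quoted above; a tunnel running over IPv6 would presumably use “proto udp6” instead.

```
# vtun1 is a placeholder interface name.
# Keep the standard VyOS protocol setting...
set interfaces openvpn vtun1 protocol udp
# ...and pass the address-family-specific protocol straight through to OpenVPN.
set interfaces openvpn vtun1 openvpn-option "proto udp4"
```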
Hope this helps other people scratching their heads.