In case it helps anyone, I have two personal locations, and in each one I have a Protectli Vault 6-port router running VyOS 1.5-rolling-202406300022. They link the two locations together with an OpenVPN tunnel and run BGP over the tunnel to keep routes straight between the two locations. On one end the Protectli is a Core i5 with six gigabit ports, and on the other end it is a Core i7 with six 2.5 Gbit ports. I recently did an rsync backup of several gigs of data through the link and managed to push OpenVPN to a sustained 70 Mbit/s for around 45 minutes. I noticed that the OpenVPN process on both routers was hovering around 25-29% of the CPU. Both of these devices seem to run VyOS very well, and I'm not sure why they aren't on the HCL. I know Protectli used to sell devices years ago with VyOS 1.2.4 on them; not sure what happened there.
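For anyone who wants the shape of that setup, it boils down to a site-to-site OpenVPN interface with a BGP neighbor on the far end of the tunnel. A rough sketch in VyOS CLI terms, where all the addresses, AS numbers, and the key name are made-up placeholders rather than my real config:

    # Site-to-site OpenVPN tunnel (tap mode), placeholder addressing
    set interfaces openvpn vtun0 mode site-to-site
    set interfaces openvpn vtun0 device-type tap
    set interfaces openvpn vtun0 remote-host 203.0.113.2
    set interfaces openvpn vtun0 local-address 10.255.0.1
    set interfaces openvpn vtun0 remote-address 10.255.0.2
    set interfaces openvpn vtun0 shared-secret-key MY-OVPN-KEY
    # BGP session across the tunnel, placeholder AS numbers
    set protocols bgp system-as 65001
    set protocols bgp neighbor 10.255.0.2 remote-as 65002
    set protocols bgp neighbor 10.255.0.2 address-family ipv4-unicast

The shared-secret-key line assumes a key already loaded under the PKI subsystem; a TLS-based tunnel works just as well here.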
But just in case anyone was wondering if the new dailies are working well, this one sure is.
In case anyone's interested in what I can confirm is working well, I'm running the following on these VyOS/Protectli lab routers (a rough config sketch follows the list):
OpenVPN - described above.
OSPF - for routes around each of my lab locations and other links to cloud lab sites, etc
BGP - to send routes through the tunnel to both sides, and redistribute the OSPF routes
Firewall - zone-based for forwarding, regular (non-zone) rules for internal access
DHCP Server
LLDP
NTP Server
SNMP Service for MRTG monitoring
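A rough, partial sketch of how those pieces are wired up on one of the routers - every name, interface, and address below is an invented placeholder rather than my actual config, the LAN-TO-WAN ruleset itself (defined under firewall ipv4 name) and the DHCP server config are omitted for brevity:

    # OSPF on the internal side, redistributed into BGP for the far site
    set protocols ospf interface eth1 area 0
    set protocols bgp address-family ipv4-unicast redistribute ospf
    # Zone-based firewall for forwarded traffic (example zone/ruleset names)
    set firewall zone LAN member interface eth1
    set firewall zone WAN member interface eth0
    set firewall zone WAN from LAN firewall name LAN-TO-WAN
    # LLDP, NTP and SNMP services
    set service lldp interface all
    set service ntp allow-client address 192.168.10.0/24
    set service snmp community routers authorization ro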
All working well. A few months back I had a mid-2023 rolling release on these routers, and instead of OpenVPN I was running WireGuard. It was VERY unreliable and (which surprised me) very slow. I switched to OpenVPN and everything smoothed right out, and it was (I still can't believe it) faster. I run a few long-term WireGuard tunnels on 1.2.4 and 1.2.9-S1 and they work well, but these recent ones on the rolling 1.5 were very poor; I had to keep rebooting the routers to bring WireGuard up.
I checked my two Protectli routers, and it seems that, without any action on my part, the interface sections of the config have automatically included offload of gro, gso, sg, and tso. But this is at the Ethernet interface level; I don't have any offloads specifically called out in the OpenVPN section. I am using device-type tap, as it seems to help some of the routing protocols work better - OSPF at least. If I run sudo ps -ef | grep vpn I see my openvpn process running, but most of the settings are contained in vtun0.conf. Finding that file at /run/openvpn/vtun0.conf and looking at the contents, I see disable-dco, so it actually looks like I'm hitting those nice high speeds even with DCO disabled in OpenVPN. I'll see if I can get DCO enabled and report back on any impact to my speed.
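For anyone who wants to poke at the same things, the checks and the DCO change look roughly like this (eth0 and vtun0 are just my interface names; the last command is configure mode):

    # Kernel's view of the NIC offloads currently active
    sudo ethtool -k eth0 | grep -E 'generic-receive|generic-segmentation|scatter-gather|tcp-segmentation'
    # What VyOS itself has configured on the interface
    show configuration commands | grep 'ethernet eth0 offload'
    # Is the running OpenVPN instance started with DCO disabled?
    grep disable-dco /run/openvpn/vtun0.conf
    # Enable DCO on the tunnel (configure mode, then commit/save)
    set interfaces openvpn vtun0 offload dco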
Generally speaking, WireGuard should outperform OpenVPN whether you run single- or multi-core for both - if you observe the other way around, you probably have some other malfunction going on.
Also try enabling those offloading options for your interface one by one, because for some use cases those offloading options can have the opposite effect on performance.
In short:
Disable all offloading options.
Reboot.
Do a benchmark to establish a baseline.
Enable one of the offloading options.
Reboot.
Redo benchmark.
Disable all offloading options and enable another of the offloading options.
Reboot.
Redo benchmark.
Disable all… and so on.
So you have a baseline and then test one offloading option at a time to find out which of the offloading options have an effect and how much (and whether it's an increase or a decrease in performance).
If possible, please post back to this thread with your metrics for each of the offloading settings (that would be helpful for others with the same hardware, since which offloading options work - and which decrease performance - depends on the drivers and which NIC you've got).
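On VyOS the per-option toggling is just the offload nodes under the ethernet interface, so each round of the procedure above is something like this (eth0 standing in for whichever NIC is under test):

    # Configure mode: strip all offloads for the baseline run
    delete interfaces ethernet eth0 offload
    commit
    save
    # ...reboot, run the iperf baseline, then enable exactly one option:
    set interfaces ethernet eth0 offload gro
    commit
    save
    # ...reboot, re-run iperf, then swap gro for gso/sg/tso and repeat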
OK, that will take some time. I'll also try switching back to WireGuard now that I'm on a new daily version, and evaluate that.
I am a little skeptical, though, about how the offload functions at the Ethernet level will impact performance at the VPN level (WireGuard or OpenVPN). The Ethernet interfaces on these routers are capable of gigabit-and-higher speeds, and I commonly test them and they post speeds at their max without issue. The VPN protocols are running at around a twentieth of that speed. It's hard to understand how offloads at the much faster Ethernet level will affect that, but we'll see, I guess.
The easiest thing I could try was enabling and disabling OpenVPN DCO. Here is what I found:
(test iperf -c -r -t 20 -P 4)
DCO disabled:
iperf throughput 66.7 Mbit/s
CPU usage 37%
DCO enabled:
iperf throughput 62.7 Mbit/s
CPU usage 8%
So far it seems like OpenVPN DCO is hugely cutting down CPU usage but actually slowing things down a bit. For now it seems a better option to just work the CPU hard.
I disabled OpenVPN DCO to get back to “full speed”. My best tunnel performance was 72.7 Mbit/s with DCO disabled.
Here’s what I found:
I disabled all offloads on the Ethernet interfaces supporting the iperf test: the internal LAN interface, which picks up all the traffic from the iperf machines and puts it into the tunnel, and also the WAN interface that carries the tunnel traffic itself. Performance immediately dropped to 57.3 Mbit/s.
Enabling gro offload only, on both WAN and LAN, immediately brought the tunnel performance right back to around 70 Mbit/s.
Next I disabled all offloads and re-enabled only gso on both WAN and LAN. Performance was the worst yet, 54 Mbit/s.
Next I disabled all offloads and re-enabled only sg on both WAN and LAN. Performance was pretty good, at 67 Mbit/s.
Next I disabled all offloads and re-enabled only tso on both WAN and LAN. Performance was a bit better, at 69 Mbit/s.
There are so many permutations here, and it's probably a poor assumption that the WAN and LAN interfaces should be treated the same. The LAN interface is handling all of the TCP traffic from iperf, while the WAN interface is handling the UDP OpenVPN tunnel traffic, so the various offloads probably have much different impacts on each.
However, it seemed that gso might be detrimental, so I tried enabling the other three: gro, sg, and tso. OpenVPN tunnel performance went to 72.8 Mbit/s, so that combination might be the best for OpenVPN.
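In VyOS config terms, the combination that came out on top maps to something like this, repeated on both the WAN- and LAN-facing NICs (eth0 here is just a stand-in for whichever interface you're touching):

    # gro + sg + tso enabled, gso left off
    delete interfaces ethernet eth0 offload gso
    set interfaces ethernet eth0 offload gro
    set interfaces ethernet eth0 offload sg
    set interfaces ethernet eth0 offload tso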
Are these 2 boxes directly connected to each other on a desk? Or are you doing these tests over a WAN? If you’re doing them over a WAN, what are the speeds at both ends?
70 Mbps between an i5 and an i7 is pretty bad, even for OVPN. For reference, I've done these speeds recently (without DCO):
No, they're connected over a WAN - in two different countries, actually. This is a pretty good speed given other similar connections I've run between these locations using various protocols: IPsec, other OpenVPN implementations, WireGuard, etc.
Those are impressive speeds you've managed to hit. It really makes me scratch my head when everyone keeps saying OpenVPN is so clunky and slow. For me it is the best VPN tech I've ever seen in terms of overall simplicity of setup, reliability over time, interoperability between platforms, resistance to all kinds of MTU weirdness, and flexibility to get through NATs, bridges, and other odd setups. I've also carried multiple VLANs inside it, VLANs-in-GRE-in-OpenVPN, and on and on. Your results are encouraging.
So the limiting factor is likely the WAN throughput then (barring some really bad config somewhere). And that likely explains why you see worse performance with DCO and WireGuard, which should be able to pump out decent (>1Gbps) throughput on your hardware.
You're likely seeing the result of a more capable TCP stream running against a low-throughput pipe. TCP will try to ramp up its speed, then drop due to congestion, leading to exponential backoff until it hits that congestion again. The bigger the difference between the overlay and the underlay, the worse TCP will perform.
You can configure a shaper to clean up that traffic if you want to use DCO for the reduced CPU utilization. That should allow you to see the same (or greater) performance as your baseline.
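As a sketch of what I mean (the policy name, interface, and the 65mbit figure are placeholders - shape to whatever clean rate your UDP testing shows):

    # QoS shaper pinned just under the measured clean WAN rate
    set qos policy shaper TUNNEL-SHAPE bandwidth 65mbit
    set qos policy shaper TUNNEL-SHAPE default bandwidth 100%
    set qos policy shaper TUNNEL-SHAPE default queue-type fq-codel
    # Apply it egress on the interface feeding the tunnel
    set qos interface eth0 egress TUNNEL-SHAPE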
The easiest way to see what I'm talking about using iperf is to do UDP tests. Keep bumping up the test rate until you start seeing a large amount of loss; the highest consistently clean bandwidth without loss is what you can shape traffic to. You may even beat your 70 Mbps if that value is partially the result of TCP behavior.
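Something like this with plain iperf (the far-end address is whatever yours is; run iperf -s -u on the other side first):

    # Step the offered UDP rate up until loss shows up in the server report
    iperf -c <far-end> -u -b 50M -t 20
    iperf -c <far-end> -u -b 60M -t 20
    iperf -c <far-end> -u -b 70M -t 20
    # The highest rate that still reports ~0% loss is a sane shaping target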
OpenVPN is a bit clunky and slow, but it can still have its place. I only really use it when I need to VPN over something that blocks UDP. For reference, here are some more results using the hardware I mentioned earlier:
J4125:
OpenVPN (without DCO): 226Mbps
IPsec: 1.36Gbps
WireGuard: 1.57Gbps
i9-13900H:
OpenVPN (without DCO): 2.18Gbps
IPsec: 5.72Gbps
WireGuard: 6.55Gbps
OpenVPN is much slower than other options, but it can still saturate the WAN links for a lot of people with decent hardware.
Over standard broadband connections, though, I still find it great. The speeds you're hitting are way beyond regular WAN connection speeds in a lot of real-world branch offices, which still makes it very usable.
I run a “use if all else fails” OpenVPN instance on TCP 443 that I use to connect to my infrastructure from hotels that block UDP or non-standard ports, or office buildings that do the same thing. So useful. I've managed to get a tunnel running through some serious stateful-packet-inspection firewalls with this setup - and I'm sure even one or two SSL-inspection setups.
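For anyone curious, that fallback instance is nothing exotic - roughly this shape in VyOS terms, with the subnet and PKI certificate names being placeholders rather than my real values:

    # Remote-access OpenVPN server listening on TCP 443
    set interfaces openvpn vtun1 mode server
    set interfaces openvpn vtun1 protocol tcp-passive
    set interfaces openvpn vtun1 local-port 443
    set interfaces openvpn vtun1 server subnet 10.66.0.0/24
    set interfaces openvpn vtun1 tls ca-certificate OVPN-CA
    set interfaces openvpn vtun1 tls certificate OVPN-SERVER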
Yep, same! Still a great use case for OpenVPN. You can oftentimes do the same with UDP by changing the port to 53, 123, 4500, etc., but TCP 443 with OpenVPN has been the most consistent for me.