10-Gig to 100-Gig throughput VyOS routers, bare-metal vs SR-IOV on a Proxmox server - recommendations
I am an ISP, with two 10-Gig BGP connections (four if you count the IPv6 BGP sessions as well). I am going to be upgrading to 100-Gig BGP interfaces this year.
I am getting ready to forklift-replace a dozen-plus Proxmox servers with newer, faster servers with 100-Gig interfaces and the fastest Xeon CPUs and RAM possible.
At this time I am running 13 virtual VyOS routers on my Proxmox servers (BGP, OSPF, CGN-NAT …) and I need to go faster than 10-Gig (preferably able to sustain 25 to 40+ Gig of throughput on every VyOS router).
Which leads me to some questions when using 100-Gig interfaces:
What is the real-world sustainable throughput achievable on a virtual VyOS router on a high-end Proxmox server running no other VMs, both with and without SR-IOV?
What is the real-world sustainable throughput achievable on a VyOS router running bare metal with 100-Gig interfaces on a high-end server?
What hardware is recommended to get as close to 100-Gig as possible (or whatever is achievable), assuming the hardware budget is almost unlimited?
I look forward to your suggestions and replies - thank you
Not a direct answer to your question but:
I’d suggest it’s worth talking to VyOS Support about your requirements - they have the VyOS VPP addon that would help you go fast, fast, fast.
At the speed and scale you’re talking about, spend a little of that unlimited hardware budget on getting some good solid support/help behind you.
Higher speeds like 40-100G are a lot more difficult to achieve, especially if you're doing NAT or firewalling too. The biggest problem with software routing is that the kernel can only handle so many packets per second: 100G of large 9000-byte jumbo frames is only about 1.4 Mpps, which is not too difficult, but 100G of 64-byte packets is roughly 148 Mpps, which would be extremely difficult if not impossible with the standard Linux kernel. The solutions are to offload some of the work to the NIC and/or to bypass the slow path in the kernel via nftables flowtable offload, VPP, or some other similar tool.
VyOS has nftables flow offload available in the stable 1.4 release right now, and it does work well for improving performance on NAT and some firewall rules. However, I do not know if it works with the CGNAT feature on 1.5, and it has the caveat of not working well with asymmetric routing. For maximum performance the solution will be VPP. VyOS has a tech preview of a VPP addon that can be installed on 1.5 and can deliver significant throughput, but it is not production ready. Below are some links to benchmarks done by the VyOS maintainers showing VPP / flowtables / default performance, and some more info about how flowtables and VPP work.
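For reference, turning on the software flow offload in 1.4 is only a couple of commands. This is a minimal sketch from memory, assuming eth0/eth1 are your forwarding interfaces; verify the exact path with tab completion on your build, since the syntax can differ between releases:

```
# VyOS 1.4 configure mode (interface names are examples)
set firewall global-options flow-offload software interface eth0
set firewall global-options flow-offload software interface eth1
commit
```

Under the hood this just builds an nftables flowtable over those interfaces, so established connections skip most of the forward-path firewall/NAT processing after the first few packets.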
As for the setup, you will definitely want to do PCI passthrough or SR-IOV if you're not going bare metal; a rough sketch of the Proxmox SR-IOV side is below. VPP requires NICs with DPDK support, and I would highly recommend either Intel E810 or Mellanox ConnectX-5 or newer cards, as they work best with VPP and they have great offloads for good performance even without VPP/flowtables.
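If you go the VM route, the SR-IOV side on the Proxmox host looks roughly like this. A minimal sketch, assuming the IOMMU is already enabled, the physical function is enp65s0f0np0, the VF lands at PCI address 0000:41:00.1, and the VyOS guest is VM 101 (all of those names and addresses are made-up examples, not from your setup):

```
# create 4 virtual functions on the physical function (example interface name)
echo 4 > /sys/class/net/enp65s0f0np0/device/sriov_numvfs

# optionally fix the VF MAC and give the guest full control of the VF
ip link set enp65s0f0np0 vf 0 mac 02:11:22:33:44:55 trust on spoofchk off

# pass the VF (example PCI address) through to the VyOS VM (example VMID 101)
qm set 101 -hostpci0 0000:41:00.1,pcie=1   # pcie=1 needs the q35 machine type
```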
The real-world achievable sustainable throughput is a little more subjective to your environment; it will all come down to the traffic patterns of your network. Something else to consider: even if your normal workloads are fine, a high-PPS DDoS could easily overwhelm the router if there is no upstream filtering. That said, a bare-metal install vs a VM with a passed-through or SR-IOV NIC will perform about the same, assuming it is the only VM on the hypervisor, though bare metal is probably simplest for tuning. You would not want to run a VM without passthrough, because then both the hypervisor kernel and the VM have to process the traffic, which is inefficient and slow (yes, there are DPDK-offloaded bridges like OVS, but that is outside the scope here). When using VPP, CPU core speed is significantly more important than core count (sketched below); outside of that it will really depend on your traffic.
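To illustrate the core-speed vs core-count point: in plain upstream VPP the dataplane workers are pinned to dedicated cores in startup.conf, something like the fragment below. Core numbers, queue count and the PCI address are made-up examples, and the VyOS VPP addon wraps this in its own CLI, so treat it only as an illustration of the concept:

```
# /etc/vpp/startup.conf (fragment, illustrative values only)
cpu {
    main-core 1              # control-plane / main thread
    corelist-workers 2-5     # dedicated dataplane worker cores
}
dpdk {
    dev 0000:41:00.0 {
        num-rx-queues 4      # spread RX across the worker cores
    }
}
```

Each worker core polls its RX queues flat out, so a handful of very fast cores usually beats a large pile of slower ones.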
re: VPP, wow!!!
VPP is a new software concept for me. WOW!
After researching and studying VPP, I can see and understand how VPP can give a software router a huge boost in throughput, easily a factor of 2 to 4+.
I can see how VPP might become part of many future software products (hypervisors, software routers, really any software that needs high throughput when processing large repeating chunks of data).
VPP is getting close (or at least closer) to ASIC hardware throughput!
Now I want to test some VPP software routers and see which software/VPP routers are stable and what their real-world throughput is. (aka drill three holes in some boards …)
re: ConnectX
I am starting to look into which 100+ Gig network cards can run in a PCIe 5.0 bus/slot, such as the ConnectX-7.
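One thing I plan to check on the new boxes is the negotiated PCIe link, since a Gen5 x16 card in a Gen4 or bifurcated slot will quietly link up at a lower speed or width. Something like this on the host (the PCI address is just an example):

```
# compare the card's PCIe link capability with what was actually negotiated
lspci -vv -s 41:00.0 | grep -E 'LnkCap|LnkSta'
# a PCIe 5.0 x16 link should report Speed 32GT/s, Width x16
```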