VyOS in a datacenter environment

Hello everyone!

We are looking for a core/edge router for our data center infrastructure. We’ve checked the prices for branded routers from Cisco and Juniper (the most recommended), but they are over the moon. At the minute we are using the Mikrotik CCR1009 and CCR1036 series, but we seem to have some problems with BGP in RouterOS version 7. There are pros and cons to Mikrotik in data centers, and for that reason we would like to see whether VyOS will fit our needs.

Is anyone using VyOS in a DC environment who can share details or performance figures?

What is your experience with BGP and the firewall? (We need a bit of protection too, or maybe you would recommend something else as a firewall.)

At the minute we don’t have much traffic because we have just started our colocation journey, but we want a stable infrastructure first and then to start pushing traffic. In the future we are looking to provide colocation too, and for that reason we want to be sure that VyOS or another platform will cover us.

We just installed the latest version of VyOS on an HP DL20 Gen 9 for testing purposes and to see how easy or hard it is to configure. We are open to any advice on which bare-metal hardware would be good for VyOS.

Any idea when the DPDK feature will be released?

Thanks,
Alex

Since VyOS is a router with firewall capabilities rather than a firewall with routing capabilities, its default is to allow all traffic (this can be changed, but leaks might still remain).

So from that perspective, you might need a “proper” firewall such as OPNsense in addition to VyOS. If you have a large wallet and high demands on NGFW features such as SSL termination and IPS functionality at high performance, Palo Alto Networks is often considered best in class.

A more complex setup would be to keep using your Mikrotiks (since they have hardware offloading) but offload the BGP work to a VyOS instance acting as a route reflector. The BGP process then runs on VyOS, but the actual packet flows (the routed traffic) never touch VyOS; from the Mikrotik perspective, it just handles static routes.

But up to 10 Gbps I think VyOS will work just fine. It will also work at 100 Gbps and above with proper NICs and a fast enough CPU, but there the gap to hardware platforms (which use, for example, Broadcom chips as ASIC/FPGA to do the heavy lifting of pushing packets in the dataplane) becomes larger, mainly in latency and pps (packets per second). The current work on VPP (DPDK) will hopefully fix most of that. You can enable VPP today, but consider it beta or similar: VyOS Project July 2023 Update

Overall, there is currently an issue with high commit/boot times when you have more than a few hundred static routes and/or firewall rules: the commit/boot times increase dramatically, to several minutes or more. Hopefully this will get fixed in the future (the reason is massive overhead when the config is processed line by line instead of batching each config section): T5388 Something is fishy with commit and boot times when more than a few hundred static routes are being used

From a performance point of view, the firewall code was recently refactored (or rather the frontend towards nftables, which is used as the firewall engine), and hopefully software offloading, aka software fastpath, in the form of flowtable will be included soon to improve performance when the firewall is enabled: Software fastpath with nftables flowtable | firewalld

You can of course add this manually, since the backend Linux environment is available, and this can also be scripted with pre/postconfig scripts along with pre/post hooks that run when commits are made.
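As a rough, untested sketch of what adding a flowtable by hand could look like, the script below loads a separate nftables table from a boot script such as /config/scripts/vyos-postconfig-bootup.script. The table name manual_fastpath, the flowtable name ft and the interfaces eth0/eth1 are made up for illustration, and how this interacts with the tables VyOS itself manages (or whether it survives a commit) is something you would have to verify on your own system.

```shell
#!/bin/sh
# Hypothetical example: names and interfaces below are assumptions, adjust to your setup.
# Loads a standalone table so the VyOS-generated ruleset is not edited directly.
nft -f - <<'EOF'
table inet manual_fastpath {
    flowtable ft {
        # Flows added to this flowtable bypass the normal forwarding path
        hook ingress priority 0; devices = { eth0, eth1 };
    }
    chain forward {
        type filter hook forward priority 0; policy accept;
        # Add TCP and UDP flows to the flowtable once conntrack knows about them
        meta l4proto { tcp, udp } flow add @ft
    }
}
EOF
```

Running "nft list flowtables" afterwards should show whether the flowtable was accepted.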

When it comes to performance, there are a few tweakable options to test for moving packets (not all of them may be good in your case, depending on which NICs are used). Not all config lines below are performance related, but they may be worth considering whether you want to tweak them or not. For example, you can leave the conntrack timeouts at their defaults, but then the kernel defaults are used, which for established TCP traffic means the conntrack entry remains for 5 days (432000 seconds) even if not a single packet has passed through that flow:

set firewall global-options all-ping 'enable'
set firewall global-options broadcast-ping 'disable'
set firewall global-options ip-src-route 'disable'
set firewall global-options ipv6-receive-redirects 'disable'
set firewall global-options ipv6-src-route 'disable'
set firewall global-options log-martians 'enable'
set firewall global-options receive-redirects 'disable'
set firewall global-options resolver-cache
set firewall global-options resolver-interval '60'
set firewall global-options send-redirects 'disable'
set firewall global-options source-validation 'strict'
set firewall global-options syn-cookies 'enable'
set firewall global-options twa-hazards-protection 'disable'


set interfaces ethernet ethX offload gro
set interfaces ethernet ethX offload gso
set interfaces ethernet ethX offload lro
set interfaces ethernet ethX offload rfs
set interfaces ethernet ethX offload rps
set interfaces ethernet ethX offload sg
set interfaces ethernet ethX offload tso
set interfaces ethernet ethX ring-buffer rx '4096'
set interfaces ethernet ethX ring-buffer tx '4096'

set system conntrack expect-table-size '10485760'
set system conntrack hash-size '10485760'
set system conntrack log icmp new
set system conntrack log other new
set system conntrack log tcp new
set system conntrack log udp new
set system conntrack table-size '10485760'
set system conntrack timeout icmp '10'
set system conntrack timeout other '600'
set system conntrack timeout tcp close '10'
set system conntrack timeout tcp close-wait '30'
set system conntrack timeout tcp established '600'
set system conntrack timeout tcp fin-wait '30'
set system conntrack timeout tcp last-ack '30'
set system conntrack timeout tcp syn-recv '30'
set system conntrack timeout tcp syn-sent '30'
set system conntrack timeout tcp time-wait '30'
set system conntrack timeout udp other '600'
set system conntrack timeout udp stream '600'

set system ip arp table-size '32768'
set system ip disable-directed-broadcast
set system ip multipath layer4-hashing
# Enable below to disable IPv6 forwarding:
#set system ipv6 disable-forwarding
set system ipv6 multipath layer4-hashing
set system ipv6 neighbor table-size '32768'

set system option ctrl-alt-delete 'reboot'
set system option keyboard-layout 'se-latin1'
set system option performance 'throughput'
set system option reboot-on-panic
set system option startup-beep
set system option time-format '24-hour'
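To judge whether a conntrack table-size like the one above is anywhere near needed, it can help to compare the number of tracked connections with the configured maximum. This is a minimal sketch that reads the standard nf_conntrack counters under /proc; the helper name usage_pct is just for illustration.

```shell
#!/bin/sh
# usage_pct COUNT MAX: integer percentage of conntrack entries in use
usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

count_file=/proc/sys/net/netfilter/nf_conntrack_count
max_file=/proc/sys/net/netfilter/nf_conntrack_max

# The files only exist once the nf_conntrack module is loaded
if [ -r "$count_file" ] && [ -r "$max_file" ]; then
    count=$(cat "$count_file")
    max=$(cat "$max_file")
    echo "conntrack: $count of $max entries in use ($(usage_pct "$count" "$max")%)"
else
    echo "nf_conntrack is not loaded; nothing to report"
fi
```

If the percentage stays low under normal load, a smaller table-size (and hash-size) would probably do as well.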

And if you want to enable BBR as the congestion control algorithm: https://blog.apnic.net/2017/05/09/bbr-new-kid-tcp-block/

set system sysctl parameter net.core.default_qdisc value 'fq'
set system sysctl parameter net.ipv4.tcp_congestion_control value 'bbr'

along with adding this to /config/scripts/vyos-preconfig-bootup.script:

/sbin/modprobe tcp_bbr
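After a reboot, a quick check like the following sketch should tell you whether the module was loaded and the sysctl took effect. It only reads the standard /proc entries for TCP congestion control; the helper name has_algo is made up for this example.

```shell
#!/bin/sh
# has_algo NAME "LIST": succeed if NAME is a word in the space-separated LIST
has_algo() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

available=$(cat /proc/sys/net/ipv4/tcp_available_congestion_control 2>/dev/null)
current=$(cat /proc/sys/net/ipv4/tcp_congestion_control 2>/dev/null)

if has_algo bbr "$available"; then
    echo "bbr is available; current algorithm: $current"
else
    echo "bbr is not available (is tcp_bbr loaded?); current algorithm: $current"
fi
```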