VyOS - Proxmox VM maximum router throughput - which is faster: (1 socket & 12 cores) or (2 sockets & 6 cores)?

VyOS - Proxmox VM maximum router throughput - which is faster: (1 socket & 12 cores), (2 sockets & 6 cores), or (4 sockets & 3 cores)?

I am running some VyOS routers under Proxmox 8.2.4 on a physical machine with 48 x Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz across 4 sockets
(note - four physical 12-core CPUs).

BIOS HT - disabled
VyOS set for performance
40-Gig NICs (normal average sustained throughput is 3 to 7 Gbit/s)
I have VyOS BGP routers, VyOS OSPF routers, and VyOS NAT routers
VyOS 1.5-rolling-202407300021

My question (below):
Which CPU setting (in Proxmox) will/should provide the fastest VyOS VM router?

  • (1 socket & 12 cores)
  • (2 sockets & 6 cores)
  • (4 sockets & 3 cores)
    – All three configurations above have NUMA enabled.
    – I am not using SR-IOV or special dedicated network card configurations in Proxmox.
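
For reference, the three layouts above map to Proxmox VM settings roughly like the sketch below (the VM ID 100 is a placeholder; the same values can also be set in the GUI under the VM's processor options):

```
# Sketch only - VM ID 100 is a placeholder, adjust to your own VM
# 1 socket x 12 cores
qm set 100 --sockets 1 --cores 12 --numa 1

# 2 sockets x 6 cores
qm set 100 --sockets 2 --cores 6 --numa 1

# 4 sockets x 3 cores
qm set 100 --sockets 4 --cores 3 --numa 1
```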

I’ve searched and Googled but was unable to find a clear answer, so I thought I would ask here.

Thanks for any answers. Like everybody else, I’m trying to find and understand what works the fastest.

North Idaho Tom Jones

You need to check the documentation for the motherboard.
Look at the block diagram or NUMA topology and at what speed the sockets communicate with each other.

In general I have only seen problems with two NUMA nodes: access to a remote bank of RAM can be up to two times slower than access to the local one.
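
On the Proxmox host you can see the NUMA layout and the relative node distances with standard tools, for example (numactl may need to be installed first):

```
# Show sockets, cores and NUMA nodes as the host sees them
lscpu | grep -i numa

# Show per-node memory and the node distance matrix
# (larger numbers = slower access to that remote node's RAM)
numactl --hardware
```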

Here is a thread on this topic:

Also relevant:

https://publicdoc.rbbn.com/display/SBXDOC111/KVM+Performance+Tuning

In short I think the answer is “it depends”.

Exposing NUMA to the guest helps the guest OS schedule its internal threads so they avoid being moved between CPUs on the host (which would result in cache misses), something the guest is otherwise not aware of. At the same time, I have seen some posts saying that a single socket has less overhead (I don't know if that's true; I haven't dug through the KVM code of the Linux kernel :slight_smile: )

What might gain you more performance is to use CPU affinity and thereby “lock” or “pin” specific VMs to specific sockets and cores on the host.

You could also do both: pin your high-performance guests to dedicated cores and let all other VMs share a different set of cores not used by the high-performance guests (see the sketch below).
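
If I remember right, newer Proxmox releases expose this directly as the VM “affinity” option; a rough sketch (VM IDs and core ranges are placeholders for your own layout):

```
# Pin a high-performance VyOS router to host cores 0-11
# (the cores of one socket on this box, HT disabled)
qm set 101 --affinity 0-11

# Let less critical VMs share a different set of cores
qm set 102 --affinity 12-23
```

The point of the exercise is to keep a pinned VM's cores, its memory (NUMA node) and ideally its NIC on the same socket.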

You should also check which CPU the PCIe lanes used by the NIC are attached to, but that may not make a huge difference if you are not using SR-IOV, since the para-virtualised NIC threads may end up anywhere. Why not use SR-IOV (or full PCI card passthrough) if you care about speed?
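
Checking which NUMA node the NIC's PCIe slot hangs off is quick, for example (the interface name and PCI address are placeholders):

```
# NUMA node of the PCIe device behind a given interface
# (-1 means the platform did not report one)
cat /sys/class/net/eth0/device/numa_node

# Or look it up by PCI address; lspci -vv prints a "NUMA node:" line
lspci -s 0000:3b:00.0 -vv | grep -i numa
```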

If I am correct, I believe that Proxmox VMs that use SR-IOV and/or PCI passthrough cannot be live-migrated to another Proxmox hypervisor.
Also, I am pretty weak on configuring SR-IOV & PCI passthrough (no excuses - I just need to learn both and get them under my belt - I am planning on getting some new servers, and one of those will be a lab unit for testing and learning…)
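Also , I am pretty weak on configuring SR-IOV

For the learning list, the basic SR-IOV steps on a Proxmox host are roughly the sketch below (interface name, PCI addresses and VM ID are placeholders, and the IOMMU has to be enabled in the BIOS and on the kernel command line first):

```
# Enable the IOMMU (Intel example) on the kernel command line, then reboot:
#   GRUB_CMDLINE_LINUX_DEFAULT="... intel_iommu=on iommu=pt"

# Create 4 virtual functions on the physical 40G port (name is a placeholder)
echo 4 > /sys/class/net/enp59s0f0/device/sriov_numvfs

# Find the PCI addresses of the new VFs
lspci | grep -i "virtual function"

# Pass one VF through to the VyOS VM (VM ID and address are placeholders)
qm set 100 --hostpci0 0000:3b:02.0
```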

What probably breaks is that the physical MAC address of the NIC on host1 will be different from the one on host2.

However, many NICs support setting a custom MAC, so a possible workaround would be to have Proxmox set the MAC on the destination to your custom MAC and also send out GARPs, so that devices connected to the Proxmox host pick up where this MAC is now physically located.
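
As a sketch of that idea, on the destination host the steps would look something like this (interface, MAC and IP are placeholders; arping here is the iputils version):

```
# Give the NIC/VF on the new host the custom MAC the VM expects
ip link set dev enp59s0f0 address 02:00:00:aa:bb:cc

# Send gratuitous ARPs so switches and neighbours learn the new location
arping -U -c 3 -I enp59s0f0 192.0.2.1
```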

Also note that you will probably need to enable “fast MAC move” (or whatever it might be called) in your L2 layer, if it has such a feature; otherwise the switch layer will blackhole a MAC address that moves too often between interfaces.

