Hello!
I’m using VyOS as a virtual switch for a 10GbE connection between my workstation and NAS. It runs under ESXi alongside OPNsense, which handles general-purpose routing.
With a direct connection between the NAS and the WS I get 1 GB/s when copying files over SMB. With VyOS in between, however, I only reach ~580 MB/s; before tuning it was ~450 MB/s (ethtool -K eth1 rx on tx on sg on tso on gso on gro on lro on ntuple on rxhash on; ethtool -G eth1 tx 4096 rx 4096).
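For readability, here are the tuning commands from the post laid out as a script, with a read-back step added. This is only a sketch: the `run` wrapper and the `DRY_RUN` switch are hypothetical helpers that are not part of ethtool, and `eth1` is the interface name from the post. By default it only prints the commands.

```shell
IFACE=${IFACE:-eth1}      # interface name taken from the post
DRY_RUN=${DRY_RUN:-1}     # hypothetical switch: 1 prints the commands, 0 runs them

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

tune() {
    # Offload features and ring sizes exactly as quoted in the post.
    run ethtool -K "$IFACE" rx on tx on sg on tso on gso on gro on lro on ntuple on rxhash on
    run ethtool -G "$IFACE" tx 4096 rx 4096
    # Read the settings back to see which ones the driver actually accepted.
    run ethtool -k "$IFACE"
    run ethtool -g "$IFACE"
}

tune
```

The lowercase `-k` and `-g` options only display the current state. Features listed as `[fixed]` in the `-k` output cannot be changed, so a requested `on` may not have taken effect.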
Both the NAS and the WS use a Mellanox ConnectX-3. VyOS has an X520 via PCIe passthrough and a VMXNET3 adapter for the 1GbE Intel LAN link shared with OPNsense. All VyOS interfaces (eth0-2) are members of a single bridge. I have MTU 9000 set on the 10GbE links on each side.
The ESXi host is not high-spec (G4400, 8GB DDR4), but the maximum CPU usage by VyOS during copying is around 6%. VyOS is assigned 2 vCPUs and 2GB RAM.
Is there anything I can do to improve performance? I currently don’t know where the bottleneck in the 10GbE switching is. I know I can always upgrade the hardware, but even the current limited resources do not seem to be fully used.
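To put the quoted copy rates in terms of the 10GbE link, here is a small worked conversion. It assumes decimal units (1 GB/s taken as 1000 MB/s) and counts payload only, ignoring SMB and Ethernet overhead; `to_gbit` is a hypothetical helper name.

```shell
# Convert a copy rate in MB/s to Gbit/s (payload only, decimal units).
to_gbit() {
    awk -v mb="$1" 'BEGIN { printf "%.2f", mb * 8 / 1000 }'
}

# Rates from the post: direct link, via VyOS after tuning, via VyOS before tuning.
for rate in 1000 580 450; do
    echo "$rate MB/s = $(to_gbit "$rate") Gbit/s, $((rate * 100 / 1000))% of the direct rate"
done
```

So the tuned path through VyOS carries roughly 4.64 Gbit/s of payload, a little under 60% of what the direct link achieves.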
Most of the time you will be I/O bound, not CPU bound, so just read this file through normal Perl I/O and process it in a single thread. Unless you can show that your I/O delivers data faster than a single CPU can process it, don’t waste your time on anything more. Anyway, you should ask: why on Earth is this in one huge file? Why on Earth don’t they split it in a reasonable way when they generate it? That would be an order of magnitude more worthwhile. Then you can put the pieces on separate I/O channels and use more CPUs (if you don’t use some sort of RAID 0 or NAS or …).
Measure, don’t assume. Don’t forget to flush caches before each test. Remember that sequential I/O is an order of magnitude faster than random I/O.
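A minimal sketch of such a measurement: read a file once, front to back, and report the rate. The function name `seq_read_rate` is hypothetical, and the timing assumes GNU `date` (for `%N`). On Linux the page cache can be flushed beforehand as root with `sync; echo 3 > /proc/sys/vm/drop_caches`; without that, a repeat run mostly measures RAM.

```shell
# Hypothetical helper: time one sequential read of a file and print MB/s.
# Assumes GNU date (nanosecond %N) and that caches were flushed beforehand.
seq_read_rate() {
    file=$1
    bytes=$(wc -c < "$file")          # file size in bytes
    start=$(date +%s%N)               # nanoseconds since the epoch
    cat "$file" > /dev/null           # single sequential pass over the file
    end=$(date +%s%N)
    awk -v b="$bytes" -v ns="$((end - start))" \
        'BEGIN { if (ns < 1) ns = 1; printf "%.1f MB/s\n", (b / 1e6) / (ns / 1e9) }'
}
```

Usage would be something like `seq_read_rate /path/to/bigfile`, run several times with a cache flush in between and on a file large enough that the timing is not dominated by startup cost.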
Please don’t discount large files. Why not large files? It’s 2021; I transfer 80GB or 300GB files often enough.
It depends on how you use it. It shouldn’t matter whether it’s 5MB, 300GB or more, so this is not helpful.
As a technical exercise, I am going to try to see what I can get out of the CX-3 cards over a 40Gb switch, routed.