Performance issue with a 10GE interface / i40e driver

Hey,

I was testing the performance of VyOS on an MS-A2 (Minisforum, 16-core Ryzen CPU, two 10GE ports on an Intel X710), with both 10GE interfaces connected to an Ixia IxNetwork traffic generator. I’m just running plain IPv4 routing between those two interfaces.

VyOS is installed directly on the MS-A2 (bare metal, no hypervisor underneath).

I increased the rx/tx rings to their maximum of 8160.

No loss at 1.5 Gbps.

But at 2 Gbps I lose 5% of the traffic. This is confirmed on VyOS by an increasing rx_missed_errors counter.

I would have expected a larger throughput.

I was running VyOS 2026.01.08-0022-rolling.

Any idea why I hit this limit?

Thanks, Damien.

Can you share a bit of config? Have you enabled any offloading?

Hi,

Sure. I left the offload flags configured by default on each of the interfaces (eth2 and eth3 are the 2 x 10GE interfaces used for this test).

Configuration extract:
 
ethernet eth2 {
    hw-id "38:05:25:33:8d:57"
    offload {
        gro
        gso
        sg
        tso
    }
    ring-buffer {
        rx "8160"
        tx "8160"
    }
    vif 407 {
        address "192.168.10.6/29"
    }
}
ethernet eth3 {
    hw-id "38:05:25:33:8d:58"
    offload {
        gro
        gso
        sg
        tso
    }
    ring-buffer {
        rx "8160"
        tx "8159"
    }
    vif 403 {
        address "192.168.20.5/29"
    }
    vif 408 {
        address "192.168.30.6/29"
    }
}
 
 
 
vyos@mini2:~$ show interfaces ethernet eth2 physical
Settings for eth2:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseX/Full
                                10000baseSR/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  1000baseX/Full
                                10000baseSR/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 10000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes
Ring parameters for eth2:
Pre-set maximums:
RX:                     8160
RX Mini:                n/a
RX Jumbo:               n/a
TX:                     8160
TX push buff len:       n/a
Current hardware settings:
RX:                     8160
RX Mini:                n/a
RX Jumbo:               n/a
TX:                     8160
RX Buf Len:             n/a
CQE Size:               n/a
TX Push:                off
RX Push:                off
TX push buff len:       n/a
TCP data split:         n/a
driver: i40e
version: 6.6.117-vyos
firmware-version: 9.20 0x8000d8c5 1.3602.0
expansion-rom-version:
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
 
vyos@mini2:~$ show interfaces ethernet eth2 statistics | grep -v "0$"
NIC statistics:
     tx_packets: 1639231435
     rx_bytes: 1004563042088
     rx_missed_errors: 163523313
     rx_unicast: 2625239732
     tx_multicast: 28
     rx_broadcast: 11
     tx_broadcast: 24
     rx_cache_reuse: 2461973232
     rx_cache_alloc: 538528
     tx-0.packets: 6
     tx-0.bytes: 252
     tx-1.packets: 34
     tx-1.bytes: 1672
     tx-2.packets: 18
     tx-2.bytes: 756
     tx-3.packets: 3
     tx-3.bytes: 262
     tx-4.packets: 3
     tx-4.bytes: 126
     tx-5.packets: 12
     tx-5.bytes: 1032
     tx-6.packets: 1639231333
     tx-6.bytes: 654153867568
     rx-10.packets: 2
     rx-10.bytes: 196
     tx-16.packets: 4
     tx-17.packets: 4
     tx-17.bytes: 384
     tx-18.bytes: 1132
     tx-19.packets: 1
     tx-22.packets: 5
     tx-22.bytes: 526
     rx-22.packets: 102
     rx-26.packets: 2461716326
     rx-26.bytes: 1004563035772
     tx-28.packets: 2
     port.rx_bytes: 1082152524035
     port.tx_bytes: 667268181415
     port.rx_unicast: 2625239732
     port.rx_multicast: 86674
     port.tx_multicast: 5248
     port.rx_broadcast: 11
     port.tx_broadcast: 24
     port.mac_local_faults: 1
     port.mac_remote_faults: 2
     port.rx_size_64: 5302
     port.rx_size_127: 1437612807
     port.rx_size_255: 110591705
     port.rx_size_1522: 552897053
     port.tx_size_64: 122
     port.tx_size_127: 851034915
     port.tx_size_255: 65474884
     port.tx_size_511: 328918986
     port.tx_size_1023: 65522593
     port.tx_size_1522: 328285152
     port.fdir_atr_status: 1
     port.fdir_sb_status: 1
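
For a rough sense of scale, the counters above can be turned into a drop percentage. This is only a sketch in Python. It assumes that rx_unicast also counts the frames later reported as rx_missed_errors (rx-26.packets plus rx_missed_errors comes out almost equal to rx_unicast, which suggests so). The counters are cumulative, so the result covers every run since they were last reset, not only the 2 Gbps one.

```python
# Counter values copied from the eth2 statistics output above.
rx_unicast = 2_625_239_732        # unicast frames seen on eth2
rx_missed_errors = 163_523_313    # frames the NIC could not hand to the host

def missed_percent(missed: int, received: int) -> float:
    """Share of received frames reported as missed, in percent.

    Assumption: `received` already includes the missed frames.
    """
    return 100.0 * missed / received

print(f"{missed_percent(rx_missed_errors, rx_unicast):.2f}% of frames missed")
```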

Thanks, Damien.

It would be helpful to know more details about the test as well. When you’re forwarding through the kernel slow path, pps is going to be your biggest limiting factor. If you’re flooding it with a bunch of small packets that can’t be aggregated by any offloads, your CPU is going to have a bad day even at a low volume.
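
To put numbers on the pps point, here is a small Python sketch that converts a bit rate and an average packet size into packets per second. The packet sizes are only illustrative picks, and the calculation ignores Ethernet preamble and inter-frame gap, so real line-rate figures are a bit lower.

```python
def packets_per_second(gbps: float, avg_packet_bytes: float) -> float:
    """Packets per second needed to carry `gbps` at a given average packet size.

    Simplification: counts only the packet bytes, no preamble or inter-frame gap.
    """
    return gbps * 1e9 / (avg_packet_bytes * 8)

# Illustrative sizes: small, medium, and near-MTU packets at the same bit rate.
for size in (64, 400, 1500):
    print(f"{size:>5} B at 2 Gbps: {packets_per_second(2, size):>12,.0f} pps")
```

The same 2 Gbps costs many times more packets per second with small packets than with large ones, and in the slow path the CPU pays per packet, not per byte.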

Also which CPU is it?

“AMD Ryzen” doesn’t really mean anything, since there are both slow and very fast Ryzens out there.

For a software-based router (and firewall) I would prefer fewer but faster cores rather than the opposite, for a given budget and TDP.

You should also consider disabling HT/SMT so that each core (and therefore each thread) gets to use all the available cache, which is otherwise shared with the other thread running on the same core.

That is because for a single TCP/UDP session all the packets flow through a single CPU core, so that the packets don’t get out of order when leaving the box. So single-core performance is more critical than multi-core performance.

Which also boils down to how you ran the tests. Saying “I used Ixia IxNetwork” doesn’t really mean anything. Did you use a single-stream or a multi-stream test? Did you use various packet sizes? Were you using IMIX or not, etc.?

Also, please stop using old releases when reporting issues.

There is at least a rolling release from 2026-04-30 you should try to begin with (they are built every night unless a smoketest stops one from being published):

Or try the latest stream edition over at (version 2026.3 as of writing):

While at it, if you have more than a few cores and the NICs being used are supported, you should try enabling VPP in VyOS to squeeze all the performance there is out of your hardware:

Hi,

The processor is an AMD Ryzen 9 9955HX (16 cores) with HT currently activated.

The iMix is configured with 5 different packet sizes and it produces around 770 000 PPS at 1.5 Gbps (average packet size 400 bytes).

The traffic we are routing consists of a single IPsec/UDP session (bidirectional traffic), so yes, poor traffic diversity. From what I see in top, the traffic seems to be handled by one core only:

top - 13:32:25 up 5 days, 27 min,  1 user,  load average: 1.03, 1.02, 0.88
Tasks: 356 total,   3 running, 352 sleeping,   0 stopped,   1 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
/../
%Cpu9  :  0.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,100.0 si,  0.0 st
/.../

Only Cpu9 is saturated. The other 31 threads are 100% idle.
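
As a rough back-of-the-envelope check, here is a Python sketch of how much time a single saturated core has per packet at the rate quoted above. It assumes all 770 000 pps really land on that one core. How many CPU cycles that corresponds to depends on the clock speed, which is not measured here.

```python
def per_packet_budget_us(pps: float, cores: int = 1) -> float:
    """Average time available per packet, in microseconds, when `cores` cores share `pps`."""
    return cores * 1e6 / pps

# 770 000 pps is the figure quoted for the iMix at 1.5 Gbps.
print(f"{per_packet_budget_us(770_000):.2f} us per packet on one core")
```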

VPP may be interesting but I read it does not support my Intel X710 adapters.

So I guess there is no way for me to use all my CPUs with such limited traffic diversity?

Thanks, Damien.

One possible workaround to test VPP would perhaps be to run VyOS as a VM guest with virtio NICs?

Also, I’m not sure whether the NIC list over at VPP Dataplane Requirements — VyOS rolling release (current) is up to date.

You can always try the X710 anyway using set vpp settings allow-unsupported-nics

According to this Primary Characteristics of FD.io VPP — Vector Packet Processor 0.1 documentation the i40e driver is supported by VPP and that is what the Intel X710 NICs are using.

Other than that, a single TCP/UDP session, i.e. one 5-tuple (protocol + srcip + dstip + srcport + dstport), will always end up in the same thread to avoid out-of-order packets.
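
The 5-tuple behaviour can be illustrated with a toy model in Python. CRC32 is only a stand-in here for whatever hash the NIC actually uses for receive-side scaling. The queue count, addresses, and ports are all made up for the example.

```python
import zlib

NUM_QUEUES = 32  # hypothetical: one RX queue per CPU thread

def rx_queue(proto, src_ip, dst_ip, src_port, dst_port, queues=NUM_QUEUES):
    """Map a 5-tuple to a queue index. CRC32 is a stand-in for the NIC's real hash."""
    key = f"{proto}|{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % queues

# One tunnel: every packet carries the same 5-tuple, so every packet hits the same queue.
tunnel = ("udp", "192.0.2.1", "198.51.100.1", 4500, 4500)
single = {rx_queue(*tunnel) for _ in range(1000)}

# Many flows: varying the source port spreads the packets over several queues.
many = {rx_queue("udp", "192.0.2.1", "198.51.100.1", port, 4500)
        for port in range(10000, 10256)}

print(len(single), "queue(s) used by the single flow")
print(len(many), "queue(s) used by 256 flows")
```

This matches what the eth2 statistics show: almost all received packets sit on rx-26, so only one core does the work no matter how many cores are idle.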

So do you see any improvement if you disable HT/SMT without changing anything else?
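
Before and after changing the BIOS setting, the SMT state can be checked from Linux sysfs. This is a small Python sketch, assuming the kernel exposes the usual `/sys/devices/system/cpu/smt/active` and `topology/thread_siblings_list` files. The functions return None when a file is missing. cpu9 is used only because it is the saturated CPU in the top output.

```python
from pathlib import Path

SMT_ACTIVE = Path("/sys/devices/system/cpu/smt/active")
CPU_ROOT = Path("/sys/devices/system/cpu")

def smt_active(path: Path = SMT_ACTIVE):
    """Return True/False for the SMT state, or None if the file is not available."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None

def thread_siblings(cpu: int, root: Path = CPU_ROOT):
    """Return the list of hardware threads sharing a core with `cpu`, or None."""
    try:
        return (root / f"cpu{cpu}" / "topology" / "thread_siblings_list").read_text().strip()
    except OSError:
        return None

print("SMT active:", smt_active())
print("Threads sharing a core with cpu9:", thread_siblings(9))
```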