VyOS + APU2 not hitting 1 Gb/s

The PC Engines APU2 is an SBC with an AMD SoC running at a max of 1 GHz and drawing about 12 watts. The boards use Intel NICs, but it wasn’t a great surprise that the thing needed some tuning to get the full 1 Gb/s out of it.

https://www.amd.com/en/system/files?file=2017-06/g-series-soc-product-brief.pdf

Super nice boards;

Oh Ok that’s why! :wink:
I don’t know how much is an APU2 but a Mini PC is about 200 USD

It’s about the same :rofl:

I am looking at getting a Protectli box, but believe this is “just” a rebadged qotom machine.

Feel free to ping me if you have any recommendations for a good low power, silent box. Ideally I would want something that can take a 10gb/s sfp module as that is what comes into my apartment…

I also have an APU2 board. Not sure if you already have the following settings (apply them to all NICs), but with them I reach somewhere around 920 Mb/s.

ethtool -G ethX tx 4096 rx 4096
ethtool -K ethX tx on sg on tso on gso on gro on lro off

  1. The first line sets the ring buffers to the maximum.
  2. The second line turns on offloading (LRO stays off).

I put the commands in /config/scripts/vyos-postconfig-bootup.script, which is run at boot.
And I don’t have mitigations=off set.
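
For anyone who wants to apply this to every NIC in one go, here is a minimal sketch of what could go into the bootup script. The function names, the DRY_RUN switch and the interface names passed in are my own assumptions, not anything VyOS provides. Only the two ethtool commands come from the post above.

```shell
#!/bin/sh
# Sketch: apply the ring-buffer and offload settings from the post to each NIC.
# DRY_RUN defaults to 1 here (hypothetical switch): commands are only printed.
# Set DRY_RUN=0 to actually run them.

tune_nic() {
    # $1 = interface name, e.g. eth0
    for cmd in \
        "ethtool -G $1 tx 4096 rx 4096" \
        "ethtool -K $1 tx on sg on tso on gso on gro on lro off"
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            # Word splitting of $cmd is intentional: it becomes ethtool + arguments.
            $cmd || echo "warning: failed: $cmd" >&2
        fi
    done
}

tune_all() {
    # Arguments: the interface names to tune.
    for nic in "$@"; do
        tune_nic "$nic"
    done
}
```

Calling `tune_all eth0 eth1 eth2` prints the six commands. With `DRY_RUN=0` exported first, it runs them instead. Some drivers reject individual settings, so a warning on one line does not stop the rest.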


thanks a lot @dannyw

will give those a go first thing.

Enabled mitigations again and implemented the changes @dannyw shared. Getting over 900 Mb/s.

I will stick with that configuration.

@Dmitry, I can update the wiki with such information if you think this would be generally useful.

Intel NIC offloading may not be a good idea on a router, or so I have read somewhere.
Ring buffers - yes, rx 256 is too small resulting in lots of FIFO errors and dropped packets, rx 4096 works better.
I’m looking at this thread because I’m trying to get close to 1 Gb/s throughput on APU4 boards used as BGP routers for a small local ISP I’m running here. Unfortunately the APU4 has i211 NICs with only 2 queues. I recently had a ~200k pps incoming DDoS which maxed out 2 of the 4 CPU cores, and I’m looking to improve this. Or do I really need to get faster (but more power-hungry) server boards for this use?
I just need pure routing, no NAT etc. I wish there was an easy to setup router distro with something faster like DPDK but it seems it’s not there yet.
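
To see whether a small RX ring is actually costing packets, the kernel's per-interface counters in sysfs are a quick check (`ethtool -g ethX` shows the current and maximum ring sizes). A rough sketch follows. The function name and the optional second argument are hypothetical, and which counters a given driver increments varies, so treat a zero as "not reported" rather than proof.

```shell
#!/bin/sh
# Sketch: print the RX drop-related counters for one interface.

nic_drops() {
    # $1 = interface name
    # $2 = optional sysfs net root (assumption: defaults to /sys/class/net)
    stats="${2:-/sys/class/net}/$1/statistics"
    for counter in rx_fifo_errors rx_dropped rx_missed_errors; do
        # Skip counters that are not present or not readable.
        if [ -r "$stats/$counter" ]; then
            printf '%s %s=%s\n' "$1" "$counter" "$(cat "$stats/$counter")"
        fi
    done
}
```

Running `nic_drops eth0` before and after a throughput test and comparing the numbers shows whether the ring size change made a difference.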

@marekm72

hey there, are you able to add more information about why offloading isn’t a good idea or have a link that explains more?

looks like your topic might deserve its own thread to get a better and more specific response.

Only LRO offloading shouldn’t be used on a router according to this:

Do Not Use LRO When Routing Packets

Due to a known general compatibility issue with LRO and routing, do not use LRO
when routing packets.

On my apu4 board I was able to reduce CPU usage during download from 100% to <50% just by using

sudo ethtool -K ethX tx on sg on tso on gso on gro on lro off

LRO is not used as you can see.
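
To double-check what the NIC ended up with after that command, the output of `ethtool -k ethX` (lowercase k) can be filtered for a single feature. This is only a sketch. The helper name is made up, and it assumes the usual "feature-name: on/off" line format that ethtool prints.

```shell
#!/bin/sh
# Sketch: extract the state of one offload feature from `ethtool -k` output.

offload_state() {
    # $1 = feature name as printed by `ethtool -k`, e.g. large-receive-offload
    # Reads the `ethtool -k` output on stdin and prints "on" or "off".
    awk -F': *' -v feat="$1" '
        { name = $1; sub(/^[ \t]+/, "", name) }        # strip leading indentation
        name == feat { split($2, a, " "); print a[1] } # drop suffixes like [fixed]
    '
}
```

For example, `ethtool -k eth0 | offload_state large-receive-offload` should print `off` on a router set up as above.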

I still don’t get full 1Gbps, but getting closer

 Download:   863.05 Mbps (data used: 1.0 GB)
 Upload:   938.31 Mbps (data used: 644.7 MB)

The problem with the APU4 is that RX interrupts don’t seem to be distributed evenly across the 2 available queues: almost all RX traffic lands on eth3-rx-0. TX looks fine:

            CPU0       CPU1       CPU2       CPU3
  56:          0          0          0     883331   PCI-MSI 2097153-edge      eth3-rx-0
  57:       1407          0          0          0   PCI-MSI 2097154-edge      eth3-rx-1
  58:          0     959698          0          0   PCI-MSI 2097155-edge      eth3-tx-0
  59:          0          0     946593          0   PCI-MSI 2097156-edge      eth3-tx-1
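
A listing like the one above is easier to compare between runs when it is condensed to one line per queue. Here is a small sketch that does that for text in /proc/interrupts format. The function name is hypothetical, and it assumes the queue name is the last field on each line, as in the listing.

```shell
#!/bin/sh
# Sketch: per-queue interrupt totals and busiest CPU for one interface.

irq_summary() {
    # $1 = interface name, e.g. eth3
    # Reads /proc/interrupts-formatted text on stdin.
    awk -v ifc="$1" '
        # Header row: one column per CPU; remember the names.
        NR == 1 { ncpu = NF; for (i = 1; i <= NF; i++) cpu[i] = $i; next }
        # Queue rows: last field starts with "<interface>-".
        index($NF, ifc "-") == 1 {
            total = 0; top = 1
            for (i = 1; i <= ncpu; i++) {
                total += $(i + 1)
                if ($(i + 1) + 0 > $(top + 1) + 0) top = i
            }
            printf "%s total=%d busiest=%s\n", $NF, total, cpu[top]
        }'
}
```

Usage would be `irq_summary eth3 < /proc/interrupts`. A large gap between the rx-0 and rx-1 totals, as in the listing, means almost all received traffic is hashing to a single queue.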

Hey @phillipmcmahon,

I’ve been struggling with a big performance regression when updating from 1.2 to 1.3.

With the mitigations off, could you please share your output for:
grep . /sys/devices/system/cpu/vulnerabilities/*
I would like to compare it with my device to understand if the mitigations are actually off.
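
For a quicker comparison, a sketch like the following narrows that output down. The function names and optional path arguments are hypothetical. It relies on the kernel reporting each entry as "Not affected", "Vulnerable" or "Mitigation: ...", so any file still containing "Mitigation" means that mitigation is active.

```shell
#!/bin/sh
# Sketch: check whether CPU mitigations are really disabled.

active_mitigations() {
    # $1 = optional directory (assumption: defaults to the sysfs path above)
    # Prints the files whose status still reports an active mitigation.
    dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    grep -l 'Mitigation' "$dir"/* 2>/dev/null
}

booted_with_mitigations_off() {
    # $1 = optional kernel command line file (defaults to /proc/cmdline)
    # Succeeds if mitigations=off was passed to the running kernel.
    grep -qw 'mitigations=off' "${1:-/proc/cmdline}"
}
```

If `booted_with_mitigations_off` succeeds but `active_mitigations` still lists files, the boot parameter was accepted but something is still mitigated, which would be worth posting here.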

Thank you

Hi there, I no longer have vyos running on my apu, I moved to a protectli device running esxi so I could consolidate a bunch of stuff.

I will resurrect the unit this week, validate it’s doing 1 Gb/s and share my config. I do think 1 Gb/s is the absolute peak of what it can do, so it’s not a great surprise to see throughput fluctuate.

Thanks.

Well, in my case with Vyos 1.2 I could do 1Gbit/s easily, with having 2 cores at around 90% usage and the other 2 at around 50%.
Now with Vyos 1.3, it’s maxing out at 700Mbit/s and all cores maxed out.
It’s almost a 50% regression, which doesn’t make sense from a version upgrade.

Fast forward a year or two, and it was time to go for 10 Gbps interfaces. Also, the APU has some stability issues under high load: I’ve seen unexplained random reboots. So it will have to find some other light use (one with a small hardware hack works as a GPS-synced NTP server, with very good accuracy thanks to hardware timestamping by the Intel NIC). My new router platform of choice is Supermicro X9 series boards with Xeon E3-1220v2 CPUs and Chelsio T440-CR (quad SFP+) PCIe cards. It seems a reasonable compromise between electricity costs, hardware costs and performance. The hardware is about 10 years old (needs a BIOS upgrade to fix a year 2021 bug), but routing about 1 Gbps of traffic over two 10 Gb interfaces is hardly visible as CPU load (about 2-3%).

On my APU2E4 (with 3x i210AT NICs) running 1.3.1-S1 I easily get 1 Gbps. Make sure you have an APU2E4 with i210AT NICs, not i211AT (like an APU2E5 or similar, see https://www.pcengines.ch/apu2e4.htm). The i211AT only has 2 queues per port, while the i210AT has 4 queues per port. With 4 transmit/receive queues per port, interrupts are distributed better over all 4 CPU cores.
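
If you are not sure which NIC a board has, counting the queues the kernel exposes for an interface is one way to tell. A minimal sketch, with a made-up function name and an optional sysfs root argument that is my own addition:

```shell
#!/bin/sh
# Sketch: count the RX or TX queues the kernel exposes for an interface.

queue_count() {
    # $1 = interface name
    # $2 = rx or tx
    # $3 = optional sysfs net root (assumption: defaults to /sys/class/net)
    n=0
    for q in "${3:-/sys/class/net}/$1/queues/$2"-*; do
        # An unmatched glob stays literal, so only count real directories.
        if [ -d "$q" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Going by the queue counts above, `queue_count eth0 rx` would be expected to print 4 on an i210AT port and 2 on an i211AT port, assuming the driver enables all available queues.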

https://www.speedtest.net/result/c/7c8abcde-a80c-457f-af44-2295e147fb7c