Quick and Dirty Benchmark of Cores Vs MHz

Hello,

I’m a lover of zone based firewalls, and I like to be very granular for security reasons, hence lots of rules…

We have a client on 1.3.5 with a bunch of zones and rules; the config file is about 34,000 lines, mostly firewall rules. NAT is handled by another router.

Their hardware is an old Dell R220 with a quad core Xeon @ 3.x GHz… The time it takes to go from “Mounting VyOS Config…done.” to an actual login prompt and a usable system is about 30 minutes.

Out of curiosity I wanted to see if it was a CPU thing, and if so whether it was core or MHz limited.

My work laptop has a Ryzen 5, so I spun up a virtual guest on it and loaded the client's config… this is a rough benchmark because I was doing other things on the laptop, but I think the results are still revealing.

@6 cores, 4GB RAM the load time was 22 minutes
@4 cores, 4GB RAM the load time was 23 minutes
@2 cores, 4GB RAM the load time was 25 minutes

so it all seems the same no matter the core count…

Then I dropped the CPU frequency from the average of 3.5GHz to 1GHz

@2 cores (1GHz), 4GB RAM the load time was 88 minutes and it brought me to the login prompt but it also said “migrate rl-system firewall failed”

So… the cores didn’t seem to make a real difference for loading all those firewall rules. Going from 1GHz to 3.5GHz made it load about 3.5x faster (88 minutes vs 25 minutes)… so for loading firewall rules, clock frequency seems to be king.
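
A quick way to sanity-check that scaling, using only the figures from this post (the `ratio` helper name is just for illustration):

```shell
# ratio A B: print A/B to two decimals (awk does the floating-point math).
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# 2 cores: 25 min at ~3.5GHz vs 88 min at 1GHz (numbers from the runs above).
echo "clock ratio:     $(ratio 3.5 1)x"
echo "load-time ratio: $(ratio 88 25)x"
```

The two ratios come out nearly identical (3.50x vs 3.52x), which is what you would expect if the load is bound by the clock speed of a single core.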

Anyway, I’m getting over a cold and am putzing about, so I ran this quick test (well, it took 4 hours to complete) and wanted to put it out there for all those with the same unanswered question. I ordered some used higher frequency CPUs and we’ll see what they do :slight_smile:

Next to see what 1.4 does… and thank you again for keeping zone based firewalls, if it had been dropped my confidence in my systems would have dropped.

There is this task reported over at vyos.dev:

Something is fishy with commit and boot times when more than a few hundred static routes are being used
https://vyos.dev/T5388

The sad part is that if you manually load the generated FRR config (once the device has booted) into FRR, it goes in in less than a second; the same goes for routes and nftables rules injected directly into the kernel from a bash script. So there is room for improvement.

Hopefully this can get fixed in future…

Yes, hopefully, but until then telling the staff that it takes 30 minutes for the router to restart makes them both fear it and respect the hands that touch it… or so I like to imagine :slight_smile:

In reality I fear having to reboot it during the day but that only ever happens during a power outage. VyOS has been stable for me. I would like to add more rules/etc though so we’ll see what a faster CPU affords me.

Unfortunately this is a dealbreaker for us to have VyOS anywhere close to production.

Waiting an hour or two for a router to complete its boot is not an option.

@Apachez Just read your thorough testing and tweaking in T5388 (Something is fishy with commit and boot times when more than a few hundred static routes are being used). Nice work.

It seems to me that most CPU cycles are spent in userspace, e.g. the almost linear increase in commit time when adding zones to the zone-based firewall.

I agree with you that this is worrisome.

If possible please try this as root (sudo bash) after all 34000 or so rules have been loaded:

# Export current ruleset.
sudo nft -s list ruleset > /config/ruleset.txt

# Add "flush ruleset" at top of the dump otherwise import will fail.
sudo sed -i '1s/^/flush ruleset\n\n/' /config/ruleset.txt

# Modify the exported ruleset if you want to (optional):
# edit /config/ruleset.txt with your editor of choice before importing.

# Import the modified ruleset.
sudo nft -o -f /config/ruleset.txt

That is export, add “flush ruleset” and then import it again.

Please do this using “time” to get both kernel and userland time spent.

It would be interesting to see how much time it takes for the Linux kernel itself to process the nftables rules from a backup vs how long VyOS configd spends doing the same thing.
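
A slightly more defensive take on the steps above, as a sketch: the sed one-liner prepends "flush ruleset" every time it is run, so this hypothetical `prepend_flush` helper (my naming, not part of VyOS or nftables) only adds the line when it is missing. The nft calls are left as comments since they need root and replace the live ruleset.

```shell
# prepend_flush FILE: make sure FILE starts with "flush ruleset".
# Hypothetical helper; safe to run more than once on the same dump.
prepend_flush() {
  if [ "$(head -n 1 "$1")" != "flush ruleset" ]; then
    { printf 'flush ruleset\n\n'; cat "$1"; } > "$1.tmp" && mv "$1.tmp" "$1"
  fi
}

# Intended use, as root on the router:
#   nft -s list ruleset > /config/ruleset.txt
#   prepend_flush /config/ruleset.txt
#   time nft -f /config/ruleset.txt
```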

For the import I had to omit -o as it said it wasn’t an option. This is the command I used

time nft -f /config/ruleset.txt
#and 
time sudo nft -f /config/ruleset.txt

The import was instantaneous so I feel I’m missing something, apologies.

This is VyOS 1.3.5, the upgrade to 1.4 failed over multiple attempts so that is on the back burner for now.

I missed that you were using 1.3.x.

The -o option is to “optimize” when importing rulesets, without the -o the ruleset will be exactly the same as when you first dumped it with “-s list ruleset”.

What did “time” say regarding kernel and user space time to perform the import operation?

real     0m0.077s
user     0m0.054s
sys      0m0.022s

Thanks!

Yeah, that's quite a difference: about 25 minutes when run through vyos-configd vs. 0.077 seconds when run with the userland nft tool.

That is a difference of roughly 1,948,000%, or about 19,500x, in performance…
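
For anyone who wants to check that figure, here is the arithmetic as a small sketch. It assumes the 25 minute load from the 2-core run and the 0.077s `real` time reported above; `speedup` is just an illustrative name.

```shell
# speedup SLOW_SECONDS FAST_SECONDS: print how many times faster the fast run is,
# both as a factor and as a percentage.
speedup() {
  awk -v slow="$1" -v fast="$2" 'BEGIN {
    f = slow / fast
    printf "%.0fx (%.0f%%)\n", f, f * 100
  }'
}

# 25 minutes expressed in seconds vs the 0.077s nft import.
speedup $((25 * 60)) 0.077
```

This prints a factor of about 19,481x, in line with the number quoted above.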

Got around to testing a new processor with the same VyOS config.

So, in summary from the previous posts: a timer was started when the VyOS boot reached “Mounting VyOS config…” (where it loads what I assume is the multitude of firewall rules I have) and stopped at the login prompt.

What I found is that:
- it’s a single core process; adding or removing cores doesn’t affect the load times significantly
- on the same CPU, an increase or a decrease in MHz appears to affect the load times linearly

Now for an update to the quick and dirty benchmarks.

The original system: Dell R220 with an E3-1270 v3 @ 3.50GHz with VyOS running bare metal. It takes about 30 minutes to load. I think this is with Turboboost disabled, so it stays at 3.50GHz and doesn’t boost to 3.90GHz.

The 2nd test system was a laptop with an AMD Ryzen 5 PRO 5650U with VyOS running as a virtual guest. Running at an average of 3.5GHz, load time was about 23 minutes; running at 1GHz it was 88 minutes.

The 3rd test system is a Dell R250 with an E-2356G @ 3.20GHz with VyOS running as a virtual guest. With Turboboost enabled, the core in use stayed at 4.8GHz and load time was about 15 minutes (with either 1 or 4 cores). With Turboboost disabled, so it stayed at 3.20GHz, load time was about 20 minutes.

So going from an older 3.5GHz Xeon to a newer 3.2GHz Xeon took off 33% of the load time, with Turboboost enabled on the newer Xeon @4.8GHz it cut the load time in half (as compared to the older Xeon).
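
The percentages above can be reproduced from the raw load times with a few lines of shell (a sketch; `reduction_pct` is a made-up helper name):

```shell
# reduction_pct OLD NEW: percent of load time saved going from OLD to NEW minutes.
reduction_pct() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

# Load times in minutes, taken from the three test systems above.
echo "R220 3.5GHz -> R250 3.2GHz: $(reduction_pct 30 20)% less load time"
echo "R220 3.5GHz -> R250 4.8GHz: $(reduction_pct 30 15)% less load time"
```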

A new thing I learned is to look at the OS DBPM settings on Dell servers. On the Dell R250, with the profile set to OS DBPM in the BIOS, it disables Turboboost. I always thought OS DBPM let the OS handle everything, but apparently in this case it disables Turboboost.

Anyway, just an update to show what load times look like across some CPU architectures. Thank you.

Just an update, the same test was run on bare metal for the Dell R250 with Turboboost enabled and it took about 15 minutes to load, so it seems virtualization didn’t add or subtract from the load times, which is nice.

Another update to satisfy my curiosity. I use Edgerouters often for small stuff and since VyOS and EdgeOS share the same roots I thought I’d try the same config (some stuff had to be modified) on an Edgerouter Infinity (the fastest current model with 8 SFP+ ports).

Loading the config took 1 hour 35 minutes vs the 15 minutes VyOS took on the new Dell entry level server (or 30 minutes for the older Dell entry level server).

For traffic processing I’m sure the Edgerouter hardware (with offload) has higher max throughput than the entry level Dells, but since the Dell with VyOS can handle 10Gbit+ easily I’m more concerned with load times for large firewall configs, and here the Dell/VyOS combo is more than 6x faster (15 minutes vs 95 minutes…).

EdgeOS was forked way back, at about the same time VyOS was created (give or take; perhaps VyOS is a year or two younger), with the main difference being that it added support for MIPS as the management CPU (if I recall correctly).

But the long commit/boot time even with EdgeOS shows that there is a flaw in the design of how the config is handled since the Vyatta days. The difference of 1h35m vs 15min is probably due to the performance difference between the MIPS CPU that sits in the Edgerouter box vs the 4.8GHz E-2356G Intel CPU you had in your Dell R250.

What's interesting is that if you, for example, boot VyOS with no firewall entries, the boot takes less than a minute, and you can then add thousands of firewall rules in less than one second through “nft -o -f”, whereas letting vyos-configd do the same takes half an hour or more.
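
If anyone wants to reproduce that comparison without a real config, a synthetic ruleset can be generated and loaded straight into nftables. This is only a sketch under assumptions: the table name `bench`, the chain name `test` and the rule pattern are made up, the chain has no hook so it never sees traffic, and it should only be run on a lab box.

```shell
# gen_ruleset N: print an nftables table with N dummy rules to stdout.
# Table/chain names and the rule pattern are hypothetical, for benchmarking only.
gen_ruleset() {
  awk -v n="$1" 'BEGIN {
    print "table inet bench {"
    print "  chain test {"
    for (i = 0; i < n; i++)
      printf "    ip saddr 10.%d.%d.%d tcp dport %d accept\n",
        int(i / 65536) % 256, int(i / 256) % 256, i % 256, 1024 + i % 1000
    print "  }"
    print "}"
  }'
}

# Intended use, as root on a test system:
#   gen_ruleset 34000 > /tmp/bench.nft
#   time nft -f /tmp/bench.nft        # add -o where the installed nft supports it
#   nft delete table inet bench       # clean up afterwards
```

Because the file does not start with "flush ruleset", loading it adds the `bench` table next to whatever ruleset is already in place instead of replacing it.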