Hey everyone!
Basically as the title says. I'd like to reproduce the builds which are publicly available, but running the full smoke tests still fails for me, as described in ⚓ T7215 VRF smoke tests failing. Since I assume rolling ISOs wouldn't be published if their smoke tests failed, I suspect I'm either running too many of them, or something else fishy is going on on my side. As I was unable to figure this out myself, I'd love to get some input from the maintainers on the topic
Best regards 
Not a maintainer, but I have bumped into similar issues before with VRF smoketests. A lot of the tests around interfaces, routing and VRFs are fussy about their environment.
If you haven't spotted it, there's a QEMU setup script that describes the testing target used by the automated builds: vyos-build/scripts/check-qemu-install at current · vyos/vyos-build · GitHub. It has quite a bit of hardware attached, and various smoketests will choke if (e.g.) you don't have enough interfaces, or they're not preconfigured as expected. I do all my testing under Proxmox, so I had to recreate some of that manually.
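For anyone recreating that environment by hand, here's a rough idea of what a comparable QEMU invocation might look like - the NIC count and device models here are assumptions on my part, the script itself is the authoritative list:

```
# Hypothetical sketch of a manual QEMU invocation with several NICs,
# loosely mirroring what check-qemu-install attaches. Read the script
# for the real counts, models and wiring.
qemu-system-x86_64 \
  -machine q35,accel=kvm -m 4G -smp 2 \
  -cdrom vyos-rolling-latest.iso \
  -netdev user,id=n0 -device virtio-net-pci,netdev=n0 \
  -netdev socket,id=n1,listen=:10001 -device virtio-net-pci,netdev=n1 \
  -netdev socket,id=n2,listen=:10002 -device virtio-net-pci,netdev=n2 \
  -netdev socket,id=n3,listen=:10003 -device virtio-net-pci,netdev=n3
```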
In your ticket, it looks like you already have a VRF called “red” before running the smoketests, which will break them, as the first step of each VRF smoketest is to try to create it again with specific parameters. It also cannot clean up after itself, because the VRF was assigned to interfaces it didn't add. This might be something you created manually, or the leftovers of a previous busted smoketest run - you want to run smoketests from a point as close to the stock, default configuration as possible. Underlying state must match the config as well for consistent results.
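A quick way to check for and clear out a stray VRF before a run (op mode plus a configure session; “red” and eth1 are just examples matching the ticket):

```
# Op mode: list any existing VRFs before kicking off the suite
show vrf

# If one is lingering, detach it from interfaces first, then delete it
configure
delete interfaces ethernet eth1 vrf   # only if an interface references it
delete vrf name red
commit
exit
```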
If it’s not something you’ve done and you’re pretty sure you’re starting clean, have a look back further in the smoketest logs to see if anything else choked and left invalid configuration for the VRF section to complain about.
As far as I know, the automated smoketests split interface (test_interfaces_*.py) and non-interface (everything else, including VRFs) tests into separate VM boot-ups. The runs are otherwise more or less the same as a manual run.
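If you want to mirror that split in a manual run, a rough sketch - assuming the test scripts are installed under the usual path on the image and are directly executable (if not, invoke them via python3):

```
cd /usr/libexec/vyos/tests/smoke/cli

# Boot 1: interface tests only
for t in test_interfaces_*.py; do sudo ./"$t" || echo "FAILED: $t"; done

# Boot 2 (fresh VM): everything else, VRFs included
for t in test_*.py; do
    case "$t" in test_interfaces_*) continue ;; esac
    sudo ./"$t" || echo "FAILED: $t"
done
```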
… or the leftovers of a previous busted smoketest run …
That is something I actually failed to consider, at least for tests run well before the ones that failed (the block right before went through fine). Otherwise I can 100% assure you the environment is squeaky clean. My pipeline starts by freshly pulling the vyos-build repo. Then, inside the build Docker container, I build a default image (no customization) and start the tests via sudo make tests. This way (I thought) I can make sure I didn't miss important config or anything else relevant to get them to pass.
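For context, a condensed sketch of that pipeline - the flavour name and make targets are the ones I believe the current branch documents, so double-check against the vyos-build README for your checkout:

```
git clone -b current https://github.com/vyos/vyos-build
cd vyos-build
docker run --rm -it --privileged -v "$(pwd)":/vyos -w /vyos \
    vyos/vyos-build:current bash

# inside the container:
sudo make clean
sudo ./build-vyos-image generic   # default image, no customization
sudo make test                    # boots the ISO under QEMU, runs smoketests
```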
Honestly, if there is a straightforward way to skip only the VRF tests, that's also something we could consider, as we have zero reliance on them for now.
Besides this, how exactly are you running the tests? I like the idea of leveraging Proxmox to do some of the heavy lifting, without relying on nested virtualisation. And did you actually manage to make all of them pass? How do you start them? Also using make, or just replicating what make triggers with the test target?
Another thing: the nightly build happens at night in some EU timezone.
After it has been built, there can be commits merged that weren't properly tested before being acknowledged, which makes the following nightly builds fail their smoketests.
There is currently such a situation, where the latest successful nightly build is from the 4th of April while today is the 7th of April:
Back in the day you could see all the failed builds (and, by clicking on them, get a log of what was failing) over at Workflow runs · vyos/vyos-nightly-build · GitHub, but that no longer seems to reflect the current status.
Note that the changelog at the above page is also broken: it will claim there are no changes between two successful builds and then suddenly list all changes since the 1st of January 2025 or something like that, making it difficult to see what the changes really are between two successful builds.
There is a pending task about this over at:
(couldn't locate it right now, but I'm pretty sure I have seen it over at https://vyos.dev)
After it has been built, there can be commits merged that weren't properly tested before being acknowledged, which makes the following nightly builds fail their smoketests.
You mean there are no tags you could refer to when building locally, which would guarantee you're actually building the same thing they did?
No idea, but by default the Docker build goes for current, and current right now at 17:12 is not necessarily the same current that existed last night at 02:00 (or whenever the nightly builds start their compilation).
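That said, one way to approximate “the same current they built from” is to check out the last commit before the nightly's start time - the timestamp below is just an example:

```
git clone -b current https://github.com/vyos/vyos-build
cd vyos-build
git checkout "$(git rev-list -n 1 --before='2025-04-07T02:00:00Z' current)"
# ...repeat for vyos-1x and friends if you build those from source too.
```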
Besides this, how are you exactly running the tests?
Manual VM spin-up, building a manual config to exercise modified code, manually running individual smoketests to validate and test those changes before submitting a PR. I don't tend to run them all on a regular basis - a full VyOS source build takes about 40 minutes, and a full test run a couple of hours on my weedy little homelab.
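For reference, running a single smoketest by hand looks roughly like this on an installed image - the path is where the tests normally land, so treat it as a sketch:

```
# Run just the VRF smoketest on the target VM
sudo /usr/libexec/vyos/tests/smoke/cli/test_vrf.py

# They're plain unittest scripts, so the usual flags work, e.g. verbose
# (use python3 explicitly if the script isn't marked executable):
sudo python3 /usr/libexec/vyos/tests/smoke/cli/test_vrf.py -v
```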
I don’t think my way will help you much for automated build verification.
As Apachez mentioned (and I hadn't thought of earlier), you'll also need to make sure your entire build environment and source trees match whatever was used for the -current rolling images at a given point in time. There are a few source repos involved, not to mention the upstream Debian packages that feed into the build process. It's entirely possible that the VRF smoketests are just broken right now in -current.
The way the automated job runners appear to disable some tests for specific runs is to simply delete the unwanted test scripts before starting the process, which shouldn't be an issue if you're slapping down a fresh VM for each run. Anything that excludes tests from discovery will work to disable them. You might even be able to give Nose some arguments for finer-grained control (have a look at the commands run by make test).
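For example, a minimal sketch of that delete-and-run approach on a throwaway VM - the smoketest path is standard on installed images, but the runner name below is from memory, so verify it exists on your build before relying on it:

```
# Drop the VRF smoketest so discovery never sees it
sudo rm -f /usr/libexec/vyos/tests/smoke/cli/test_vrf.py

# Then run the remaining suite via the wrapper shipped on the image
sudo /usr/bin/vyos-smoketest
```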
Wow, you go through all these steps for a homelab? Why did you decide against simply spinning up a dedicated VM when necessary?
You might even be able to give Nose some arguments for finer-grained control (have a look at the commands run by make test).
I did exactly that. I had a look at the GitHub automation configs for the vyos-nightly-build repo as well. For now I simply replicate the steps performed there in a simple shell script executing the same make targets within the dedicated VM. The VM is only spawned once for all targets, but they won't fail on VRF as that's skipped now. I haven't run them again since last week, so I can't tell if it got fixed. I'll try another run tonight.
Wow, you go through all these steps for a homelab?
Not really - I can, but as mentioned I don't - it takes ages. Just a bunch of VMs that get current rolling images, inject all the smoketest and dev packages, and selectively run tests against changed code. Pulling a new rolling ISO from GitHub releases (if required) and pushing it to test VMs along with whatever I want to test takes less than a minute with Ansible's help. Testing takes a while - but that's manually exercising configs and the underlying state to make sure everything works as expected, not waiting for automated tests to work their way through.
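For anyone wanting to script the ISO pull without Ansible, a hedged sketch using the GitHub releases API - the repo and asset naming here are assumptions, adjust to wherever the rolling images are published these days:

```
# Grab the newest rolling ISO from the latest release
curl -s https://api.github.com/repos/vyos/vyos-nightly-build/releases/latest \
  | grep -o 'https://[^"]*\.iso' \
  | head -n 1 \
  | xargs -r curl -LO
```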
After submitting a PR, the VyOS project GitHub automations run through the full suite of smoke and config load tests anyway. Me running tests is so I don't look too stupid when my pull requests break.
Pulling a new rolling ISO from github releases (if required) …
If only that were enough for us
It would be great if we could just reuse the published ISOs and inject some required packages, but AFAIK that's not a thing (yet). Never mind, this was not the sole reason we switched to building ourselves
Me running tests is so I don't look too stupid when my pull requests break.
Yeah, no judgement on why you run tests. I guess no one making use of them is doing that just for fun 