VyOS build failing smoketests

Hello everyone,

I’m facing issues with the smoketests again. After fixing an issue with the test-no-interfaces target, I’m left with either broken tests or broken login-related code. You can see what I get in the most recent file uploaded to T7215 (VRF smoke tests failing), around line 2737, test file test_system_login.py. Is this a known issue that’s being worked on, or is it an issue on my side? A quick search didn’t reveal any obvious tickets, so it might be the latter.

I’m still in the dark about which test targets are actually used for the nightly builds, since whatever they test must pass. So maybe I’m wrong to use test-no-interfaces in the first place. The target frequently breaking due to missing parameter adaptations (such as the fix I pushed) hints that it is not in regular use.

What I’m trying to accomplish is building custom ISOs for internal use, based on the state of the nightlies (i.e. passing all relevant smoke tests). I’m resorting to building from scratch, as I had issues with reusing the nightly ISO.

Best regards

I do not see any issues with login:

DEBUG - Running Testcase: /usr/libexec/vyos/tests/smoke/cli/test_system_login.py
DEBUG - test_add_linux_system_user (__main__.TestSystemLogin.test_add_linux_system_user) ... ok
DEBUG - test_delete_current_user (__main__.TestSystemLogin.test_delete_current_user) ... ok
DEBUG - test_pam_nologin (__main__.TestSystemLogin.test_pam_nologin) ...
DEBUG - test_radius_kernel_features (__main__.TestSystemLogin.test_radius_kernel_features) ... ok
DEBUG - test_system_login_max_login_session (__main__.TestSystemLogin.test_system_login_max_login_session) ... ok
DEBUG - test_system_login_otp (__main__.TestSystemLogin.test_system_login_otp) ... ok
DEBUG - test_system_login_radius_ipv4 (__main__.TestSystemLogin.test_system_login_radius_ipv4) ... ok
DEBUG - test_system_login_radius_ipv6 (__main__.TestSystemLogin.test_system_login_radius_ipv6) ... ok
DEBUG - test_system_login_tacacs (__main__.TestSystemLogin.test_system_login_tacacs) ... ok
DEBUG - test_system_login_user (__main__.TestSystemLogin.test_system_login_user) ... ok
DEBUG - test_system_login_weak_password_warning (__main__.TestSystemLogin.test_system_login_weak_password_warning) ... ok
DEBUG - test_system_user_ssh_key (__main__.TestSystemLogin.test_system_user_ssh_key) ... ok
DEBUG - 
DEBUG - ----------------------------------------------------------------------
DEBUG - Ran 12 tests in 103.049s

The build system runs at least these steps:

sudo make test-no-interfaces-no-vpp | tee smoketest_make_test_no_interfaces_no_vpp.log
sudo make test-vpp | tee smoketest_make_test_vpp.log
sudo make test-interfaces | tee smoketest_make_test_interfaces.log
sudo make testc | tee smoketest_make_testc.log
sudo make testcvpp | tee smoketest_make_testcvpp.log
sudo make testraid | tee smoketest_make_testraid.log
sudo make testtpm | tee smoketest_make_testtpm.log

(VPP split was added recently)

I’m thinking of extending it with the container lab or robot framework for the topology tests in the containers/VMs; these are only thoughts.

I think you’re looking at the wrong file. It might not have been my best idea to link to that ticket, as it contains the logs of multiple test runs :grimacing:

Anyway, here is the output I meant:

DEBUG - Running Testcase: /usr/libexec/vyos/tests/smoke/cli/test_system_login.py
DEBUG - test_add_linux_system_user (__main__.TestSystemLogin.test_add_linux_system_user) ... ok
DEBUG - test_delete_current_user (__main__.TestSystemLogin.test_delete_current_user) ... ok
DEBUG - test_pam_nologin (__main__.TestSystemLogin.test_pam_nologin) ... ERROR
DEBUG - test_radius_kernel_features (__main__.TestSystemLogin.test_radius_kernel_features) ... ok
DEBUG - test_system_login_max_login_session (__main__.TestSystemLogin.test_system_login_max_login_session) ... ok
DEBUG - test_system_login_otp (__main__.TestSystemLogin.test_system_login_otp) ... ok
DEBUG - test_system_login_radius_ipv4 (__main__.TestSystemLogin.test_system_login_radius_ipv4) ... ERROR
DEBUG - test_system_login_radius_ipv6 (__main__.TestSystemLogin.test_system_login_radius_ipv6) ... ERROR
DEBUG - test_system_login_tacacs (__main__.TestSystemLogin.test_system_login_tacacs) ... ERROR
DEBUG - test_system_login_user (__main__.TestSystemLogin.test_system_login_user) ... ok
DEBUG - test_system_login_weak_password_warning (__main__.TestSystemLogin.test_system_login_weak_password_warning) ... ok
DEBUG - test_system_user_ssh_key (__main__.TestSystemLogin.test_system_user_ssh_key) ... ok
DEBUG - 
DEBUG - ======================================================================
DEBUG - ERROR: test_pam_nologin (__main__.TestSystemLogin.test_pam_nologin)
DEBUG - ----------------------------------------------------------------------
DEBUG - Traceback (most recent call last):
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/test_system_login.py", line 561, in test_pam_nologin
DEBUG -     out, err = self.ssh_send_cmd(ssh_test_command, username, password)
DEBUG -                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/base_vyostest_shim.py", line 161, in ssh_send_cmd
DEBUG -     ssh_client.connect(hostname=hostname, username=username,
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 450, in connect
DEBUG -     self._auth(
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 781, in _auth
DEBUG -     raise saved_exception
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 768, in _auth
DEBUG -     self._transport.auth_password(username, password)
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1564, in auth_password
DEBUG -     return self.auth_handler.wait_for_response(my_event)
DEBUG -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/auth_handler.py", line 259, in wait_for_response
DEBUG -     raise e
DEBUG - paramiko.ssh_exception.BadAuthenticationType: Bad authentication type; allowed types: ['publickey']
DEBUG - 
DEBUG - ======================================================================
DEBUG - ERROR: test_system_login_radius_ipv4 (__main__.TestSystemLogin.test_system_login_radius_ipv4)
DEBUG - ----------------------------------------------------------------------
DEBUG - Traceback (most recent call last):
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/test_system_login.py", line 296, in test_system_login_radius_ipv4
DEBUG -     self._system_login_radius_test_helper(radius_servers, radius_source)
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/test_system_login.py", line 405, in _system_login_radius_test_helper
DEBUG -     out, err = self.ssh_send_cmd(ssh_test_command, username, password)
DEBUG -                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/base_vyostest_shim.py", line 161, in ssh_send_cmd
DEBUG -     ssh_client.connect(hostname=hostname, username=username,
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 450, in connect
DEBUG -     self._auth(
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 781, in _auth
DEBUG -     raise saved_exception
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 768, in _auth
DEBUG -     self._transport.auth_password(username, password)
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1564, in auth_password
DEBUG -     return self.auth_handler.wait_for_response(my_event)
DEBUG -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/auth_handler.py", line 259, in wait_for_response
DEBUG -     raise e
DEBUG - paramiko.ssh_exception.BadAuthenticationType: Bad authentication type; allowed types: ['publickey']
DEBUG - 
DEBUG - ======================================================================
DEBUG - ERROR: test_system_login_radius_ipv6 (__main__.TestSystemLogin.test_system_login_radius_ipv6)
DEBUG - ----------------------------------------------------------------------
DEBUG - Traceback (most recent call last):
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/test_system_login.py", line 301, in test_system_login_radius_ipv6
DEBUG -     self._system_login_radius_test_helper(radius_servers, radius_source)
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/test_system_login.py", line 405, in _system_login_radius_test_helper
DEBUG -     out, err = self.ssh_send_cmd(ssh_test_command, username, password)
DEBUG -                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/base_vyostest_shim.py", line 161, in ssh_send_cmd
DEBUG -     ssh_client.connect(hostname=hostname, username=username,
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 450, in connect
DEBUG -     self._auth(
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 781, in _auth
DEBUG -     raise saved_exception
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 768, in _auth
DEBUG -     self._transport.auth_password(username, password)
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1564, in auth_password
DEBUG -     return self.auth_handler.wait_for_response(my_event)
DEBUG -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/auth_handler.py", line 259, in wait_for_response
DEBUG -     raise e
DEBUG - paramiko.ssh_exception.BadAuthenticationType: Bad authentication type; allowed types: ['publickey']
DEBUG - 
DEBUG - ======================================================================
DEBUG - ERROR: test_system_login_tacacs (__main__.TestSystemLogin.test_system_login_tacacs)
DEBUG - ----------------------------------------------------------------------
DEBUG - Traceback (most recent call last):
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/test_system_login.py", line 522, in test_system_login_tacacs
DEBUG -     out, err = self.ssh_send_cmd(ssh_test_command, username, password)
DEBUG -                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEBUG -   File "/usr/libexec/vyos/tests/smoke/cli/base_vyostest_shim.py", line 161, in ssh_send_cmd
DEBUG -     ssh_client.connect(hostname=hostname, username=username,
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 450, in connect
DEBUG -     self._auth(
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 781, in _auth
DEBUG -     raise saved_exception
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/client.py", line 768, in _auth
DEBUG -     self._transport.auth_password(username, password)
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/transport.py", line 1564, in auth_password
DEBUG -     return self.auth_handler.wait_for_response(my_event)
DEBUG -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEBUG -   File "/usr/lib/python3/dist-packages/paramiko/auth_handler.py", line 259, in wait_for_response
DEBUG -     raise e
DEBUG - paramiko.ssh_exception.BadAuthenticationType: Bad authentication type; allowed types: ['publickey']
DEBUG - 
DEBUG - ----------------------------------------------------------------------
DEBUG - Ran 12 tests in 69.508s
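
For long logs like the one above, a small script can pull out just the tests that did not report ok. This is only a sketch: the regular expression assumes the per-test line format shown in this thread (`DEBUG - <name> (<id>) ... <status>`), and the function names `summarize` and `not_ok` are made up for illustration.

```python
import re

# Matches per-test result lines such as:
#   DEBUG - test_pam_nologin (__main__.TestSystemLogin.test_pam_nologin) ... ERROR
# The status is optional, because a line may end right after the dots.
RESULT_LINE = re.compile(r"^DEBUG - (\w+) \(\S+\) \.\.\.[ \t]*(\w*)[ \t]*$")


def summarize(log_text):
    """Return {test_name: status} for every per-test result line."""
    results = {}
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            name, status = match.groups()
            results[name] = status or "unknown"
    return results


def not_ok(results):
    """Return the sorted names of all tests whose status is not 'ok'."""
    return sorted(name for name, status in results.items() if status != "ok")
```

Feeding it the contents of a log file, e.g. `not_ok(summarize(open("smoketest.log").read()))` with a hypothetical file name, would list the tests to look at first.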

I just gave it another shot; it still fails on these four tests with freshly pulled repos.

Build system at least does these steps

Thanks a lot! So I wasn’t that far off with what I’m doing after all, though test-no-interfaces is actually not in use. Why is that? What is the advantage of calling each test target separately? Especially as, AFAIR, test-no-interfaces-no-vpp should cover e.g. testc (not sure right now, I always aborted after encountering the first failure), so wouldn’t those tests run twice?

I’m thinking of extending it with the container lab or robot framework for the topology tests in the containers/VMs; these are only thoughts.

Sounds like an interesting approach! Though I’m not that familiar with these solutions. The closest I came was a barely automated GNS3 setup that got thrown away because it wasn’t that useful.

Check my log, the test name is the same.

You can start them in parallel and decrease the test time.
Also, it is convenient to run only the VPP PHONY target without the other tests. This saves a lot of time during VPP development. But you can use test-no-interfaces if you do not need parallel execution.
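
As a rough illustration of the parallel idea, here is a minimal Python sketch that starts several make targets at once and writes one log per target. The target names are the ones listed earlier in this thread; the helper names `run_target` and `run_all` are hypothetical, and whether all targets can safely share one checkout at the same time is an assumption that has not been verified here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Targets as listed earlier in this thread.
TARGETS = [
    "test-no-interfaces-no-vpp", "test-vpp", "test-interfaces",
    "testc", "testcvpp", "testraid", "testtpm",
]


def run_target(target, base_cmd=("sudo", "make"), log_dir="."):
    """Run one target, send its output to a log file, return (target, exit code)."""
    log_path = Path(log_dir) / f"smoketest_make_{target.replace('-', '_')}.log"
    with open(log_path, "w") as log:
        proc = subprocess.run([*base_cmd, target],
                              stdout=log, stderr=subprocess.STDOUT)
    return target, proc.returncode


def run_all(targets, base_cmd=("sudo", "make"), log_dir="."):
    """Start every target in its own thread and collect the exit codes."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        return dict(pool.map(lambda t: run_target(t, base_cmd, log_dir), targets))
```

Calling `run_all(TARGETS)` from the build directory would return a dict of exit codes, so any non-zero value points at the log file to inspect.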

Ah, so I misunderstood you. This is the output on your end? Which target do you call to execute the test? Or do you call only that file? If the latter, could you give me the command needed to run that single file, so I can try that as well? Besides some more unknown shenanigans with test-no-interfaces, I have no idea why the test could fail on my side but succeed on yours.

You can start them in parallel and decrease the test time.

I feel kinda stupid for not seeing the obvious. That makes a lot of sense :sweat_smile:

Also, it is convenient to run only the VPP PHONY target without the other tests. This saves a lot of time during VPP development. But you can use test-no-interfaces if you do not need parallel execution.

Yeah, that came into my mind already and is perfectly sensible as well. I was just wondering about having the “test everything” target but it seemed to be used nowhere. Thinking about it, would you accept a PR that changes the test-no-interfaces target to make use of the other existing targets? That way these accidental breakages I fixed could be avoided.

I still can’t get these four login tests to pass. I made sure to delete my local clone of the build repo and start from scratch. I’m using all the targets you mentioned above; all pass except test-no-interfaces-no-vpp. See the attached logfile of the run. I’m sincerely out of ideas about what could be wrong on my side.

test-no-interfaces-no-vpp.log (214,0 KB)

Try the officially published release and execute:

/usr/libexec/vyos/tests/smoke/cli/test_system_login.py

Another thing I find useful: you can do this to run just an individual test:

python3 /usr/libexec/vyos/tests/smoke/cli/test_system_login.py -k test_pam_nologin
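
For anyone wondering how `-k` narrows things down: Python's unittest matches the pattern against the full test name and wraps it in `*` wildcards when it contains none. The sketch below mimics that with the standard loader. `DemoLogin` is a made-up stand-in for the real TestSystemLogin class, and `select` is a hypothetical helper.

```python
import unittest


class DemoLogin(unittest.TestCase):
    """Made-up stand-in; the real tests live in test_system_login.py."""

    def test_pam_nologin(self):
        self.assertTrue(True)

    def test_system_login_radius_ipv4(self):
        self.assertTrue(True)

    def test_system_login_radius_ipv6(self):
        self.assertTrue(True)


def select(case, pattern):
    """Return the test method names that a -k style pattern would keep."""
    if "*" not in pattern:
        # unittest treats a pattern without wildcards as a substring match.
        pattern = f"*{pattern}*"
    loader = unittest.TestLoader()
    loader.testNamePatterns = [pattern]
    return loader.getTestCaseNames(case)
```

So a substring such as `radius` would select both RADIUS tests at once, which can be handy when a whole family of tests fails the same way.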

I’ll do so ASAP next week!

EDIT: Unfortunately it doesn’t look like I’ll be able to re-test it before my vacation. If that’s the case, don’t expect a message from me until the last week of August :confused: On the other hand, until then it might have fixed itself xD

Well, it took another two weeks to get enough things done to return to this. :sweat_smile: I just re-ran my setup, resulting in the same issue as described in my last post. So the issue definitely didn’t fix itself. The next step is trying what you suggested, @Viacheslav: downloading the official image and running only these specific tests. I’d just need some input on how to tackle this. Is it enough to mount the official ISO into the build container and execute the tests, or am I supposed to spin up a VM with that ISO? I quickly checked the build container and was unable to find the tests in there (which makes sense, I guess). So I’d be happy about any input.

Once I have the results, I’ll update this post.

@Viacheslav sorry to bother you again, but I’d need some help with making use of the publicly available ISO. How am I supposed to use that instead of the one I’ve built myself? Where do I put it to run the tests based on it? I assume I’d need a raw VM disk with VyOS installed from it, right?

I figured it out: it was enough to place it in the regular build output, rename it to live-image-amd64.hybrid.iso, and make sure manifest.json actually contains the version that lsb_release --short --id returns on the official release. Without this, the initialisation of the tests fails due to a mismatch.

Anyway, let’s get to the interesting parts: With the official builds, the problematic tests pass. As it’s been a while since I ran the tests on a current custom build, I will redo that now and report back if the error is solved there as well.

EDIT: I’m seeing the same issue as before, the tests that failed in July also fail now. As the official build worked, I suspect something with building the ISO is off. Any info I can provide you with to debug this further? @Viacheslav

EDIT2: I just ran test_system_login.py in one of the VyOS VMs in our test env and it failed there as well. So it should be pretty clear that the failure is not caused by running the tests in the KVM spun up by the build container. There, however, the tests fail with TimeoutError: [Errno 110] Connection timed out instead of paramiko.ssh_exception.BadAuthenticationType: Bad authentication type; allowed types: ['publickey']. Not sure if this makes it worse or better, as either the tests are broken under certain circumstances or the actual features are broken on my installations.

You can completely remove the manifest.json

CI does all smoketests before publishing an image. Therefore, if there is at least one smoke test failure, the image will not be published.

So your changes either do not look good or require some extra steps for the smoketests.

You can completely remove the manifest.json

Oh, that’s good to know. Though I doubt that was the culprit. I’m aware the images are not published when CI fails; I’m just trying to replicate that. My procedure looks as follows:

  1. Fetch vyos-build repo into a clean directory
  2. Copy a shell script with instructions to build a generic image and run all tests your CI is running in parallel
  3. Execute given shell script
  4. Wait for results
  5. (Build custom image, basically generic + custom prometheus exporter) - We never get here, as smoketests never passed

I can’t spot anything obvious I’m doing wrong on my side. Especially as I’m starting with a generic build and explicitly do not run smoketests on our custom one. I’m getting really lost on this :frowning:

Would you mind trying a generic build + smoke tests on your local machine as well? test-no-interfaces-no-vpp is the problematic one, just as before.

This issue has been going on for so long that I forgot you had already provided logs from a local run of yours. I read through the whole thread again, and the only thing I was unable to find was in what environment and how exactly you start the tests. Are you doing it within the vyos-build container as well? Do you trigger the tests via the Makefile as well, or do you set up a VyOS test machine manually and execute the tests by hand? I tried both on my end, and both fail for the self-built image, which is 1:1 what the docs describe for building it yourself. I’m now investigating the actual tests themselves, because there must be something missing (or maybe it’s just me missing it). I can’t imagine that no one else cares about running smoketests on self-built images and that everyone just misses these errors by chance.

I hope this is the final update on this odyssey: I found the culprit!

We were using a slightly modified config.boot.default, as we had to specify one explicitly to work around this now fixed issue. It contained SSH config to disable password login. This made these four tests fail, as they rely on password logins. The actual failure mode paramiko.ssh_exception.BadAuthenticationType: Bad authentication type; allowed types: ['publickey'] made me suspicious from the beginning, but I shouldn’t have assumed something was off with the key pairs; I should have checked the test code right away.
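
To catch this kind of thing before a long test run, a pre-flight check on the default config could look roughly like the sketch below. It assumes the option in question was `disable-password-authentication` under `service ssh` (the post only says password login was disabled), and it uses a deliberately naive parser for the curly-brace config format, so comments or values ending in `{` would confuse it. Both function names are made up.

```python
def config_leaves(config_text):
    """Yield each leaf line of a curly-brace config as a path tuple."""
    stack = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("{"):
            # Opening a section, e.g. "service {" or "user vyos {".
            stack.append(line[:-1].strip())
        elif line == "}":
            if stack:
                stack.pop()
        else:
            yield (*stack, line)


def password_login_disabled(config_text):
    """Return True if the config turns off SSH password authentication."""
    target = ("service", "ssh", "disable-password-authentication")
    return any(path == target for path in config_leaves(config_text))
```

Running `password_login_disabled()` on the text of config.boot.default before starting the smoketests would flag a config that the password-based login tests cannot pass with.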

I hope this is the final issue I’ve had with smoketests :sweat_smile: