Error with containers switching between 1.4 and 1.3

mathias · February 15, 2023, 12:32pm

Hi,

I’ve started playing with the latest Sagitta image (1.4-rolling-202302110748) and deployed 2 containers successfully. Switching to a self-built 1.3 (build 20230214) image with container support I’m seeing the following error:

vyos@vyos:~$ show container
ERRO[0000] User-selected graph driver "overlay2" overwritten by graph driver "overlay" from database - delete libpod local files to resolve
CONTAINER ID  IMAGE                         COMMAND  CREATED       STATUS           PORTS  NAMES
9e39684cb52e  docker.io/ntop/ntopng:stable           15 hours ago  Up 12 hours ago         ntopng
23c79ffb3bb7  pihole                                 15 hours ago  Up 15 hours ago         pihole

This doesn’t seem to affect the containers themselves, all works fine.
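For anyone who wants to check for this condition from a script, here is a minimal Python sketch that pulls the two driver names out of the Podman message quoted above. The function name `find_driver_mismatch` is made up for illustration, and the regular expression assumes nothing beyond the wording of that message.

```python
import re

# Matches: graph driver "overlay2" overwritten by graph driver "overlay"
# The quotes may be plain, typographic, or backslash-escaped (journal output).
_MISMATCH = re.compile(
    r'graph driver \\?["“”](\w+)\\?["“”] '
    r'overwritten by graph driver \\?["“”](\w+)\\?["“”]'
)


def find_driver_mismatch(stderr_text):
    """Return (selected_driver, stored_driver), or None if no mismatch is reported."""
    match = _MISMATCH.search(stderr_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    sample = (
        'ERRO[0000] User-selected graph driver "overlay2" overwritten by '
        'graph driver "overlay" from database - delete libpod local files to resolve'
    )
    print(find_driver_mismatch(sample))
```

A `None` result only means this particular message was not found; it says nothing about other storage problems.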

However when creating a new container the commit fails:

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/container.py", line 401, in <module>
    generate(c)
  File "/usr/libexec/vyos/conf_mode/container.py", line 343, in generate
    run_args = generate_run_arguments(name, container_config)
  File "/usr/libexec/vyos/conf_mode/container.py", line 233, in generate_run_arguments
    memory = container_config['memory']
KeyError: 'memory'

noteworthy:
cmd 'podman image exists ntop/ntopng:stable'
returned (out):

returned (err):
time="2023-02-15T12:16:18Z" level=error msg="User-selected graph driver \"overlay2\" overwritten by graph driver \"overlay\" from database - delete libpod local files to resolve"
cmd 'podman image exists pihole'
returned (out):

returned (err):
time="2023-02-15T12:16:18Z" level=error msg="User-selected graph driver \"overlay2\" overwritten by graph driver \"overlay\" from database - delete libpod local files to resolve"
cmd 'podman image exists tailscale/tailscale'
returned (out):

returned (err):
time="2023-02-15T12:16:18Z" level=error msg="User-selected graph driver \"overlay2\" overwritten by graph driver \"overlay\" from database - delete libpod local files to resolve"

[[container]] failed
Commit failed

Switching back to Sagitta resulted in 1 key error and multiple warnings:

time="2023-02-15T12:24:11Z" level=warning msg="Switching default driver from overlay2 to the equivalent overlay driver"

The error was a KeyError:

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/container.py", line 401, in <module>
    generate(c)
  File "/usr/libexec/vyos/conf_mode/container.py", line 343, in generate
    run_args = generate_run_arguments(name, container_config)
  File "/usr/libexec/vyos/conf_mode/container.py", line 233, in generate_run_arguments
    memory = container_config['memory']
KeyError: 'memory'

After adding memory, the same kind of error occurred for shared_memory and then for restart. Once all 3 keys were set, the container was created successfully.
I’m 100% certain I didn’t have to set those keys when creating the 2 initial containers.
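To illustrate what the traceback is complaining about: the script indexes the container dictionary directly, so a key that was never set and has no default merged in raises KeyError. Below is a minimal Python sketch of that failure and of merging defaults before the lookup. The helper `with_defaults` and the values in `PLACEHOLDER_DEFAULTS` are assumptions made up for this example; they are not taken from the VyOS source.

```python
# Placeholder values for illustration only; the real defaults are defined by VyOS.
PLACEHOLDER_DEFAULTS = {
    "memory": "512",
    "shared_memory": "64",
    "restart": "on-failure",
}


def with_defaults(container_config, defaults=PLACEHOLDER_DEFAULTS):
    """Return a copy of the config where unset keys fall back to defaults."""
    merged = dict(defaults)
    merged.update(container_config)  # explicitly configured values win
    return merged


if __name__ == "__main__":
    # A container configured with only an image, as in the failing commit.
    config = {"image": "busybox"}

    try:
        config["memory"]  # same kind of lookup as in the traceback
    except KeyError as exc:
        print("missing key:", exc)

    merged = with_defaults(config)
    print(merged["memory"], merged["shared_memory"], merged["restart"])
```

Setting the three keys by hand, as described above, has the same effect as the merge: the lookup finds a value and the commit can proceed.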

I’m not super-familiar with VyOS yet, but I thought everything except /config lives inside the image, so the 2 images shouldn’t affect each other?
Is there a need to reset the container “cache” or something similar, and if so, how can I do that without reinstalling the whole system?

Viacheslav · February 15, 2023, 5:29pm

At least with busybox, I don’t see such issues:

set container name busybox allow-host-networks
set container name busybox image 'busybox'

show container:

vyos@r1# run show container 
CONTAINER ID  IMAGE                             COMMAND  CREATED        STATUS            PORTS   NAMES
fb407c1580fd  docker.io/library/busybox:latest  sh       4 minutes ago  Up 4 minutes ago          busybox
[edit]
vyos@r1#

Did you downgrade from 1.4 to 1.3, or did you install 1.3 from scratch?
Could you share the container configuration?

mathias · February 15, 2023, 6:23pm

I added the 1.3 image and set it as the default boot environment. I assume that’s what you mean by downgrade? If that’s not supported, I apologize; I thought switching images basically replaces the whole system except /config.

Here is the container configuration:

set container name ntopng allow-host-networks
set container name ntopng cap-add 'net-raw'
set container name ntopng cap-add 'net-admin'
set container name ntopng image 'ntop/ntopng:stable'
set container name pihole allow-host-networks
set container name pihole cap-add 'net-admin'
set container name pihole cap-add 'net-raw'
set container name pihole environment WEBPASSWORD value 'vyos'
set container name pihole image 'pihole'
set container name pihole memory '1024'
set container name pihole restart 'always'
set container name pihole volume pihole_dnsmasq destination '/etc/dnsmasq.d/'
set container name pihole volume pihole_dnsmasq source '/config/docker/pihole/dnsmasq.d/'
set container name pihole volume pihole_etc destination '/etc/pihole/'
set container name pihole volume pihole_etc source '/config/docker/pihole/etc/'
set container name tailscale allow-host-networks
set container name tailscale cap-add 'net-admin'
set container name tailscale cap-add 'net-raw'
set container name tailscale image 'tailscale/tailscale'
set container name tailscale memory '1024'
set container name tailscale restart 'on-failure'
set container name tailscale shared-memory '128'
set container name tailscale volume devtun destination '/dev/net/tun'
set container name tailscale volume devtun source '/dev/net/tun'

edan · February 15, 2023, 9:16pm

I’ve had container issues with newer 1.4 rollings too. I ran containers for a long time on a 1.4 build from late 2021 / early 2022 without issue, but on more recent 1.4 rolling versions they would not start. I just upgraded one of my VyOS nodes to VyOS 1.4-rolling-202302120317, and I am trying to get my containers running again.

With this config:

set container name newrelic-vyos allow-host-networks
set container name newrelic-vyos environment NRIA_LICENSE_KEY value 'XXXXXXXXXXXXXXX'
set container name newrelic-vyos image 'newrelic/infrastructure:latest'
set container name newrelic-vyos volume logging.d destination '/etc/newrelic-infra/logging.d'
set container name newrelic-vyos volume logging.d source '/config/newrelic/logging.d'
set container name newrelic-vyos volume root destination '/host'
set container name newrelic-vyos volume root source '/'

The container repeatedly failed to start with errors like this:

Feb 15 15:20:40 vyos podman[45955]: 2023-02-15 15:20:40.287991932 -0500 EST m=+0.081074531 image pull  newrelic/infrastructure:latest
Feb 15 15:20:40 vyos podman[45955]: 2023-02-15 15:20:40.396555336 -0500 EST m=+0.189637635 container create 77a1d44c8c7ed7f38b0ca51bd8f448a01733e88b4350f7fc8a10f144e70144af (image=docker.io/newrelic/infrastructure:latest, name=newrelic-vyos, com.newrelic.description=New Relic Infrastructure agent for monitoring the underlying host., com.newrelic.image.version=1.37.1-rc, com.newrelic.infra-agent.version=1.37.1, [email protected], com.newrelic.nri-docker.version=1.7.5, com.newrelic.nri-flex.version=1.7.0, com.newrelic.nri-prometheus.version=2.17.0, PODMAN_SYSTEMD_UNIT=vyos-container-newrelic-vyos.service)
Feb 15 15:20:40 vyos podman[45955]: Error: OCI runtime error: crun: writing file `/sys/fs/cgroup/cgroup.subtree_control`: Invalid argument
Feb 15 15:20:40 vyos podman[45970]: 2023-02-15 15:20:40.75551926 -0500 EST m=+0.191102848 container cleanup 77a1d44c8c7ed7f38b0ca51bd8f448a01733e88b4350f7fc8a10f144e70144af (image=docker.io/newrelic/infrastructure:latest, name=newrelic-vyos, com.newrelic.description=New Relic Infrastructure agent for monitoring the underlying host., com.newrelic.image.version=1.37.1-rc, com.newrelic.infra-agent.version=1.37.1, [email protected], com.newrelic.nri-docker.version=1.7.5, com.newrelic.nri-flex.version=1.7.0, com.newrelic.nri-prometheus.version=2.17.0, PODMAN_SYSTEMD_UNIT=vyos-container-newrelic-vyos.service)
Feb 15 15:20:41 vyos podman[45980]: time="2023-02-15T15:20:41-05:00" level=warning msg="Switching default driver from overlay2 to the equivalent overlay driver"
Feb 15 15:20:41 vyos podman[45980]: 2023-02-15 15:20:41.141374493 -0500 EST m=+0.129922772 container remove 77a1d44c8c7ed7f38b0ca51bd8f448a01733e88b4350f7fc8a10f144e70144af (image=docker.io/newrelic/infrastructure:latest, name=newrelic-vyos, com.newrelic.nri-docker.version=1.7.5, com.newrelic.nri-flex.version=1.7.0, com.newrelic.nri-prometheus.version=2.17.0, PODMAN_SYSTEMD_UNIT=vyos-container-newrelic-vyos.service, com.newrelic.description=New Relic Infrastructure agent for monitoring the underlying host., com.newrelic.image.version=1.37.1-rc, com.newrelic.infra-agent.version=1.37.1, [email protected])
Feb 15 15:20:41 vyos podman[45980]: 77a1d44c8c7ed7f38b0ca51bd8f448a01733e88b4350f7fc8a10f144e70144af

I’ll try to spin up an older 1.4-rolling to see if I can find a build where it did work.

Does this help or is it different? I tried busybox as above and got more errors:

vyos@vyos# set container name busybox allow-host-networks
[edit]
vyos@vyos# set container name busybox image 'busybox'
[edit]
vyos@vyos# commit
[ container ]

WARNING: Image "busybox" used in container "busybox" does not exist
locally. Please use "add container image busybox" to add it to the
system! Container "busybox" will not be started!

VyOS had an issue completing a command.

We are sorry that you encountered a problem while using VyOS.
There are a few things you can do to help us (and yourself):
- Contact us using the online help desk if you have a subscription:
  https://support.vyos.io/
- Make sure you are running the latest version of VyOS available at:
  https://vyos.net/get/
- Consult the community forum to see how to handle this issue:
  https://forum.vyos.io
- Join us on Slack where our users exchange help and advice:
  https://vyos.slack.com

When reporting problems, please include as much information as possible:
- do not obfuscate any data (feel free to contact us privately if your 
  business policy requires it)
- and include all the information presented below

Report time:      2023-02-15 16:13:24
Image version:    VyOS 1.4-rolling-202302120317
Release train:    current

Built by:         [email protected]
Built on:         Sun 12 Feb 2023 03:17 UTC
Build UUID:       24bdfade-126b-4691-80c6-cab1fdfd73f8
Build commit ID:  b00c41e6a547a3

Architecture:     x86_64
Boot via:         installed image
System type:      Xen HVM guest

Hardware vendor:  Xen
Hardware model:   HVM domU
Hardware S/N:     9955f378-decb-526a-9ff4-47a21a6c0001
Hardware UUID:    9955f378-decb-526a-9ff4-47a21a6c0001

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/container.py", line 402, in <module>
    apply(c)
  File "/usr/libexec/vyos/conf_mode/container.py", line 390, in apply
    cmd(f'systemctl restart vyos-container-{name}.service')
  File "/usr/lib/python3/dist-packages/vyos/util.py", line 161, in cmd
    raise OSError(code, feedback)
PermissionError: [Errno 1] failed to run command: systemctl restart vyos-container-newrelic-vyos.service
returned: 
exit code: 1

noteworthy:
cmd 'podman image exists busybox'
returned (out):

returned (err):
time="2023-02-15T16:13:21-05:00" level=warning msg="Switching default driver from overlay2 to the equivalent overlay driver"
cmd 'podman image exists newrelic/infrastructure:latest'
returned (out):

returned (err):
time="2023-02-15T16:13:21-05:00" level=warning msg="Switching default driver from overlay2 to the equivalent overlay driver"
cmd 'podman image exists busybox'
returned (out):

returned (err):
time="2023-02-15T16:13:22-05:00" level=warning msg="Switching default driver from overlay2 to the equivalent overlay driver"
cmd 'podman image exists newrelic/infrastructure:latest'
returned (out):

returned (err):
time="2023-02-15T16:13:22-05:00" level=warning msg="Switching default driver from overlay2 to the equivalent overlay driver"
cmd 'systemctl restart vyos-container-newrelic-vyos.service'
returned (out):

returned (err):
Job for vyos-container-newrelic-vyos.service failed because the control process exited with error code.
See "systemctl status vyos-container-newrelic-vyos.service" and "journalctl -xe" for details.

[[container]] failed
Commit failed

mathias · February 15, 2023, 9:51pm

To answer my own question, I dug a bit.

Container storage is configured here:

/etc/containers/storage.conf

Which puts graphroot in

/usr/lib/live/mount/persistence/container/storage

That directory persists when switching images, which possibly explains the issues I’m seeing. While overlay2 is the configured driver in both images, 1.3 ships Podman 3.0.1 and 1.4 ships Podman 4.3.1.
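To check which driver and graphroot an image is configured with, something like the following Python sketch can read them from the file. It is a deliberately small line-based reader, not a TOML parser: it only understands simple `key = "value"` lines in the `[storage]` table. The function name `read_storage_settings` and the sample text are made up for illustration; only the driver name and graphroot path come from this thread.

```python
import re


def read_storage_settings(text):
    """Collect simple key = "value" pairs from the [storage] table."""
    settings = {}
    section = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (naive: ignores '#' inside values)
        if not line:
            continue
        header = re.fullmatch(r"\[([^\]]+)\]", line)
        if header:
            section = header.group(1).strip()
            continue
        pair = re.fullmatch(r'(\w+)\s*=\s*"([^"]*)"', line)
        if pair and section == "storage":
            settings[pair.group(1)] = pair.group(2)
    return settings


if __name__ == "__main__":
    # Illustrative content, not a copy of the file shipped by VyOS.
    sample = """
    [storage]
    driver = "overlay2"
    graphroot = "/usr/lib/live/mount/persistence/container/storage"
    """
    print(read_storage_settings(sample))
    # On a router one would read the real file instead, for example:
    # text = open("/etc/containers/storage.conf").read()
```

Running it under both images and comparing the results would show whether the configuration differs, or whether only the data under graphroot was written by a different Podman version.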

EDIT:
Got it resolved by:

1. disabling all containers in 1.4
2. deleting the directory content of /usr/lib/live/mount/persistence/container/storage
3. switching to 1.3
4. disabling all containers
5. adding the images of the containers with add container image
6. enabling all containers
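The deletion step above removes the content of the storage directory but keeps the directory itself. A minimal Python sketch of that operation follows, defaulting to a dry run that only lists what would go. The helper `clear_directory` is hypothetical, the demo runs against a temporary directory, and pointing it at the real storage path is only sensible with all containers disabled, since it throws away every image and container layer stored there.

```python
import shutil
import tempfile
from pathlib import Path


def clear_directory(path, dry_run=True):
    """Remove everything inside path but keep path itself.

    Returns the names of the top-level entries that were (or would be) removed.
    """
    root = Path(path)
    removed = []
    for entry in sorted(root.iterdir()):
        removed.append(entry.name)
        if dry_run:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # real directory: remove recursively
        else:
            entry.unlink()        # file or symlink: remove the entry only
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo:
        Path(demo, "example-file").write_text("{}")
        Path(demo, "example-dir").mkdir()
        print("would remove:", clear_directory(demo))
        print("removed:", clear_directory(demo, dry_run=False))
        print("left over:", [p.name for p in Path(demo).iterdir()])
```

After the directory is emptied, the images have to be added again, which matches the later steps in the list.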

No more errors or warnings in 1.3 after that. It doesn’t explain why I suddenly got the KeyError messages and had to add keys that should have default values. I’ll report back if I can identify that one, too.

edan · February 15, 2023, 10:24pm

Thanks for replying! Maybe my problem comes from upgrading and then running into a problem with the container storage.

I can report that on a clean VyOS 1.4-rolling-202209300217 install, the same config as above started the container without errors. I’ll try a clean install of 1.4-rolling-202302120317 and see if that works too.

anon10687249 · February 16, 2023, 4:03pm

The issue was reported here, with a workaround for now.

https://vyos.dev/T4978