Commit causing vyos-configd.service to exit and restart

I am trying to set up a zone-based firewall with two bonded Ethernet interfaces, one WLAN interface, a PPPoE interface and nine VLAN interfaces on the bonded interface. The firewall is already set up and committed, but as soon as I try to add a NAT rule, the config daemon crashes. Modifying an existing firewall rule does not cause the crash; so far I’ve only been able to trigger it by adding a NAT rule.

The config file is 117kB in size and contains 5196 lines. I’ve also removed an interface and its related firewall rules to reduce the size but the crash is still triggered.

The “init” message appears in the log, followed by the config daemon restart:

vyos-configd[21197]: Received message: {"type": "init"}

The config daemon then exits and restarts:

vyos-configd[21197]: VyOS had an issue completing a command.
vyos-configd[21197]: We are sorry that you encountered a problem while using VyOS.
vyos-configd[21197]: There are a few things you can do to help us (and yourself):
vyos-configd[21197]: - Make sure you are running the latest version of the code available at
vyos-configd[21197]: REMOVED DUE TO FORUM LIMITS
vyos-configd[21197]: - Consult the forum to see how to handle this issue
vyos-configd[21197]: REMOVED DUE TO FORUM LIMITS
vyos-configd[21197]: - Join our community on slack where our users exchange help and advice
vyos-configd[21197]: REMOVED DUE TO FORUM LIMITS
vyos-configd[21197]: When reporting problems, please include as much information as possible:
vyos-configd[21197]: - do not obfuscate any data (feel free to contact us privately if your
vyos-configd[21197]: business policy requires it)
vyos-configd[21197]: - and include all the information presented below
vyos-configd[21197]: Report Time: 2020-09-24 02:40:46
vyos-configd[21197]: Image Version: VyOS 1.3-rolling-202009220118
vyos-configd[21197]: Release Train: equuleus
vyos-configd[21197]: Built by: REMOVED DUE TO FORUM LIMITS
vyos-configd[21197]: Built on: Tue 22 Sep 2020 01:18 UTC
vyos-configd[21197]: Build UUID: 981c6e74-4a07-452a-80b9-8c2d49712f1a
vyos-configd[21197]: Build Commit ID: d571b383797719
vyos-configd[21197]: Architecture: x86_64
vyos-configd[21197]: Boot via: installed image
vyos-configd[21197]: System type: bare metal
vyos-configd[21197]: Hardware vendor: To be filled by O.E.M.
vyos-configd[21197]: Hardware model: To be filled by O.E.M.
vyos-configd[21197]: Hardware S/N: To be filled by O.E.M.
vyos-configd[21197]: Hardware UUID: 03000200-0400-0500-0006-000700080009
vyos-configd[21197]: Traceback (most recent call last):
vyos-configd[21197]: File "/usr/libexec/vyos/services/vyos-configd", line 238, in <module>
vyos-configd[21197]: config = initialization(socket)
vyos-configd[21197]: File "/usr/libexec/vyos/services/vyos-configd", line 144, in initialization
vyos-configd[21197]: session_string = socket.recv().decode()
vyos-configd[21197]: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 127032: invalid continuation byte
Report Time: 2020-09-24 02:40:46
Image Version: VyOS 1.3-rolling-202009220118
Release Train: equuleus
Built by: REMOVED DUE TO FORUM LIMITS
Built on: Tue 22 Sep 2020 01:18 UTC
Build UUID: 981c6e74-4a07-452a-80b9-8c2d49712f1a
Build Commit ID: d571b383797719
Architecture: x86_64
Boot via: installed image
System type: bare metal
Hardware vendor: To be filled by O.E.M.
Hardware model: To be filled by O.E.M.
Hardware S/N: To be filled by O.E.M.
Hardware UUID: 03000200-0400-0500-0006-000700080009
Traceback (most recent call last):
File "/usr/libexec/vyos/services/vyos-configd", line 238, in <module>
config = initialization(socket)
File "/usr/libexec/vyos/services/vyos-configd", line 144, in initialization
session_string = socket.recv().decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 127032: invalid continuation byte
systemd[1]: vyos-configd.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: vyos-configd.service: Failed with result 'exit-code'.
systemd[1]: vyos-configd.service: Service RestartSec=100ms expired, scheduling restart.
systemd[1]: vyos-configd.service: Scheduled restart job, restart counter is at 8.
systemd[1]: Stopped VyOS configuration daemon.
systemd[1]: Started VyOS configuration daemon.
systemd[1]: opt-vyatta-config-tmp-new_config_20243.mount: Succeeded.

Thank you, @choopeek, for the report. I will open a https://phabricator.vyos.net/ task to track this issue, and add the link here.

@choopeek the phabricator task is here:
⚓ T2931 Unicode decode error causes vyos.configd service to restart
Details about VyOS Phabricator are here:
Issues/Feature requests — VyOS 1.4.x (sagitta) documentation
To help reproduce, it would be useful if you can provide a (suitably anonymized) config file.

I’m affected by this issue (or a similar one) as well. In my case it is not tied to adding NAT rules (which I have not tried). So far it has happened while:

  • adding a redirect and a traffic-policy to an existing pppoe interface
  • adding the zone node to an existing cloudflare dynamic dns configuration

To me it seems like it’s more an issue with the config system in general, and not caused by specific config nodes.

Here is one config change I tried and the corresponding vyos-configd log:

Config diff:

[edit service dns dynamic interface pppoe0 service cloudflare]
+zone example.com

vyos-configd log:

Oct 03 16:51:36 r1-ham vyos-configd[13352]: Received message: {"type": "init"}
Oct 03 16:51:40 r1-ham vyos-configd[13352]: VyOS had an issue completing a command.
Oct 03 16:51:40 r1-ham vyos-configd[13352]: We are sorry that you encountered a problem while using VyOS.
Oct 03 16:51:40 r1-ham vyos-configd[13352]: There are a few things you can do to help us (and yourself):
Oct 03 16:51:40 r1-ham vyos-configd[13352]: - Make sure you are running the latest version of the code available at
Oct 03 16:51:40 r1-ham vyos-configd[13352]:   https://downloads.vyos.io/rolling/current/amd64/vyos-rolling-latest.iso
Oct 03 16:51:40 r1-ham vyos-configd[13352]: - Consult the forum to see how to handle this issue
Oct 03 16:51:40 r1-ham vyos-configd[13352]:   https://forum.vyos.io
Oct 03 16:51:40 r1-ham vyos-configd[13352]: - Join our community on slack where our users exchange help and advice
Oct 03 16:51:40 r1-ham vyos-configd[13352]:   https://vyos.slack.com
Oct 03 16:51:40 r1-ham vyos-configd[13352]: When reporting problems, please include as much information as possible:
Oct 03 16:51:40 r1-ham vyos-configd[13352]: - do not obfuscate any data (feel free to contact us privately if your
Oct 03 16:51:40 r1-ham vyos-configd[13352]:   business policy requires it)
Oct 03 16:51:40 r1-ham vyos-configd[13352]: - and include all the information presented below
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Report Time:      2020-10-03 16:51:40
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Image Version:    VyOS 1.3-rolling-202010020117
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Release Train:    equuleus
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Built by:         autobuild@vyos.net
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Built on:         Fri 02 Oct 2020 01:17 UTC
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Build UUID:       d1d2513f-ed8f-49db-9ad0-495d8a2e07a7
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Build Commit ID:  8890819012f9a5
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Architecture:     x86_64
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Boot via:         installed image
Oct 03 16:51:40 r1-ham vyos-configd[13352]: System type:      bare metal
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Hardware vendor:  FUJITSU
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Hardware model:   ESPRIMO P700
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Hardware S/N:
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Hardware UUID:    766b6560-ed64-e011-8b11-67f76525ff89
Oct 03 16:51:40 r1-ham vyos-configd[13352]: Traceback (most recent call last):
Oct 03 16:51:40 r1-ham vyos-configd[13352]:   File "/usr/libexec/vyos/services/vyos-configd", line 238, in <module>
Oct 03 16:51:40 r1-ham vyos-configd[13352]:     config = initialization(socket)
Oct 03 16:51:40 r1-ham vyos-configd[13352]:   File "/usr/libexec/vyos/services/vyos-configd", line 144, in initialization
Oct 03 16:51:40 r1-ham vyos-configd[13352]:     session_string = socket.recv().decode()
Oct 03 16:51:40 r1-ham vyos-configd[13352]: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 127032: invalid continuation byte
Oct 03 16:51:40 r1-ham python3[13352]: Report Time:      2020-10-03 16:51:40
Oct 03 16:51:40 r1-ham python3[13352]: Image Version:    VyOS 1.3-rolling-202010020117
Oct 03 16:51:40 r1-ham python3[13352]: Release Train:    equuleus
Oct 03 16:51:40 r1-ham python3[13352]: Built by:         autobuild@vyos.net
Oct 03 16:51:40 r1-ham python3[13352]: Built on:         Fri 02 Oct 2020 01:17 UTC
Oct 03 16:51:40 r1-ham python3[13352]: Build UUID:       d1d2513f-ed8f-49db-9ad0-495d8a2e07a7
Oct 03 16:51:40 r1-ham python3[13352]: Build Commit ID:  8890819012f9a5
Oct 03 16:51:40 r1-ham Architecture[13352]:     x86_64
Oct 03 16:51:40 r1-ham python3[13352]: Boot via:         installed image
Oct 03 16:51:40 r1-ham python3[13352]: System type:      bare metal
Oct 03 16:51:40 r1-ham python3[13352]: Hardware vendor:  FUJITSU
Oct 03 16:51:40 r1-ham python3[13352]: Hardware model:   ESPRIMO P700
Oct 03 16:51:40 r1-ham python3[13352]: Hardware S/N:
Oct 03 16:51:40 r1-ham python3[13352]: Hardware UUID:    766b6560-ed64-e011-8b11-67f76525ff89
Oct 03 16:51:40 r1-ham python3[13352]: Traceback (most recent call last):
Oct 03 16:51:40 r1-ham python3[13352]:   File "/usr/libexec/vyos/services/vyos-configd", line 238, in <module>
Oct 03 16:51:40 r1-ham python3[13352]:     config = initialization(socket)
Oct 03 16:51:40 r1-ham python3[13352]:   File "/usr/libexec/vyos/services/vyos-configd", line 144, in initialization
Oct 03 16:51:40 r1-ham python3[13352]:     session_string = socket.recv().decode()
Oct 03 16:51:40 r1-ham UnicodeDecodeError[13352]: 'utf-8' codec can't decode byte 0xf1 in position 127032: invalid continuation byte
Oct 03 16:51:40 r1-ham systemd[1]: vyos-configd.service: Main process exited, code=exited, status=1/FAILURE
Oct 03 16:51:40 r1-ham systemd[1]: vyos-configd.service: Failed with result 'exit-code'.
Oct 03 16:51:40 r1-ham systemd[1]: vyos-configd.service: Service RestartSec=100ms expired, scheduling restart.
Oct 03 16:51:40 r1-ham systemd[1]: vyos-configd.service: Scheduled restart job, restart counter is at 4.
Oct 03 16:51:40 r1-ham systemd[1]: Stopped VyOS configuration daemon.
Oct 03 16:51:40 r1-ham systemd[1]: Started VyOS configuration daemon.

If required, I can provide a sanitized config.

@mpsl A sanitized config would be very useful, as I have not yet reproduced this error signature. If possible, I would also suggest trying the most recent rolling release, which has a fix for the possibly related bug:
⚓ T2952 configd: timeout breaks synchronization of messages, causing freeze
Thanks!

@jestabro alright, I’ll try that release; probably tonight.

For now here is my config: https://pastebin.com/Bep572kU
It’s a bit of a mess, sorry about that ^^

@mpsl How are you running vyos? I see that the version info says ‘bare metal’, but that is a false negative (i.e., not VM) if one is running (not building, but running) within docker — just a thought in case you are doing so. Thanks.

@jestabro, in this case it’s correct, vyos is running on bare metal.

I just updated to the latest rolling and tried to commit a simple change to service dns dynamic:

vyos@r1-ham# sudo journalctl -eu vyos-configd
Oct 06 13:34:34 r1-ham vyos-configd[547]: cmd '/sbin/ethtool -K eth2 ufo off'
Oct 06 13:34:34 r1-ham vyos-configd[547]: returned (out):
Oct 06 13:34:34 r1-ham vyos-configd[547]: returned (err):
Oct 06 13:34:34 r1-ham vyos-configd[547]: Cannot change udp-fragmentation-offload
Oct 06 13:34:34 r1-ham python3[547]: Report Time:      2020-10-06 13:34:34
Oct 06 13:34:34 r1-ham python3[547]: Image Version:    VyOS 1.3-rolling-202010060706
Oct 06 13:34:34 r1-ham python3[547]: Release Train:    equuleus
Oct 06 13:34:34 r1-ham python3[547]: Built by:         autobuild@vyos.net
Oct 06 13:34:34 r1-ham python3[547]: Built on:         Tue 06 Oct 2020 07:06 UTC
Oct 06 13:34:34 r1-ham python3[547]: Build UUID:       378e0d1e-6206-45c3-a634-84de5bf5642c
Oct 06 13:34:34 r1-ham python3[547]: Build Commit ID:  4ec212ad33c9c2
Oct 06 13:34:34 r1-ham Architecture[547]:     x86_64
Oct 06 13:34:34 r1-ham python3[547]: Boot via:         installed image
Oct 06 13:34:34 r1-ham python3[547]: System type:      bare metal
Oct 06 13:34:34 r1-ham python3[547]: Hardware vendor:  FUJITSU
Oct 06 13:34:34 r1-ham python3[547]: Hardware model:   ESPRIMO P700
Oct 06 13:34:34 r1-ham python3[547]: Hardware S/N:
Oct 06 13:34:34 r1-ham python3[547]: Hardware UUID:    766b6560-ed64-e011-8b11-67f76525ff89
Oct 06 13:34:34 r1-ham python3[547]: Traceback (most recent call last):
Oct 06 13:34:34 r1-ham python3[547]:   File "/usr/libexec/vyos/services/vyos-configd", line 250, in <module>
Oct 06 13:34:34 r1-ham python3[547]:     config = initialization(socket)
Oct 06 13:34:34 r1-ham python3[547]:   File "/usr/libexec/vyos/services/vyos-configd", line 152, in initialization
Oct 06 13:34:34 r1-ham python3[547]:     session_string = socket.recv().decode()
Oct 06 13:34:34 r1-ham UnicodeDecodeError[547]: 'utf-8' codec can't decode byte 0xf1 in position 127032: invalid continuation byte
Oct 06 13:34:34 r1-ham noteworthy[547]:
Oct 06 13:34:34 r1-ham python3[547]: cmd '/sbin/ethtool -K eth3 ufo off'
Oct 06 13:34:34 r1-ham python3[547]: returned (out):
Oct 06 13:34:34 r1-ham python3[547]: returned (err):
Oct 06 13:34:34 r1-ham python3[547]: Cannot change udp-fragmentation-offload
Oct 06 13:34:34 r1-ham python3[547]: cmd '/sbin/ethtool -K eth4 ufo off'
Oct 06 13:34:34 r1-ham python3[547]: returned (out):
Oct 06 13:34:34 r1-ham python3[547]: returned (err):
Oct 06 13:34:34 r1-ham python3[547]: Cannot change udp-fragmentation-offload
Oct 06 13:34:34 r1-ham python3[547]: cmd '/sbin/ethtool -K eth1 ufo off'
Oct 06 13:34:34 r1-ham python3[547]: returned (out):
Oct 06 13:34:34 r1-ham python3[547]: returned (err):
Oct 06 13:34:34 r1-ham python3[547]: Cannot change udp-fragmentation-offload
Oct 06 13:34:34 r1-ham python3[547]: cmd '/sbin/ethtool -K eth0 ufo off'
Oct 06 13:34:34 r1-ham python3[547]: returned (out):
Oct 06 13:34:34 r1-ham python3[547]: returned (err):
Oct 06 13:34:34 r1-ham python3[547]: Cannot change udp-fragmentation-offload
Oct 06 13:34:34 r1-ham python3[547]: cmd '/sbin/ethtool -K eth2 ufo off'
Oct 06 13:34:34 r1-ham python3[547]: returned (out):
Oct 06 13:34:34 r1-ham python3[547]: returned (err):
Oct 06 13:34:34 r1-ham python3[547]: Cannot change udp-fragmentation-offload
Oct 06 13:34:34 r1-ham systemd[1]: vyos-configd.service: Main process exited, code=exited, status=1/FAILURE
Oct 06 13:34:34 r1-ham systemd[1]: vyos-configd.service: Failed with result 'exit-code'.
Oct 06 13:34:35 r1-ham systemd[1]: vyos-configd.service: Service RestartSec=100ms expired, scheduling restart.
Oct 06 13:34:35 r1-ham systemd[1]: vyos-configd.service: Scheduled restart job, restart counter is at 1.
Oct 06 13:34:35 r1-ham systemd[1]: Stopped VyOS configuration daemon.
Oct 06 13:34:35 r1-ham systemd[1]: Started VyOS configuration daemon.

Thanks @mpsl this information is very useful — the ethtool errors are an artifact of the restart, as were seen in the resolved bug T2952 mentioned above. I wanted to rule out running in Docker (possible, thanks to recent work by a colleague, but not common) due to known locale issues; by the way, what is the output of ‘locale’ on that system?

@jestabro locale looks normal to me:

vyos@r1-ham# locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

I’ve been running into this issue as well when I paste set commands.

Is there a workaround?

Oct 07 13:48:41 core2.sba vyos-configd[10541]: Traceback (most recent call last):
Oct 07 13:48:41 core2.sba vyos-configd[10541]:   File "/usr/libexec/vyos/services/vyos-configd", line 217, in <module>
Oct 07 13:48:41 core2.sba vyos-configd[10541]:     config = initialization(socket)
Oct 07 13:48:41 core2.sba vyos-configd[10541]:   File "/usr/libexec/vyos/services/vyos-configd", line 132, in initialization
Oct 07 13:48:41 core2.sba vyos-configd[10541]:     session_string = socket.recv().decode()
Oct 07 13:48:41 core2.sba vyos-configd[10541]: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 127032: invalid continuation byte

@JessterSB Not as of yet, but I will update the forum post as soon as one is available.

I’m not a programmer by any means, but I made a change to

/usr/libexec/vyos/services/vyos-configd

which is now letting me commit.

Wherever I saw a decode() call, I inserted the arguments "utf-8", "ignore":

--- /usr/libexec/vyos/services/vyos-configd.backup      2020-10-08 12:18:25.612000000 +0000
+++ /usr/libexec/vyos/services/vyos-configd     2020-10-08 13:26:11.084603235 +0000
@@ -136,7 +136,7 @@
     session_string = ''
     # check first for resent init msg, in case of client timeout
     while True:
-        msg = socket.recv().decode()
+        msg = socket.recv().decode("utf-8", "ignore")
         try:
             message = json.loads(msg)
             if message["type"] == "init":
@@ -149,10 +149,10 @@
     active_string = msg
     resp = "active"
     socket.send(resp.encode())
-    session_string = socket.recv().decode()
+    session_string = socket.recv().decode("utf-8", "ignore")
     resp = "session"
     socket.send(resp.encode())
-    pid_string = socket.recv().decode()
+    pid_string = socket.recv().decode("utf-8", "ignore")
     resp = "pid"
     socket.send(resp.encode())

@@ -240,7 +240,7 @@

     while True:
         #  Wait for next request from client
-        msg = socket.recv().decode()
+        msg = socket.recv().decode("utf-8", "ignore")
         logger.debug(f"Received message: {msg}")
         message = json.loads(msg)
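
For anyone following along, here is a minimal sketch of what the two extra arguments change. The payload is made up for illustration (it is not the actual message configd receives), and `decode_modes` is a hypothetical helper name. It also shows the "replace" error handler for comparison, since "ignore" drops the offending byte silently while "replace" leaves a visible U+FFFD marker.

```python
# Hypothetical payload: 0xf1 announces a 4-byte UTF-8 sequence, but the next
# byte (0x20, a space) is not a continuation byte, so strict decoding fails.
payload = b"set nat \xf1 rule"


def decode_modes(data: bytes) -> dict:
    """Decode the same bytes with three different error handlers."""
    results = {}
    try:
        # Default behaviour of bytes.decode(): utf-8 with errors="strict".
        results["strict"] = data.decode()
    except UnicodeDecodeError as err:
        results["strict"] = f"error: {err.reason}"
    # The workaround from the diff above: undecodable bytes are dropped.
    results["ignore"] = data.decode("utf-8", "ignore")
    # Alternative: undecodable bytes become U+FFFD so the damage stays visible.
    results["replace"] = data.decode("utf-8", "replace")
    return results


for mode, text in decode_modes(payload).items():
    print(f"{mode:8} {text!r}")
```

With "ignore" the daemon keeps running, but any byte that fails to decode disappears from the string without a trace, so this hides the problem rather than explaining it.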

@JessterSB Yes, that is reasonable for this issue, and I will make that change as a temporary workaround until we have a root cause. Notice that in the three separate reports it is the same byte at the same position, independent of configuration, so it is clearly spurious. Thanks for the information.

I did notice that. I spent some time in Python trying to seek to position 127032, which doesn’t exist for me.
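
One thing that may help when hunting for the byte: judging by the traceback, the position is an offset into the bytes returned by socket.recv(), which need not line up with offsets in the saved config file. A rough sketch of how the failing offset and its surroundings could be captured from the exception itself follows. `describe_decode_failure` is a hypothetical helper, not part of vyos-configd, and the sample bytes are invented.

```python
from typing import Optional


def describe_decode_failure(data: bytes, context: int = 16) -> Optional[str]:
    """Return a short report on the first undecodable byte, or None if clean."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start / err.end index into the same bytes object that was decoded.
        lo = max(err.start - context, 0)
        hi = min(err.end + context, len(data))
        return (
            f"{len(data)} bytes received, bad byte {data[err.start]:#04x} "
            f"at offset {err.start}, context {data[lo:hi]!r}"
        )
    return None


# Invented sample: ten valid bytes, a stray 0xf1 plus a space, ten more bytes.
sample = b"a" * 10 + b"\xf1 " + b"b" * 10
print(describe_decode_failure(sample))
```

Logging the total length next to the offset would show right away whether position 127032 falls inside the received message at all.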

Task description updated:
https://phabricator.vyos.net/T2931
and workaround added.