Concurrent API calls from multiple nodes changing the same settings cause configuration loss

Hi all,
I ran into an issue: I wrote a shell script that checks an FQDN and then calls the API to write the IP into a firewall group on the server. Eight nodes run this script from crontab. Sometimes this causes the server node to lose most of its configuration. Any idea how to fix this bug?
VyOS LTS 1.3.3 still has this issue.

@echowings just to clarify: do you mean that an existing section of the configuration is corrupted, or that the intended settings are not successfully set?

The settings themselves are applied fine. But when multiple nodes update the firewall group through the API at the same time, `/config/config.boot` gets corrupted: a part of the configuration goes missing. After VyOS reboots, it no longer works.

Thanks, @echowings. I will try to reproduce the error and follow up.


@echowings if you could share some details of your script, that would be helpful. I did not expect, nor do I see, any issue under load from multiple connections with ‘configure’ requests. Does your script add ‘save’ requests after the configure, or anything else? Thanks.

Here you go. BTW, a single node updating through the API works fine. If more than one node calls the API to update the network IP, config.boot ends up broken.

#!/bin/bash
...


I suspect the concurrent calls to save the config file. Secondly, there is a known issue with concurrent calls to show config, namely the ‘retrieve’ endpoint, which is fixed in 1.4. The latter is:
https://vyos.dev/T5006
https://vyos.dev/T5305
(The second fixes an overreach in the first.) It is a one-line fix, so if you are willing and able, I would suggest:
(1) drop the concurrent saves for the moment, if that is possible in the workflow, and see if that resolves the issue; if not:
(2) if you are willing to modify the server code, add the following and restart the server. The details of what is going on with the framework/async questions are described in the T5305 task.

diff --git a/src/services/vyos-http-api-server b/src/services/vyos-http-api-server
index ece3aa9b2..a5dccb3b4 100755
--- a/src/services/vyos-http-api-server
+++ b/src/services/vyos-http-api-server
@@ -480,7 +480,7 @@ def configure_op(data: Union[ConfigureModel, ConfigureListModel]):
     return success(None)

 @app.post("/retrieve")
-def retrieve_op(data: RetrieveModel):
+async def retrieve_op(data: RetrieveModel):
     session = app.state.vyos_session
     env = session.get_session_env()
     config = vyos.config.Config(session_env=env)
-- 
2.41.0

I’m certainly very curious to hear any results on the issue; thanks.
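
Option (1) above can be sketched on the client side: each node sends only the ‘configure’ request, and the ‘save’ request is either left out or issued from one place. This is a sketch under assumptions, not the script from this thread. It only builds the request bodies and sends nothing. It assumes the `/configure` and `/config-file` endpoints and the `data` and `key` form fields of the VyOS HTTP API; `build_update_requests`, the group name, the address, and the key are hypothetical placeholders.

```python
import json


def build_update_requests(group, address, api_key, save=False):
    """Return (endpoint, form_fields) pairs for one address-group update.

    Hypothetical helper: it builds the request bodies only, so it can be
    inspected or tested without a router.
    """
    configure = {
        "op": "set",
        "path": ["firewall", "group", "address-group", group, "address", address],
    }
    requests = [("/configure", {"data": json.dumps(configure), "key": api_key})]
    if save:
        # Only a single, designated caller should ask for a save.
        save_body = {"op": "save"}
        requests.append(("/config-file", {"data": json.dumps(save_body), "key": api_key}))
    return requests


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for endpoint, fields in build_update_requests("MY-GROUP", "192.0.2.10", "MY-API-KEY"):
        print(endpoint, fields["data"])
```

With `save=False` on all eight nodes, no two nodes can write the config file at the same moment; the save can then be done once, by hand or from one node.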

I only run VyOS 1.3.3 LTS, since VyOS 1.4 does not fit my needs right now.
The question is: does this patch work on VyOS 1.3.3?
Is there any plan to fix the issue in VyOS 1.3.x?
Some features of VyOS 1.3 were dropped in VyOS 1.4.

sudo find / | grep vyos-http-api-server
/usr/lib/live/mount/rootfs/1.3.3.squashfs/usr/libexec/vyos/services/vyos-http-api-server
/usr/libexec/vyos/services/vyos-http-api-server

Modify the file like this, right?

sudo vi /usr/libexec/vyos/services/vyos-http-api-server
@app.post("/retrieve")
#def retrieve_op(data: RetrieveModel):
async def retrieve_op(data: RetrieveModel):
    session = app.state.vyos_session
    env = session.get_session_env()
    config = vyos.config.Config(session_env=env)

    op = data.op
    path = " ".join(data.path)

Yes, that is the suggested change; testing and the backport to 1.3.x are covered in tasks
https://vyos.dev/T5006
https://vyos.dev/T5305
It appears that the issue affects 1.3.x after
https://vyos.dev/T5176
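
Some background on why the single word `async` matters, assuming the server is built on FastAPI (as the `@app.post` decorator suggests): FastAPI runs plain `def` endpoints in a worker thread pool, so several requests can be inside the handler at once, while an `async def` endpoint runs on the single event loop thread, so a handler that never awaits finishes before the next one starts. The sketch below imitates that difference with the standard library only. `OverlapCounter`, `threadpool_style`, and `eventloop_style` are illustrative names, not part of the VyOS server, and `time.sleep` stands in for the config read.

```python
import asyncio
import threading
import time


class OverlapCounter:
    """Tracks how many callers are inside work() at the same moment."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def work(self):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.05)  # stand-in for a blocking, non-thread-safe config read
        with self._lock:
            self.active -= 1


async def threadpool_style(counter):
    # Mimics a plain `def` endpoint: the blocking call goes to a worker thread.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, counter.work)


async def eventloop_style(counter):
    # Mimics an `async def` endpoint with no awaits: runs on the loop thread.
    counter.work()


async def peak_overlap(handler, requests=4):
    """Run `requests` handlers concurrently and report the peak overlap."""
    counter = OverlapCounter()
    await asyncio.gather(*(handler(counter) for _ in range(requests)))
    return counter.peak


if __name__ == "__main__":
    print("thread pool peak:", asyncio.run(peak_overlap(threadpool_style)))
    print("event loop peak:", asyncio.run(peak_overlap(eventloop_style)))
```

The trade-off is that a blocking handler on the event loop holds up every other request until it returns.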


Here is the shell script to modify the file and restart the service so that the change takes effect.
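
The same edit can also be sketched in Python as a pure text transformation. This is a minimal sketch, not the script referred to above: `patch_retrieve_op` is a hypothetical helper, and it assumes the handler is declared at the start of a line exactly as in the diff earlier in the thread. It leaves an already patched or commented-out signature untouched, so running it twice is harmless.

```python
import re

# Matches only the un-patched signature at the start of a line; lines that
# already read "async def retrieve_op(" or "#def retrieve_op(" do not match.
_UNPATCHED = re.compile(r"^def retrieve_op\(", flags=re.MULTILINE)


def patch_retrieve_op(source: str) -> str:
    """Return `source` with the retrieve handler declared as `async def`."""
    return _UNPATCHED.sub("async def retrieve_op(", source, count=1)


if __name__ == "__main__":
    sample = '@app.post("/retrieve")\ndef retrieve_op(data: RetrieveModel):\n'
    print(patch_retrieve_op(sample))
```

To apply it, read `/usr/libexec/vyos/services/vyos-http-api-server`, keep a backup copy, write the patched text back, and restart the API service.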
