DNS forwarding config disappears every reboot

dns
bug
powerdns

#1

My DNS forwarding configuration disappears on every reboot. All other settings persist without problems.
This is on the latest 1.2.0-rolling+201805050337, but it has been happening for quite a while now.

DNS forwarding applies fine after running commit and save, and the changes are reflected in /config/config.boot.
After rebooting, however, DNS forwarding is not running and the forwarding entry is missing from the config.

    dns {
        forwarding {
            cache-size 0
            listen-on eth0
            name-server 10.0.0.3
        }
    }

After the reboot, I need to re-configure dns forwarding, which works until I reboot again.


#3

Unfortunately I can’t reproduce this issue using build: VyOS 1.2.0-rolling+201805060337

cpo@LR1# show service
 dns {
     forwarding {
         cache-size 0
         ignore-hosts-file
         listen-on eth0
         name-server 9.9.9.9
     }
 }

I added a fresh DNS forwarding configuration, saved it, rebooted, and it was still there. The “real” PowerDNS configuration is also valid:

~$ cat /etc/powerdns/recursor.conf
### Autogenerated by vyos-config-dns-forwarding.py on Sun, 06 May 2018 15:04:20 ###
daemon=yes
threads=1
allow-from=0.0.0.0/0
log-common-errors=yes
local-address=172.16.37.240, fe80::250:56ff:feaa:8b61%eth0
max-cache-entries=0
export-etc-hosts=no

# statically configured: 9.9.9.9
forward-zones-recurse=.=9.9.9.9
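
To see at a glance which addresses the recursor will try to bind to, the generated file can be parsed with a few lines of Python. This is only a rough sketch: `parse_recursor_conf` and `listen_addresses` are made-up helper names, the sample text is copied from the output above, and the parser only understands plain `key=value` lines.

```python
# Sample taken from the recursor.conf shown in this post.
SAMPLE = """\
### Autogenerated by vyos-config-dns-forwarding.py on Sun, 06 May 2018 15:04:20 ###
daemon=yes
threads=1
allow-from=0.0.0.0/0
log-common-errors=yes
local-address=172.16.37.240, fe80::250:56ff:feaa:8b61%eth0
max-cache-entries=0
export-etc-hosts=no

# statically configured: 9.9.9.9
forward-zones-recurse=.=9.9.9.9
"""


def parse_recursor_conf(text: str) -> dict:
    """Parse plain `key=value` lines, skipping blank lines and `#` comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values such as ".=9.9.9.9" stay intact.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def listen_addresses(settings: dict) -> list:
    """Return the comma-separated `local-address` entries as a list."""
    raw = settings.get("local-address", "")
    return [addr.strip() for addr in raw.split(",") if addr.strip()]


settings = parse_recursor_conf(SAMPLE)
for addr in listen_addresses(settings):
    print("recursor will bind to:", addr)
```

Note that the list includes the link-local IPv6 address of eth0, not just the IPv4 address.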

#4

I just updated to the latest 1.2.0-rolling+201805061925 and ran some tests. The setting now seems to persist across reboots, and the config output is correct both in the show commands and in the PowerDNS recursor.conf. However, forwarding doesn’t actually seem to work.
Trying to reset the service as shown below doesn’t work either.

vyos@vyos:~$ reset dns forwarding all 
DNS forwarding not configured

Maybe I’m forgetting to set something in my config?

vyos@vyos:~$ sh conf
firewall {
all-ping enable
broadcast-ping disable
config-trap disable
ipv6-receive-redirects disable
ipv6-src-route disable
ip-src-route disable
log-martians disable
name INSIDE-LOCAL {
    default-action accept
}
name INSIDE-OUTSIDE {
    default-action accept
    rule 10 {
        action accept
        state {
            established enable
            related enable
        }
    }
    rule 20 {
        action drop
        destination {
            address 224.0.0.0/4
        }
        protocol udp
    }
    rule 30 {
        action drop
        destination {
            address 255.255.255.255
        }
        protocol udp
    }
}
name LOCAL-INSIDE {
    default-action accept
}
name LOCAL-OUTSIDE {
    default-action accept
    rule 20 {
        action drop
        destination {
            address 224.0.0.0/4
        }
        protocol udp
    }
    rule 30 {
        action drop
        destination {
            address 255.255.255.255
        }
        protocol udp
    }
}
name OUTSIDE-INSIDE {
    default-action drop
    rule 10 {
        action accept
        state {
            established enable
            related enable
        }
    }
}
name OUTSIDE-LOCAL {
    default-action drop
    rule 10 {
        action accept
        state {
            established enable
            related enable
        }
    }
    rule 20 {
        action accept
        description OpenVPN
        destination {
            port 1195
        }
        protocol udp
    }
}
receive-redirects disable
send-redirects enable
source-validation disable
syn-cookies enable
twa-hazards-protection disable
}
interfaces {
ethernet eth0 {
    address 10.0.0.1/24
    description LAN
    duplex auto
    hw-id REDACTED
    smp-affinity auto
    speed auto
}
ethernet eth1 {
    address dhcp
    description WAN
    duplex auto
    hw-id REDACTED
    smp-affinity auto
    speed auto
}
loopback lo {
}
openvpn vtun0 {
    encryption aes256
    hash sha512
    local-address 10.255.1.1 {
    }
    local-port 1195
    mode site-to-site
    persistent-tunnel
    protocol udp
    remote-address 10.255.1.2
    remote-host REDACTED
    remote-port 1195
    tls {
        ca-cert-file /config/auth/ca.crt
        cert-file /config/auth/server0.crt
        dh-file /config/auth/dh2048.pem
        key-file /config/auth/server0.key
        role passive
    }
}
}
nat {
destination {
}
source {
    rule 10 {
        outbound-interface eth1
        source {
            address 10.0.0.0/24
        }
        translation {
            address masquerade
        }
    }
}
}
protocols {
static {
    interface-route 10.0.1.0/24 {
        next-hop-interface vtun0 {
        }
    }
}
}
service {
dhcp-server {
    disabled false
    hostfile-update enable
    shared-network-name local {
        authoritative enable
        subnet 10.0.0.0/24 {
            default-router 10.0.0.1
            dns-server 10.0.0.1
            domain-name local
            domain-search local
            lease 86400
            start 10.0.0.50 {
                stop 10.0.0.240
            }
        }
    }
}
dns {
    dynamic {
        interface eth1 {
            service zoneedit {
                host-name REDACTED
                login REDACTED
                password ****************
            }
        }
    }
    forwarding {
        cache-size 0
        listen-on eth0
        name-server 10.0.0.3
    }
}
ssh {
    listen-address 10.0.0.1
    port 22
}
}
system {
config-management {
    commit-revisions 20
}
console {
    device ttyS0 {
        speed 9600
    }
}
domain-name local
host-name vyos
login {
    user vyos {
        authentication {
            encrypted-password ****************
            plaintext-password ****************
        }
        level admin
    }
}
name-server 10.0.0.1
ntp {
    server 0.pool.ntp.org {
    }
    server 1.pool.ntp.org {
    }
    server 2.pool.ntp.org {
    }
}
options {
    ctrl-alt-del-action ignore
    reboot-on-panic true
}
syslog {
    global {
        archive {
            files 3
            size 100
        }
        facility all {
            level notice
        }
        facility protocols {
            level notice
        }
    }
}
time-zone America/Los_Angeles
}
zone-policy {
zone INSIDE {
    default-action drop
    from LOCAL {
        firewall {
            name LOCAL-INSIDE
        }
    }
    from OUTSIDE {
        firewall {
            name OUTSIDE-INSIDE
        }
    }
    interface eth0
    interface vtun0
}
zone LOCAL {
    default-action drop
    from INSIDE {
        firewall {
            name INSIDE-LOCAL
        }
    }
    from OUTSIDE {
        firewall {
            name OUTSIDE-LOCAL
        }
    }
    local-zone
}
zone OUTSIDE {
    default-action drop
    from INSIDE {
        firewall {
            name INSIDE-LOCAL
        }
    }
    from LOCAL {
        firewall {
            name LOCAL-OUTSIDE
        }
    }
    interface eth1
}
}

#5

For whatever reason, my forwarding entries no longer disappear, which is nice. I did a clean install to be sure I’m working from a clean slate.

I still have an issue (related? not sure) with forwarding.
The service doesn’t appear to automatically start successfully upon boot, despite being configured.

Checking the service status with systemctl after a reboot shows the following:

root@vyos:/home/vyos# systemctl status pdns-recursor.service
● pdns-recursor.service - PowerDNS Recursor
Loaded: loaded (/lib/systemd/system/pdns-recursor.service; disabled)
Active: failed (Result: exit-code) since Tue 2018-05-08 01:10:07 PDT; 2min 54s ago
Main PID: 3078 (code=exited, status=1/FAILURE)

//EXTRA LINES REMOVED
May 08 01:10:07 vyos pdns_recursor[3078]: May 08 01:10:07 Exception: Resolver binding to server socket on port 53 for fe80::7a45:c4ff:fe34:e3c1%eth0: Cannot assign requested address
May 08 01:10:07 vyos systemd[1]: pdns-recursor.service: main process exited, code=exited, status=1/FAILURE
May 08 01:10:07 vyos systemd[1]: Unit pdns-recursor.service entered failed state.
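
For reference, “Cannot assign requested address” is the text for the EADDRNOTAVAIL error: the process asked to bind to an address that was not usable on any interface at that moment. The same class of failure can be reproduced with a small Python sketch. `try_bind` is a made-up helper, it handles IPv4 only for simplicity, and it assumes 192.0.2.1 (a documentation-only address) is not configured on the machine.

```python
import errno
import socket


def try_bind(address: str, port: int = 0) -> int:
    """Try to bind a UDP socket to an IPv4 address.

    Returns 0 on success, or the errno of the OSError on failure.
    Port 0 lets the kernel pick a free port, so only the address matters.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((address, port))
        except OSError as exc:
            return exc.errno
    return 0


# The wildcard address can always be bound.
print("0.0.0.0   ->", try_bind("0.0.0.0"))

# Assumption: 192.0.2.1 is not assigned to any local interface.
result = try_bind("192.0.2.1")
print("192.0.2.1 ->", errno.errorcode.get(result, result))
```

If the second call reports EADDRNOTAVAIL, that is the same condition pdns_recursor hit at boot, only for the link-local IPv6 address on eth0 in my case.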

Telling the service to restart with no changes whatsoever allows the service to run just fine:

root@vyos:/home/vyos# systemctl restart pdns-recursor.service
root@vyos:/home/vyos# systemctl status pdns-recursor.service
● pdns-recursor.service - PowerDNS Recursor
  Loaded: loaded (/lib/systemd/system/pdns-recursor.service; disabled)
  Active: active (running) since Tue 2018-05-08 01:14:56 PDT; 1s ago
Main PID: 3463 (pdns_recursor)
CGroup: /system.slice/pdns-recursor.service
       └─3463 /usr/sbin/pdns_recursor --daemon=no

Another thing that works: if I enter configure mode, remove or change any entry related to forwarding, and then commit, the forwarding service starts immediately without issue.

Anybody have any clues as to what’s going on?
Sorry that my discussion has diverged from my original topic a bit - maybe the topic can be moved to a more appropriate category?


#6

I may have spoken too soon.
It looks like the settings are disappearing again, and of course forwarding doesn’t work until I set it up again in configure.

Any files or logs I can provide to help get to the bottom of this?
I still think this is a bug as I have not modified the install in any way, only making changes via configure.


#7

I have managed to reproduce the “disappearing config” fairly easily and consistently in a VM test environment with a fresh VyOS install.
My test environment is a simple KVM VM with a single virtio disk and two virtio NICs, one of which is connected to a DNS server at 192.168.0.7.

NICs are configured as follows:
eth0 192.168.0.177/24
eth1 10.0.0.1/24

Then, set up dns forwarding:
set service dns forwarding listen-on eth1
set service dns forwarding name-server 192.168.0.7

Commit and save shows pdns recursor running.
Reboot and pdns recursor will start with no issue.
Status verified with systemctl status pdns-recursor
Running show configuration also results in expected output.

Now, after a poweroff, set eth1’s NIC connection to disconnected (simulating an unplugged cable) and power the VyOS test environment back on. show configuration is now blank in the dns forwarding section.
systemctl status pdns-recursor also shows that pdns-recursor is not running.
Running systemctl start pdns-recursor fails just like on my production machine, with complaints about binding to an address.

If you then reconnect the virtual cable, run systemctl start pdns-recursor, and check the status, you will see that it starts up without a problem.
Running show configuration after this does not restore the “disappearing” configuration. The missing configuration only reappears after rebooting with the cable connected…

While I have no spare bare metal hardware to test this on, I suspect we will see the same behavior by physically unplugging cables.

My theory is that the interfaces come up and get their IP addresses too slowly, and that pdns-recursor tries to start before the interface is fully up and ready. (On my real machine I’m obviously NOT unplugging cables.)

Is there anything I can do to combat this? Is this a bug?
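
One way to test the timing theory would be to poll until the listen address can actually be bound, and only then start the recursor. Below is a minimal Python sketch of that idea. `can_bind` and `wait_for_address` are hypothetical helpers, the timeout and interval defaults are arbitrary, and it only handles IPv4 (the failure in my log was on the IPv6 link-local address, which would need an AF_INET6 socket instead).

```python
import socket
import time


def can_bind(address: str) -> bool:
    """Return True if a UDP socket can currently be bound to the IPv4 address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # Port 0 means "any free port"; only the address is being probed.
            sock.bind((address, 0))
        except OSError:
            return False
    return True


def wait_for_address(address: str, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll until `address` is bindable or `timeout` seconds have passed.

    Returns True as soon as a bind succeeds, False if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        if can_bind(address):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A boot-time helper could call `wait_for_address("10.0.0.1")` and run `systemctl start pdns-recursor` only once it returns True. If the service then starts reliably, that would support the idea that the recursor is simply started before the address is ready.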