PowerDNS Recursor feels very slow

Have a wonderful day,

For two weeks now I have had a 1000/250 Mbit fiber optic connection, which I run very successfully with VyOS (1.5-rolling-202404141045). Thank you very much for this software!

I have also enabled pdns-recursor because I wanted to run VyOS without an external DNS server. This works, but resolution is very slow. It is clearly noticeable during normal browsing; the line feels like an old DSL connection.
As soon as I configure external DNS servers again (Quad9, for example), resolution is blazingly fast and everything works as it should.

Does anyone here have an idea, or can anyone point me in the right direction on how to debug this?

Thank you in advance.

Maybe your ISP is blocking some DNS queries from pdns-recursor?
Also, it is normal for it to be slow initially: the external DNS server already has a lot of data in its cache.
Are there any DNS timeout error messages in the logs?

No, I doubt that my ISP is blocking anything.
I am also aware that the cache has to be filled first, but queries sometimes take 2-4 seconds. I also run another pdns-recursor, and it is nowhere near as slow as this one.

Here is part of the log.

Apr 30 14:51:40 pdns-recursor[128737]: PowerDNS Recursor 4.8.7 (C) 2001-2022 PowerDNS.COM BV
Apr 30 14:51:40 pdns-recursor[128737]: Using 64-bits mode. Built using gcc 12.2.0.
Apr 30 14:51:40 pdns-recursor[128737]: PowerDNS comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it according to the terms of the GPL version 2.
Apr 30 14:51:40 pdns-recursor[128737]: msg="Enabling IPv4 transport for outgoing queries" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.457"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Enabling IPv6 transport for outgoing queries" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.457"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Setting access control" subsystem="config" level="0" prio="Info" tid="0" ts="1714481500.462" acl="allow-from" addresses="192.168.150.0/24 127.0.0.1/32"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Will not send queries to" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" addresses="127.0.0.0/8 10.0.0.0/8 100.64.0.0/10 169.254.0.0/16 192.168.0.0/16 172.16.0.0/12 ::1/128 fc00::/7 fe80::/10 0.0.0.0/8 192.0.0.0/24 192.0.2.0/24 198.51.100.0/24 203.0.113.0/24 240.0.0.0/4 ::/96 ::ffff:0:0/96 100::/64 2001:db8::/32 0.0.0.0 ::"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Reading zone forwarding information" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" file="/run/pdns-recursor/recursor.forward-zones.conf"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Done parsing forwarding instructions from file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" count="1" file="/run/pdns-recursor/recursor.forward-zones.conf"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="localhost"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="1.0.0.127.in-addr.arpa"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="vyos"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="1.1.0.127.in-addr.arpa"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Will not overwrite already loaded zone" subsystem="config" level="0" prio="Warning" tid="0" ts="1714481500.476" zone="localhost"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="ip6-localhost"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="ip6-loopback"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="ip6-localnet"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.e.f.ip6.arpa"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="ip6-mcastprefix"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.f.f.ip6.arpa"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="ip6-allnodes"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.f.f.ip6.arpa"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.476" zone="ip6-allrouters"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.477" zone="2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.f.f.ip6.arpa"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.477" zone="glasfasermodem"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.477" zone="1.1.10.10.in-addr.arpa"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Aggressive NSEC/NSEC3 caching is enabled but DNSSEC validation is not set to 'validate', 'log-fail' or 'process', ignoring" subsystem="config" level="0" prio="Warning" tid="0" ts="1714481500.477"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Listening for queries" subsystem="config" level="0" prio="Info" tid="0" ts="1714481500.477" address="192.168.150.1" proto="UDP"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Listening for queries" subsystem="config" level="0" prio="Info" tid="0" ts="1714481500.477" address="127.0.0.1" proto="UDP"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Enabled TCP data-ready filter for (slight) DoS protection" subsystem="config" level="0" prio="Info" tid="0" ts="1714481500.477"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Listening for queries" subsystem="config" level="0" prio="Info" tid="0" ts="1714481500.477" address="192.168.150.1" protocol="TCP"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Listening for queries" subsystem="config" level="0" prio="Info" tid="0" ts="1714481500.477" address="127.0.0.1" protocol="TCP"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Operating with single distributor/worker thread" subsystem="config" level="0" prio="Notice" tid="0" ts="1714481500.478"
Apr 30 14:51:40 pdns-recursor[128737]: msg="Enabled multiplexer" subsystem="runtime" level="0" prio="Info" tid="0" ts="1714481500.479" name="epoll"
Apr 30 14:51:40 systemd[1]: Started pdns-recursor.service - PowerDNS Recursor.
Apr 30 14:51:56 pdns-recursor[128737]: msg="Not validating response for security status update, this is a non-release version" subsystem="housekeeping" level="0" prio="Warning" tid="0" ts="1714481516.469" query="recursor-4.8.7-1.Debian.security-status.secpoll.powerdns.com" version="4.8.7-1.Debian"
Apr 30 14:52:02 pdns-recursor[128737]: msg="Sending SERVFAIL during resolve" error="Too much time waiting for ns-824.awsdns-39.net|A, timeouts: 5, throttles: 0, queries: 11, 7890msec" subsystem="syncres" level="0" prio="Notice" tid="1" ts="1714481522.065" ecs="" mtid="14" proto="udp" qname="mask.apple-dns.net" qtype="A" remote="192.168.150.25:62083"
Apr 30 14:52:09 pdns-recursor[128737]: msg="Sending SERVFAIL during resolve" error="Too much time waiting for ns-844.awsdns-41.net|A, timeouts: 5, throttles: 0, queries: 19, 8421msec" subsystem="syncres" level="0" prio="Notice" tid="1" ts="1714481529.491" ecs="" mtid="21" proto="udp" qname="logfiles.zoom.us" qtype="A" remote="192.168.150.44:53154"
Apr 30 14:52:10 pdns-recursor[128737]: msg="Sending SERVFAIL during resolve" error="Too much time waiting for ns-1286.awsdns-32.org|A, timeouts: 5, throttles: 0, queries: 17, 8434msec" subsystem="syncres" level="0" prio="Notice" tid="1" ts="1714481530.551" ecs="" mtid="22" proto="udp" qname="logfiles.zoom.us" qtype="A" remote="192.168.150.44:53154"
Apr 30 14:52:27 pdns-recursor[128737]: msg="Sending SERVFAIL during resolve" error="Too much time waiting for ns3.websupport.eu|A, timeouts: 4, throttles: 0, queries: 8, 7038msec" subsystem="syncres" level="0" prio="Notice" tid="1" ts="1714481547.368" ecs="" mtid="44" proto="udp" qname="1.0.0.127.bip.virusfree.cz" qtype="A" remote="192.168.150.2:42069"
Apr 30 14:52:43 pdns-recursor[128737]: msg="Sending SERVFAIL during resolve" error="Too much time waiting for lb._dns-sd._udp.0.129.37.10.in-addr.arpa|PTR, timeouts: 5, throttles: 0, queries: 11, 8270msec" subsystem="syncres" level="0" prio="Notice" tid="1" ts="1714481563.917" ecs="" mtid="57" proto="udp" qname="lb._dns-sd._udp.0.129.37.10.in-addr.arpa" qtype="PTR" remote="192.168.150.44:60907"
Apr 30 14:53:30 pdns-recursor[128737]: msg="Sending SERVFAIL during resolve" error="Too much time waiting for 1.0.0.127.ix.dnsbl.manitu.net|A, timeouts: 4, throttles: 0, queries: 8, 7027msec" subsystem="syncres" level="0" prio="Notice" tid="1" ts="1714481610.322" ecs="" mtid="122" proto="udp" qname="1.0.0.127.ix.dnsbl.manitu.net" qtype="A" remote="192.168.150.2:50027"
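
For anyone who wants to tally the SERVFAIL entries in a log like the one above, here is a minimal Python sketch. It is not part of the original post: the regular expression and the name summarize_servfail are assumptions derived only from the line format shown in this thread, and other recursor versions may log differently.

```python
import re
from collections import Counter

# Pattern assumed from the log lines in this thread; it is not an official format.
SERVFAIL_RE = re.compile(
    r'error="Too much time waiting for (?P<target>[^|]+)\|(?P<qtype>\w+), '
    r'timeouts: (?P<timeouts>\d+), throttles: (?P<throttles>\d+), '
    r'queries: (?P<queries>\d+), (?P<msec>\d+)msec"'
)

def summarize_servfail(lines):
    """Return (SERVFAIL count, total outgoing timeouts, slowest msec, targets)."""
    count = 0
    total_timeouts = 0
    slowest = 0
    targets = Counter()
    for line in lines:
        match = SERVFAIL_RE.search(line)
        if match is None:
            continue  # not a "Too much time waiting" SERVFAIL line
        count += 1
        total_timeouts += int(match["timeouts"])
        slowest = max(slowest, int(match["msec"]))
        targets[match["target"]] += 1
    return count, total_timeouts, slowest, targets

# Shortened copies of two lines from the log above, plus one unrelated line.
sample = [
    'msg="Sending SERVFAIL during resolve" error="Too much time waiting for '
    'ns-824.awsdns-39.net|A, timeouts: 5, throttles: 0, queries: 11, 7890msec"',
    'msg="Sending SERVFAIL during resolve" error="Too much time waiting for '
    'ns3.websupport.eu|A, timeouts: 4, throttles: 0, queries: 8, 7038msec"',
    'msg="Enabled multiplexer" subsystem="runtime"',
]
count, total_timeouts, slowest, targets = summarize_servfail(sample)
print(count, total_timeouts, slowest, dict(targets))
```

Feeding it the full journal output instead of the sample list would show which upstream name servers time out most often, which may help narrow down whether one transport (for example IPv6) is affected.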

How is the performance if you resolve directly from your client using dig?

Such as:

dig @8.8.8.8 ping.sunet.se

or

dig @8.8.8.8 ping.sunet.se +trace

Also, do you know if the MTU is 1500, or do you have something else through your ISP?

Clearly there are some timeouts; try disabling IPv6 to see if it helps.

I also notice that the recursor is noticeably slower than an external DNS server (Cloudflare etc.).

Can someone provide a set of commands to reproduce this?
And the timings from dig?
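
One possible way to get comparable numbers without dig is a small Python sketch like the following. It is only an illustration under stated assumptions: the helper names build_query, time_query and compare are hypothetical, the sketch sends a single plain UDP query per attempt, and it does not validate the reply beyond waiting for the first packet.

```python
import socket
import struct
import time

def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def time_query(server, name, timeout=5.0):
    """Return the time to the first reply in milliseconds, or None on timeout."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        try:
            sock.sendto(packet, (server, 53))
            sock.recvfrom(4096)  # the reply content is not checked here
        except socket.timeout:
            return None
        return (time.perf_counter() - start) * 1000.0

def compare(servers, name, attempts=3):
    """Print timings for the same name against several DNS servers."""
    for server in servers:
        for attempt in range(attempts):
            print(server, attempt, time_query(server, name))
```

Calling compare(["192.168.150.1", "9.9.9.9"], "ping.sunet.se") would query the recursor address seen in the log and Quad9 in turn. The first attempt against the recursor shows the cold-cache time, later attempts the cached time.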

I'm currently doing some testing here and have done the following so far:

  1. deactivated IPv6 (I will reactivate it when the opportunity arises)
  2. enabled zone to cache (see "Zone to Cache" in the PowerDNS Recursor documentation)

I don't really know which of these is responsible, but DNS is "FEELING" much faster. I will try to show something measurable in the course of the next week; currently I have little time for it.

Anyway, zone to cache should be included in the VyOS configuration, as it is also part of the performance tuning of pdns-recursor.
https://doc.powerdns.com/recursor/performance.html#running-with-a-local-root-zone

I can ask the people at PowerDNS if this is necessary for optimal performance, I have direct contacts there.

I dumped the PowerDNS instance hosted on VyOS. It always timed out. After replacing it with mosdns, everything works well. This has been a problem for a long time.

You may be on to something with 'zone to cache'. My DNS is also "FEELING" much faster.

Your log shows several timeouts, firewall or MTU? Maybe post your config?

So I have to say that everything is working so far, and I don't notice any difference from a public DNS.
However, I still don't know whether deactivating IPv6 or adding ZoneToCache is responsible for this.

Before this change, the resolution times of the pdns-recursor were unbearably slow and clearly noticeable.

I'll let it continue like this for now, switch IPv6 on again when I get the chance, and see what happens.

Nevertheless, ZoneToCache should be added to the VyOS configuration! Is there a dev reading here who could integrate this?

@SaulGoodman I created the feature request T6294
Do you have any idea for the CLI commands?

I think it makes sense to have ZoneToCache as part of the default Lua config.

Or it could look like this:
set service dns forwarding options zonetocache refresh 0
or... something like this, if Lua scripting were exposed by the CLI:
set service dns forwarding lua ...

Reading the manual, ZoneToCache seems to preload the zone by initiating a zone transfer, which most authoritative servers will block for unknown resolvers.

So I'm not sure that this really is the fix in your case, and second, I'm not sure it should be a default, since it will only work in a very limited corner case (where the authoritative server is misconfigured so that it allows zone transfers (AXFR) from anybody).

We are just talking about this part of the Performance Guide:

  • The second method is to cache the root zone as described in Zone to Cache. Here each Recursor will download and fill its cache with the contents of the root zone. Depending on the timeout parameter, this will be done once or periodically. Refer to Zone to Cache for details.

To load the root zone from Internic into the recursor once at startup and when the Lua config is reloaded:

zoneToCache(".", "url", "https://www.internic.net/domain/root.zone", { refreshPeriod = 0 })

The root-hints file needed to use the DNS server as a resolver towards the Internet should be preloaded manually, not fetched each time the server starts (in order not to abuse the internic.net servers).

Just to clarify: the performance guide is referring to the root zone file, not the root hints.

Ah, right, but I still think everybody should stop abusing the internic.net web servers for this information; request it through the DNS system itself, which is what it is meant for.

Totally agree with @Apachez. The root RRs have TTLs of 1-6 days, so they are naturally cached by any recursive server for days. The server operators adjust the TTL based on their load requirements. With such long TTLs, the end user sees only a very small performance impact: in the worst case, one cache miss per day, or a miss on the first visit to a TLD that has not been visited in the past few days.
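
To put a rough number on that worst case, here is a back-of-the-envelope Python sketch. Every figure in it (queries per day, number of TLDs in use, cost of one miss) is an illustrative assumption, not a measurement, and the function name extra_ms_per_query is hypothetical.

```python
def extra_ms_per_query(queries_per_day, tlds_in_use, miss_cost_ms, ttl_days=1.0):
    """Worst-case average latency added per query by expired root-zone records.

    Assumes every TLD in use misses the cache exactly once per TTL period.
    """
    misses_per_day = tlds_in_use / ttl_days
    return misses_per_day * miss_cost_ms / queries_per_day

# Illustrative assumptions: 20000 client queries per day, 30 distinct TLDs,
# 50 ms for one extra round trip to a root server, TTL of one day.
estimate = extra_ms_per_query(queries_per_day=20000, tlds_in_use=30, miss_cost_ms=50)
print(f"{estimate:.3f} ms added per query on average")
```

Under these assumptions the averaged cost is well below a millisecond per query, which is nowhere near the multi-second delays reported at the start of the thread.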