[VyOS 1.4 - epa2] Domain-group policy routing fails when clients use external DNS servers

Description:
Domain-group based policy routing currently only works when clients use VyOS’s DNS resolver. When clients use external DNS servers (e.g., 8.8.8.8, 1.1.1.1, or DoH/DoT), the resolved IP addresses may differ from VyOS’s cache, causing policy routing to fail for those domains.

Steps to Reproduce:

  1. Configure a domain-group containing an FQDN (e.g., chat.openai.com)
  2. Set up policy routing for traffic matching that domain-group
  3. On a client machine, configure an external DNS server
  4. Verify that the client gets different IPs than VyOS for the domain
  5. Observe that traffic to that domain bypasses the policy route
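
Step 4 can be checked mechanically. The sketch below is a hypothetical helper, not a VyOS tool: it takes the addresses the client resolved and the elements of the router's set (as printed by `nft list set`), and returns the client addresses that the set does not cover. It assumes the elements are single addresses or CIDR prefixes; the addresses in the usage lines are documentation-range placeholders.

```python
import ipaddress


def uncovered_addresses(client_ips, set_elements):
    """Return the client-resolved IPs that the router's set does not cover.

    set_elements may hold single addresses or CIDR prefixes.
    Hypothetical helper for illustration only.
    """
    networks = [ipaddress.ip_network(e, strict=False) for e in set_elements]
    missing = []
    for ip in client_ips:
        addr = ipaddress.ip_address(ip)
        # An address of a different IP version never matches a network.
        if not any(addr in net for net in networks):
            missing.append(ip)
    return missing


router_set = ["192.0.2.10"]                    # what VyOS resolved
client_view = ["192.0.2.10", "198.51.100.7"]   # what the client resolved
print(uncovered_addresses(client_view, router_set))  # ['198.51.100.7']
```

Any address in the returned list is one the policy route rule would not match, so traffic to it follows the default routing table.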

Expected Behavior:
VyOS should route based on the FQDN itself, or should otherwise reconcile client and gateway DNS results so that domain‑based routing always works, regardless of the DNS server the client uses.

Actual Behavior:
Policy routing only works when clients use VyOS’s DNS resolver, or when the external DNS server returns the same IPs as VyOS’s resolver.

Workarounds Tried:

  1. Forcing all DNS through VyOS (e.g., NAT‑redirecting port 53) — not acceptable in my environment.
  2. Static IP lists in the domain-group (requires constant manual updates).

Questions / Feature Request:

  1. Is there an existing VyOS configuration or recommended plugin that enables true FQDN‑based routing without forcing all client DNS through VyOS?
  2. Are there plans to add deep‑packet inspection (e.g., matching TLS SNI or HTTP Host headers) to domain-group so that routing can be applied directly based on domain names, regardless of client DNS settings?

Thank you for any guidance or suggestions!

When using domain-based routing, you should already be aware that you might be doing something incorrect in the first place.

With the current state of the internet, DNS records can change depending on a lot of parameters, and you cannot expect something like a router to take that into account. It would require continuous polling of DNS for changes and some knowledge of the strategy the domain owner uses to hand out IPs. They could change the records based on load, pricing, time, maintenance, etc.

This feature was maybe useful 15 years ago, but IMHO it should be removed from VyOS for exactly the problems you stated.

Regards,

Sander

As far as I know, 1.4 EPA doesn’t have the ability to configure policy routing based on a domain-group.
Something is wrong here 🙂
Or you are using rolling instead.
In any case, some domains sit behind CDNs/load balancers. For example, Google has millions of IPs, but a name will resolve to one of them based on geolocation, latency, losses, etc.
So you simply cannot use this for policy routing or firewalling. The name will resolve to a different IP each time, whichever one they choose 😉
For static domains it is OK.

Also, if a client uses another DNS server and you do not intercept/redirect that traffic to VyOS, the client can get different resolved addresses.

Without the configuration/topology and examples it is difficult to say, but yeah, something is wrong with the original idea. I’d start with this.
I do not see any bugs related to VyOS at the moment.

This is exactly the scenario I’m trying to address using VyOS. Since many services are hosted behind CDNs or load balancers, the same domain name can resolve to different IP addresses, which frequently change. When granular traffic control is required for specific services or domains, relying solely on IP prefix-based routing in VyOS becomes both cumbersome and difficult to maintain.

To solve this, I tried adding the target domains to VyOS firewall domain groups and applying policy-based routing to direct the corresponding traffic to different egress interfaces. However, VyOS appears to lack the deep packet inspection (DPI) capabilities provided by devices such as FortiGate or Palo Alto, which often results in Domain‑Based Routing not working during real-world testing.

Here is my configuration:

set firewall group domain-group US-EGRESS address 'linkedin.com'
set firewall group domain-group US-EGRESS address 'app.loomly.com'
set firewall group domain-group SECURITY-BLOCK address 'vyos.net'

set policy route DOMAIN-POLICY interface 'eth2'
set policy route DOMAIN-POLICY rule 10 destination group domain-group 'US-EGRESS'
set policy route DOMAIN-POLICY rule 10 set table '10'

set policy route DOMAIN-POLICY rule 100 action 'reject'
set policy route DOMAIN-POLICY rule 100 destination group domain-group 'SECURITY-BLOCK'

set protocols static table 10 route 0.0.0.0/0 next-hop 172.16.50.1

I did some quick testing and there does appear to be a bug. Using linkedin.com as an example: when you create the domain-group, the address is populated correctly in the mangle table:

set firewall group domain-group linkedin address linkedin.com
vyos@vyos# sudo nft list set vyos_mangle D_linkedin
table ip vyos_mangle {
        set D_linkedin {
                type ipv4_addr
                flags interval
                elements = { 150.171.22.12 }
        }
}

Once you make a change to a policy route, the elements are empty:

vyos@vyos# set policy route test interface eth1
vyos@vyos# set policy route test rule 10 destination group domain-group linkedin 
vyos@vyos# set policy route test rule 10 set dscp 63
vyos@vyos# sudo nft list set vyos_mangle D_linkedin
table ip vyos_mangle {
        set D_linkedin {
                type ipv4_addr
                flags interval
        }
}

If you make an arbitrary change to the firewall config, then the element is repopulated again and PBR works:

vyos@vyos# set firewall group domain-group linkedin description "linkedin"
vyos@vyos# sudo nft list set vyos_mangle D_linkedin
table ip vyos_mangle {
        set D_linkedin {
                type ipv4_addr
                flags interval
                elements = { 150.171.22.12 }
        }
}
sudo tcpdump -i eth0 'ip[1] = 0xfc'
tcpdump: listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
11:27:59.233363 IP (tos 0xfc, ttl 62, id 0, offset 0, flags [DF], proto TCP (6), length 162)
    x.x.x.x.64204 > 150.171.22.12.443: Flags [P.], cksum 0x460a (correct), seq 2857930634:2857930756, ack 483909214, win 4096, length 122
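
For anyone wondering why the tcpdump filter is `ip[1] = 0xfc`: byte 1 of the IPv4 header is the former TOS byte, with DSCP in the upper six bits and ECN in the lower two. The arithmetic linking `set dscp 63` to `0xfc`, as a small sketch (the function name is just illustrative):

```python
def dscp_to_tos(dscp, ecn=0):
    """Build the IPv4 TOS byte: DSCP in the upper 6 bits, ECN in the lower 2."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must be in 0..3")
    return (dscp << 2) | ecn


print(hex(dscp_to_tos(63)))  # 0xfc
```

Note that the filter is an exact match on the whole byte, so it only catches packets whose ECN bits are zero.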

This is present on rolling and 1.4.2
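
To spot the empty-set state without reading the listing by eye, the `nft list set` output can be parsed. A minimal sketch, assuming the plain-text format shown in the listings above; the function name is hypothetical, and the sample strings are copied from the outputs in this post.

```python
import re


def parse_nft_set_elements(nft_output):
    """Extract the elements of a set from `nft list set` text output.

    Returns an empty list when the set has no `elements = { ... }` clause,
    which is the state seen after the policy route change.
    """
    # [^}]* also spans newlines, in case nft wraps a long element list.
    match = re.search(r"elements\s*=\s*\{([^}]*)\}", nft_output)
    if match is None:
        return []
    return [e.strip() for e in match.group(1).split(",") if e.strip()]


populated = """table ip vyos_mangle {
        set D_linkedin {
                type ipv4_addr
                flags interval
                elements = { 150.171.22.12 }
        }
}"""

empty = """table ip vyos_mangle {
        set D_linkedin {
                type ipv4_addr
                flags interval
        }
}"""

print(parse_nft_set_elements(populated))  # ['150.171.22.12']
print(parse_nft_set_elements(empty))      # []
```

Running this against the output captured before and after a policy route commit would show whether the set was flushed.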
