Active-active HA stateful firewall and DHCPv6

Dear VyOS Community, I am facing a few problems while testing VyOS 1.4 in an active-active HA configuration, specifically with the stateful firewall and DHCPv6. I would appreciate any ideas on how to fix or work around them.

Our topology:
We have 4 routers with BGP for IPv6 running in the active-active configuration:

  • Two core routers that border the Internet and pull in the /36 AS block of our public IPv6 addresses.
  • OSPF between the access and core routers.
  • Two access routers behind the core routers, running the DHCPv6 server, which issues public IPv6 addresses from the AS block to the host servers. The AS block is split into networks A, B, etc.
  • Servers are assigned public IPv6 addresses from NetworkA via DHCPv6 by the access routers.
  • Public IPv6 addresses from NetworkB are assigned to the pods in the k8s cluster using the BGP IP pool, and NetworkB is routed through the IPv6 addresses assigned to the servers (from NetworkA).

The Problem:

  1. DHCPv6 does not synchronize leases between the routers in the HA cluster the way DHCPv4 currently does. I guess the answer will be T5044 (High Availability in DHCPv6 - ISC DHCP Failover/Kea).
  2. All outgoing traffic is allowed, and incoming traffic is allowed only from whitelisted IPs. We also want a stateful firewall so that replies to connections initiated from the inside to any IP on the Internet are allowed back in. Currently, when a connection is initiated from inside the infrastructure to an Internet IP that is not whitelisted, half of the return traffic comes back through the router that forwarded the outgoing packets, and the other half comes back through its HA partner, where it is dropped without reaching the initiator. As a result, if we ping Google (which is not whitelisted) from within the infrastructure, half of the packets are dropped. So we are currently forced to turn off the firewall altogether, and our pods, servers, and routers are visible from the Internet. I see there is Conntrack, but as I understand it, it is only for active-passive configurations. We want to take advantage of the active-active HA configuration while keeping everything completely invisible from the Internet: core and access routers, IPv6 addresses assigned via DHCPv6 (NetA), hosts, and IPv6 addresses advertised via the BGP IP pool and assigned to pods (NetB). As mentioned, everything should be reachable only from whitelisted IPs and through connections initiated from within, to protect our infrastructure against external DDoS and hacker attacks. Will this problem be solved by T5425 (enable VRF for conntrack-sync)?
  3. Not related to the previous two, but also very nice to have: dynamically loading IP lists from a URL, in our case to allow Cloudflare IPs. See T5493 (Add capability to use local and external dynamic-lists for firewall rules but also for various policies such as access-list, route-maps etc.) and the topic "How to replace ipset on vyos1.4".

Can you recommend or think of any workaround for such a topology to synchronize the DHCPv6 leases and the stateful firewall in an active-active cluster, and to support dynamic lists for the firewall?

Thanks in advance,
D

I would try to configure the DHCP-server with Option82 instead.

This would also need changes in your access network where you need to enable DHCP snooping (on the L2 device) and DHCP relay (on the L3 device) but the outcome would be that there will be no lease to store or sync.

Show your “conntrack-sync” configuration.

Actually, it seems we fixed the second issue in our setup with conntrack-sync disable-external-cache. Our HA pair now works as active-active with asymmetric traffic (at least when I tested it yesterday, I didn't notice any issues with traffic).
I found a topic with a similar issue (Question regarding conntrack-sync - #4 by Apachez), and the answer there helped me as well.
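
To illustrate why this option matters for asymmetric return traffic, here is a toy Python model, not VyOS or conntrackd code. It assumes the usual conntrackd behaviour where states received from the peer are parked in an external cache and only committed to the kernel table on failover, unless the external cache is disabled. The `Router` class, the `simulate` function, and the "replies alternate between the two routers" rule are all made up for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy router: only states in kernel_table let reply packets through."""
    name: str
    kernel_table: set = field(default_factory=set)    # consulted by the firewall
    external_cache: set = field(default_factory=set)  # synced states parked outside the kernel

    def outbound(self, flow, peer, disable_external_cache):
        # The router that forwards the outgoing packet always learns the state.
        self.kernel_table.add(flow)
        # The state is then synced to the HA partner.
        if disable_external_cache:
            peer.kernel_table.add(flow)    # injected straight into the peer's kernel table
        else:
            peer.external_cache.add(flow)  # invisible to the peer's firewall until failover

    def accepts_reply(self, flow):
        return flow in self.kernel_table


def simulate(disable_external_cache, flows=10):
    """Send `flows` connections out via router A; replies alternate between A and B."""
    a, b = Router("A"), Router("B")
    delivered = 0
    for i in range(flows):
        flow = ("2001:db8::1", "2001:db8:ffff::1", i)  # documentation-prefix addresses
        a.outbound(flow, b, disable_external_cache)
        receiver = a if i % 2 == 0 else b  # upstream load-balancing we do not control
        if receiver.accepts_reply(flow):
            delivered += 1
    return delivered
```

In this model, `simulate(False)` delivers only the replies that happen to return through router A, while `simulate(True)` delivers all of them, which matches the "half of the pings are dropped" symptom and the observed fix.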


Thank you for the suggestion, Apachez. What happens if one router in the HA pair dies while this option is configured?

When leasing based on Option82, it doesn't matter which DHCP-server the client request ends up at (as long as their Option82 is configured the same way).

What happens is that DHCP-snooping at the layer2 switch will intercept the request from the downstream client and append Option82 information such as “SWITCH1-INT14”. DHCP-relay at the layer3 switch/router will force route the request as a unicast towards designated DHCP-server(s).

When the DHCP-server receives the request, it will see the Option82 information and, if it's properly configured, it will give out a lease based on the Option82 information instead of the DUID/MAC-address as it would normally do.

This way, whichever client requests a dynamic DHCP address while connected to SWITCH1-INT14 will always get the same reply, let's say “192.168.1.14/24”. That is why I call this “semi-static”: the client uses DHCP, but it will always get the same IP assigned as long as it's connected to SWITCH1 INT14. If the client disconnects and reconnects to SWITCH1 INT15, it will get the IP-address configured for “SWITCH1-INT15” as Option82 in the DHCP-server (let's say “192.168.1.15/24” in this example).

Since the Option82 configuration is static at each DHCP-server there is also nothing to sync between the DHCP-servers regarding leases.
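
The lookup logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not DHCP-server or VyOS configuration: the table `OPTION82_TO_ADDRESS` and the function `assign_address` are hypothetical names, and the entries reuse the example values from this post.

```python
import ipaddress

# Hypothetical static table, configured identically on every DHCP-server.
# The key is the Option82 string appended by DHCP snooping on the switch.
OPTION82_TO_ADDRESS = {
    "SWITCH1-INT14": ipaddress.ip_interface("192.168.1.14/24"),
    "SWITCH1-INT15": ipaddress.ip_interface("192.168.1.15/24"),
}


def assign_address(option82, client_id):
    """Pick the address from the switch port, deliberately ignoring client_id
    (the DUID/MAC-address a server would normally key the lease on)."""
    return OPTION82_TO_ADDRESS.get(option82)  # None if the port is not configured
```

Because the answer depends only on the static table, any server holding the same table returns the same address for the same port, so there is no lease state to replicate between them.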

Do I understand correctly that in this configuration, you are proposing to set up a dedicated new DHCP server separately, not on my existing VyOS routers displayed on the diagram? So basically, VyOS does not support HA with DHCPv6 when running DHCP server, correct @Apachez?

The proposal would be (if your layer2 and layer3 network supports DHCP snooping and DHCP relay) to set up Option82 in the DHCP-servers of VyOS.

Unfortunately, I currently don't have any config example on how to achieve this in VyOS.

Dear @Viacheslav.

Description
We have established 10 internal VLANs connecting both router nodes with VRRP. Additionally, there is an external connection through a dedicated VLAN, where OSPF and load-balancing with border routers are implemented. The challenge lies in the lack of control over load-balancing for incoming packets through this external connection, resulting in their potential arrival at any of our router nodes. More details about our design are on the Forum page.

Objective
Our objective is to implement a stateful firewall configuration, allowing outgoing sessions to any IP while permitting the initiation of sessions from outside sources only from a predefined set of IPs.

Solution
Our idea draws inspiration from commercial firewalls like Palo Alto. VyOS 1.4 currently supports conntrack-sync only in alignment with a VRRP instance. Notably, multiple VRRP instances with distinct active/backup roles can exist on the same VyOS router, which suggests that conntrack-sync could potentially operate in a similar manner. The envisioned design is a Multiprimary setup in which each router marks itself as the owner of its sessions, sends the sessions it owns to the peer, and accepts only sessions owned by the HA partner. As per the documentation, a Multiprimary setup is supported by conntrackd on Debian 8 but is not yet supported in VyOS 1.4.

Concerns:

  1. VyOS HA Conntrack-Sync is Unidirectional:
  • Conntrack-sync in VyOS HA configuration is observed to be unidirectional (Primary-Backup), limiting synchronization possibilities. The service conntrack-sync does not support multiple instances and allows configuration for only a single instance (documentation reference). Is it feasible to establish multiple unidirectional sync instances in VyOS?
  2. Bidirectional Sync with Native Conntrackd:
  • A key inquiry is whether it is practical to have multiple unidirectional sync instances (e.g., A to B and B to A) or utilize a native Multiprimary setup of the conntrackd utility in VyOS. The goal is to enhance flexibility and establish a bidirectional synchronization setup by having multiple conntrack-sync instances.

Conntrackd Documentation and Implementation:

  1. Feasibility Check:
  • Requesting verification of the feasibility of bidirectional synchronization for conntrack-sync in an active-active HA configuration.
  • If feasible, guidance or step-by-step instructions on configuring such a bidirectional setup would be greatly appreciated.
  2. Conntrackd Documentation Interpretation:
  • Seeking assistance in interpreting conntrackd documentation, specifically in the context of its support for Multiprimary setups within VyOS 1.4 (documentation reference).
  • Additionally, insights into leveraging conntrackd to achieve bidirectional synchronization are appreciated.

Your attention to these inquiries is invaluable, and any insights or guidance you can provide will greatly assist in resolving our configuration challenges.
https://vyos.dev/T5745

Regards,
Damien

Here is a quick summary of this topic and topology configuration:

Two Remaining Problems:

  1. DHCPv6 Compatibility with HA on VyOS 1.4:
  • DHCPv6 is not currently supported with High Availability (HA) on VyOS 1.4. The proposed solution involves replacing ISC DHCP with Kea. Using 1.4 as-is introduces potential complications when one router is down, especially if a lease expires during that period. Whether this fix will be included in VyOS 1.4 remains uncertain. We are eagerly awaiting this feature, but it looks like it is planned only for 1.5.
  2. Conntrack-Sync Limitations with VRRP in VyOS:

Additional Feature Request:

  1. Dynamic IP Network List Download for Firewall Configuration:
  • Dynamically download a list of IP networks from an online file and add it to the firewall. For us this feature is a “nice-to-have,” and its inclusion is uncertain even for VyOS 1.5. In the meantime, we accept a Python script workaround that is executed periodically on VyOS, so we can wait for a potential future implementation.
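
A minimal sketch of what such a Python workaround could look like is shown below. It is not the script actually used in this setup. The function names are hypothetical, the URL is left as a parameter, and the generated `set firewall group ...` strings follow the firewall group syntax as I understand it for VyOS 1.4, so verify them against your release before use. The sketch only builds the command strings; applying them in a configuration session, committing, and removing stale entries are left out.

```python
import ipaddress
import urllib.request


def fetch_text(url, timeout=10):
    """Download the list as text (one network per line is assumed)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_networks(text):
    """Return valid IPv4/IPv6 networks, skipping blanks, comments, and junk."""
    networks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        try:
            networks.append(ipaddress.ip_network(line, strict=False))
        except ValueError:
            continue  # ignore anything that is not a prefix
    return networks


def group_commands(group_name, networks):
    """Build one 'set' command per network (syntax is an assumption, see above)."""
    commands = []
    for net in networks:
        kind = "ipv6-network-group" if net.version == 6 else "network-group"
        commands.append(f"set firewall group {kind} {group_name} network {net}")
    return commands
```

A periodic job would call `fetch_text` on the published list, pass the result through `parse_networks`, and feed the output of `group_commands` to whatever mechanism applies configuration on the router. Validating every line with `ipaddress` before it reaches the firewall keeps a malformed or compromised download from producing broken commands.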