I have two networks that each have a VRRP pair of routers and are connected via an OSPF network. I have struggled to get both working in a fully converged network, as connections between the host networks behind each pair intermittently fail.
I was able to work around this by disabling OSPF on the BACKUP router of each pair, which ensures that traffic only flows between two routers. I only really need VRRP to expose a public IP and to break a segment in a bridge to prevent an L2 loop, since PVSTP is not supported; however, the VRRP scripts also enable/disable OSPF.
What I think is happening: a connection from a host in network A hits either router 1 or 2, is routed to router 1 or 2 in network B, and is then received by the remote host. The reply from the host in network B then hits either router 1 or 2 there, which could be a different router than the one the connection arrived on, and is routed to router 1 or 2 in network A before being received by the original host.
Since the SYN packet opens a TCP session on one router, if the reply is load-balanced to the other router in the pair, the unexpected packet is dropped, which causes my issue.
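To make the suspected failure mode concrete, here is a minimal Python sketch of it. It is only a toy model, not how VyOS or conntrack is implemented: the `Router` class, the router names and the `round_trip` helper are all hypothetical, and it assumes each router keeps its own state table with nothing shared between the pair.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy stateful firewall with its own private flow table (assumption)."""
    name: str
    drop_invalid: bool = True
    flows: set = field(default_factory=set)

    def forward(self, src, dst, is_new):
        # Use one key for both directions of the same conversation.
        key = frozenset((src, dst))
        if is_new:
            self.flows.add(key)
            return "accept (new)"
        if key in self.flows:
            return "accept (established)"
        # A reply for a flow this router never saw open is "invalid".
        if self.drop_invalid:
            return "drop (invalid)"
        return "accept (invalid allowed)"


def round_trip(out_router, back_router):
    """Send the first packet via out_router and the reply via back_router."""
    first = out_router.forward("hostA", "hostB", is_new=True)
    reply = back_router.forward("hostB", "hostA", is_new=False)
    return first, reply


a1, a2 = Router("A1"), Router("A2")
symmetric = round_trip(a1, a1)    # reply returns through the same router
asymmetric = round_trip(a1, a2)   # reply returns through the other router
print(symmetric)
print(asymmetric)
```

In this model the symmetric path is accepted and the asymmetric reply is dropped as invalid, which would match intermittent failures if the return path is chosen per flow.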
What I would like to know is if there is a feature I’m missing, or if I need to continue troubleshooting my current setup for a slight misconfiguration.
Some details:
A zone-based firewall is used on all 4 routers between all network zones
Ping and TCP connections are affected
All traffic in this example scenario is in the same “ADMIN” zone, including the OSPF router network and host networks
Layer4 hashing is enabled on all 4 routers
Turning this off does not improve performance
All hosts use IPv6 autoconf from router advertisements; setting static IPv6 gateways and using a VRRP virtual IP on hosts is not preferred
The MASTER VRRP router and BACKUP VRRP router have IPv6 RA default preference set to HIGH and are both listed as equal in the hosts’ “ip -6 route” output
Setting the BACKUP VRRP router to LOW default preference does not help, even when “ip -6 route” shows the MASTER as high and the BACKUP as low
conntrack-sync is configured using the VRRP sync group; I’m not entirely sure this is working.
IPv4 connectivity is not an issue because it has to use the VRRP virtual IPs on each router.
IPv6 ping from any router directly to a remote host is successful; only host-to-host connectivity is affected.
Both router pairs have an OSPF connection in area 0 with a cost of 1 between each other, and connect to the routed switched network in area 0 with a cost of 100.
Setting the cost to 1000 for the BACKUP router’s link to the routing network does not help; the MASTER router still sees an equal-cost path to both the BACKUP and MASTER routers of the other network.
Currently on the VyOS nightly 20220218 build on all 4 routers.
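One possible reason the cost change made no difference: OSPF adds up the cost of each router's outgoing interface along a path, and the hop from a transit network to an attached router costs 0. The sketch below runs a shortest-path calculation over a simplified, hypothetical version of this topology: a single remote router `B1`, the pair `A1`/`A2`, and a shared `transit` segment. The node names and the `build` and `shortest_costs` helpers are illustrative assumptions, not taken from the real configuration.

```python
import heapq


def shortest_costs(graph, source):
    """Dijkstra over a directed graph of outgoing-interface costs."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return dist


def build(a2_transit_cost):
    """graph[u][v] is the cost u pays on its own interface towards v."""
    return {
        "B1": {"transit": 100},
        # From the transit network to any attached router the cost is 0.
        "transit": {"A1": 0, "A2": 0, "B1": 0},
        "A1": {"transit": 100, "A2": 1},
        "A2": {"transit": a2_transit_cost, "A1": 1},
    }


for cost in (100, 1000):
    from_b1 = shortest_costs(build(cost), "B1")
    from_a2 = shortest_costs(build(cost), "A2")
    print(cost, from_b1["A1"], from_b1["A2"], from_a2["B1"])
```

In this model B1 reaches A1 and A2 at cost 100 in both cases, so raising the BACKUP's transit cost only changes the direction leaving A2 (which then prefers the path through A1). If the real topology behaves the same way, the remote MASTER would keep seeing equal-cost paths.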
It turns out that the firewall IS in fact receiving packets that have an invalid state. Setting a rule to allow invalid state between zones allows traffic to flow in the preferred converged network.
However, this is not ideal. I would like conntrack-sync to make sure each router always knows the other’s states, but I believe that the states are only imported on VRRP failover. Am I mistaken? If the states from both firewalls are supposed to always be synced, then I just need to troubleshoot conntrack and remove these temporary invalid state rules.
Some more detail: the real culprit was “set firewall state-policy invalid action drop”.
Turning this back on breaks traffic; turning it off restores traffic. The individual zone policy rules don’t do anything, as expected with the global rule in place.
Obviously conntrackd sharing states “live” would be ideal, so that genuinely invalid traffic is dropped. If that is not possible, can I get a suggestion on whether it’s acceptable to allow invalid packets from “semi trusted” zones into “trusted” zones? I see a problem with invalid traffic from a DMZ making it into the ADMIN zone; however, the ADMIN zone should be able to connect into the DMZ, which is how my zone rules are configured.