Can I use conntrack-sync to firewall an asymmetric deployment (AWS VPN)?

chrismarget · December 3, 2020, 6:00pm

It looks to me like the conntrack-sync feature works best when:

- VyOS boxes are in an active/standby configuration
- VRRP is running on all transit interfaces
- VRRP is synchronized so that a single VyOS node is the VRRP active node for all interfaces

All traffic flows through a single VyOS node until it fails. On failure of the active node, VRRP on the surviving node:

- assumes the gateway address on all transit interfaces
- triggers conntrackd to dump flow state information previously collected from the failed node into the survivor’s conntrack table

Because conntrackd is caching flow updates (not injecting them into the kernel on the receiving system in real time), this configuration of conntrack-sync is necessarily an active/standby mechanism, closely coupled with VRRP.
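To make the distinction concrete, here is a toy Python model of the two behaviours as I understand them. It is only a sketch: the `Node` class, its fields, and its method names are made up for illustration and are not conntrackd's real data structures or API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of one firewall node (hypothetical, for illustration only)."""
    name: str
    external_cache_enabled: bool = True
    # States the firewall actually consults when it sees a packet.
    kernel_table: set = field(default_factory=set)
    # States learned from the peer and parked in userspace.
    external_cache: set = field(default_factory=set)

    def receive_sync(self, flow):
        # Cached mode: peer state waits until failover.
        # Direct mode: peer state goes straight into the kernel table.
        if self.external_cache_enabled:
            self.external_cache.add(flow)
        else:
            self.kernel_table.add(flow)

    def commit_on_failover(self):
        # Models what the VRRP transition triggers in cached mode.
        self.kernel_table |= self.external_cache
        self.external_cache.clear()

    def allows_return_traffic(self, flow):
        return flow in self.kernel_table


flow = ("10.0.0.5", 40000, "192.0.2.10", 443, "tcp")  # example flow key

standby = Node("standby")                 # cached mode
standby.receive_sync(flow)
before_failover = standby.allows_return_traffic(flow)
standby.commit_on_failover()
after_failover = standby.allows_return_traffic(flow)

peer = Node("peer", external_cache_enabled=False)  # direct injection
peer.receive_sync(flow)
direct_mode = peer.allows_return_traffic(flow)
```

In the cached case the standby node only accepts return traffic for the flow after the failover commit, which is why that mode only makes sense as active/standby.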

Is that a reasonable summary of the basic capability?

I’m trying to figure out whether I can make use of conntrack-sync for firewalling AWS site-to-site VPN connections within a topology like this:

Because I can’t reliably influence traffic to force flows only to the “active” VyOS node, populating the conntrack table with a VRRP-based trigger isn’t very helpful.

My experiments with using conntrackd's DisableExternalCache feature to share flow state in real time are promising, but leave me with two questions:

1. If conntrackd is injecting entries in real time so that return traffic can hit either router, what’s the role of the failover mechanism? Why is VRRP here?
2. It looks like there’s a race at the start of a new asymmetric flow: Which will arrive first, the SYN/ACK, or the state table entry which allows it?
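The race in the second question can be sketched as a simple ordering problem: the SYN/ACK is only accepted if the state update reaches the other router first. The Python below is a rough simulation under made-up assumptions; the function name, the uniform jitter model, and all timing values are hypothetical and not measured from any real deployment.

```python
import random


def fraction_losing_race(sync_ms, synack_ms, jitter_ms, trials=10_000, seed=0):
    """Estimate how often the SYN/ACK beats the state update to the peer.

    sync_ms:   assumed base delay for the state update to reach the peer
    synack_ms: assumed base delay before the SYN/ACK reaches the peer
    jitter_ms: uniform random jitter added independently to each delay
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    lost = 0
    for _ in range(trials):
        sync_arrival = sync_ms + rng.uniform(0, jitter_ms)
        synack_arrival = synack_ms + rng.uniform(0, jitter_ms)
        if synack_arrival < sync_arrival:
            lost += 1  # SYN/ACK arrived before the entry that allows it
    return lost / trials


# State sync over a local link, SYN/ACK crossing a WAN: sync always wins.
fast_sync = fraction_losing_race(sync_ms=1, synack_ms=50, jitter_ms=0.5)

# Comparable delays on both paths: the outcome is close to a coin flip.
close_call = fraction_losing_race(sync_ms=5, synack_ms=5, jitter_ms=1)
```

If the model is anywhere near right, the race only matters when the return path is about as fast as the sync path between the two routers.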

What’s the right way forward here?

1. Decouple the firewall from the VPN and build the firewall as an active/standby pair with VRRP on both sides (symmetry!).
2. Use routing metrics (BGP and OSPF) to force traffic onto one side, and rely on DisableExternalCache (that is, risk the race condition) only during the short windows of OSPF/BGP reconvergence.
3. Something else?