Need help with conntrack and fault tolerant firewall

Hi.

I’m running into some stumbling blocks trying to configure VyOS in my private cloud environment.

Here’s a diagram of my 8 VYOS routers…

Everything on my servers (including the CentOS servers) runs OSPF, which is working fine.

Each physical host has 2 10G internet uplinks, and I need all the bandwidth I can get so I’m trying to keep all the routers active/active/active/active and load balancing as much of the traffic as I can between the 8 links.

Only ISP2 provides public IPs that I can use on any physical server, so there are some limitations on load balancing: any traffic from the internet into my servers will always come in via ISP2, and response packets must be forwarded back through ISP2.

Each router has 1 IP address on ISP1 on eth0 for NAT’d internet access. ISP2 delivers subnets of IPs where the ISP router uses the last IP in the block as a gateway.

I have static routes to ISP2 and use load-balancing wan to send general internet traffic out eth0, failing over to eth1 if eth0 is not passing health checks.
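A rough Python sketch of the failover behaviour described above, using the weights (eth0 = 10, eth1 = 1) and the failure-count/success-count values of 1 from the config further down. The class and function names are made up for illustration, and the health logic is my simplified reading of how the counters behave, not VyOS's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class InterfaceHealth:
    """Hypothetical model of one WAN interface's health state."""
    name: str
    weight: int
    failure_count: int = 1   # consecutive failed checks before marking down
    success_count: int = 1   # consecutive passed checks before marking up
    healthy: bool = True
    _streak: int = 0         # consecutive results contradicting the current state

    def record(self, check_passed: bool) -> None:
        """Feed in one health-check result and update the state if needed."""
        if check_passed == self.healthy:
            self._streak = 0
            return
        self._streak += 1
        threshold = self.success_count if check_passed else self.failure_count
        if self._streak >= threshold:
            self.healthy = check_passed
            self._streak = 0


def pick_failover_interface(interfaces):
    """Return the healthy interface with the highest weight, or None."""
    healthy = [i for i in interfaces if i.healthy]
    return max(healthy, key=lambda i: i.weight, default=None)


eth0 = InterfaceHealth("eth0", weight=10)
eth1 = InterfaceHealth("eth1", weight=1)
eth0.record(False)  # one failed ping is enough with failure-count 1
print(pick_failover_interface([eth0, eth1]).name)
```

With both counters set to 1, a single lost ping flips the interface, which may make the failover more twitchy than intended.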

I’m using VRRP on router1 to hold the public IPs, to make sure all the public IPs stay alive if any of the routers/physical hosts go down.
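To illustrate the idea of VRRP keeping a public IP alive, here is a minimal Python sketch of master election for one VRRP group. Only router1's priority of 255 comes from the config below; the priorities for router2-4 are assumed values, and the tie-break by name is a simplification (real VRRP breaks ties on the interface IP address).

```python
def elect_master(priorities, alive):
    """Return the live router with the highest priority, or None if none is up.

    priorities: dict of router name -> VRRP priority for one group
    alive:      set of router names that are currently reachable
    """
    candidates = [(prio, name) for name, prio in priorities.items() if name in alive]
    if not candidates:
        return None
    # Highest priority wins; ties fall back to name order in this sketch.
    return max(candidates)[1]


# router1's value is from the posted config; the others are assumptions.
VRID01 = {"router1": 255, "router2": 254, "router3": 253, "router4": 252}

print(elect_master(VRID01, {"router1", "router2", "router3", "router4"}))
print(elect_master(VRID01, {"router3", "router4"}))
```

The virtual addresses follow whichever router wins, so a public IP stays reachable as long as at least one router in the group is alive.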

So I’m working on setting up OpenVPN over tcp/443 to the VPN VyOS instances, but I’m having a problem with the stateful firewall due to asymmetric routing. I added nat destination rules to forward port 443 from 4 public IPs on eth1 (ISP2) to 4 private VRRP IPs on vpn1-4.

I can see the TCP SYN come in on router1 and get NAT’d correctly to the VPN VyOS. The VPN VyOS responds normally but routes the TCP SYN-ACK to router2, which SHOULD see the conntrack entry from router1 and NAT the related traffic back toward the original source IP. However, it appears conntrack-sync isn’t working as well as I had hoped.

Firstly, I could not get conntrack-sync to work at all with the current rolling release of VyOS. It always fails to commit, and /var/log/messages says “ERROR Invalid Layer 3 Protocol”. I was running conntrack-sync 1.4.5. I saw that 1.4.6 was available, so I manually installed it on router1-4, and then I got error messages about the “tns” unknown helpers, so I had to hack /opt/vyatta/share/perl5/Vyatta/ConntrackSync.pm to remove the part that writes the helpers into /etc/conntrackd/conntrackd.conf.

So now conntrackd is running, and I’m using the “disable-external-cache” option so that conntrack entries should get injected straight into the other 3 routers; however, that doesn’t seem to be working. I am getting some but not all conntrack records on the other servers. Also, the conntrackd statistics are showing lost messages.

For the record, I tried extensively to do this without conntrack-sync. I just set up matching nat destination and nat source rules. That works fine for UDP traffic; however, it seems that VyOS will not NAT a SYN-ACK that wasn’t preceded by a SYN on the same router…
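The symptom can be pictured with a toy Python model of per-router connection tracking. This is only a sketch: the `Router` class is invented for illustration, the client address is a made-up example, and the real Netfilter state machine is far more involved. The point it tries to show is that the reverse translation depends on an entry that only exists on the router that saw the SYN, unless something copies it over in time.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy model of one NAT router; not Netfilter's real data structures."""
    name: str
    # (private_ip, client_ip) -> public_ip for flows this router has seen
    conntrack: dict = field(default_factory=dict)

    def inbound_syn(self, client_ip, public_ip, dnat_rules):
        """First packet of a flow: apply the DNAT rule and remember the flow."""
        private_ip = dnat_rules[public_ip]
        self.conntrack[(private_ip, client_ip)] = public_ip
        return private_ip

    def outbound_reply(self, private_ip, client_ip):
        """Reply packet: reverse-translated only if the flow is already known.

        Returns the public source address to use, or None when there is no
        matching entry (the packet is INVALID in this model and not NAT'd).
        """
        return self.conntrack.get((private_ip, client_ip))


DNAT_RULES = {"3.3.3.241": "10.0.35.1"}   # mirrors nat destination rule 19
CLIENT = "198.51.100.7"                    # hypothetical internet client

router1, router2 = Router("router1"), Router("router2")
vpn_ip = router1.inbound_syn(CLIENT, "3.3.3.241", DNAT_RULES)

before_sync = router2.outbound_reply(vpn_ip, CLIENT)   # SYN-ACK reaches router2 first
router2.conntrack.update(router1.conntrack)            # entry arrives via sync
after_sync = router2.outbound_reply(vpn_ip, CLIENT)

print(before_sync, after_sync)
```

In this model the SYN-ACK is only translated if the synced entry lands on router2 before the reply does, which is the kind of race a sync daemon would have to win on every new connection.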

So I’m interested in alternative designs or ideas, or maybe I’m just missing something or need to adjust something.

project goals: active/active/active/active, OSPF routing, NAT with support for TCP and UDP, load-balanced WAN…

router1 config:
set firewall all-ping 'enable'
set firewall broadcast-ping 'disable'
set firewall config-trap 'disable'
set firewall ipv6-receive-redirects 'disable'
set firewall ipv6-src-route 'disable'
set firewall ip-src-route 'disable'
set firewall log-martians 'enable'
set firewall name router_local default-action 'drop'
set firewall name router_local enable-default-log
set firewall name router_local rule 1 action 'accept'
set firewall name router_local rule 1 protocol 'icmp'
set firewall name router_local rule 2 action 'accept'
set firewall name router_local rule 2 protocol 'ospf'
set firewall name router_local rule 3 action 'accept'
set firewall name router_local rule 3 protocol 'vrrp'
set firewall name router_local rule 4 action 'accept'
set firewall name router_local rule 4 destination port '53'
set firewall name router_local rule 4 protocol 'udp'
set firewall name router_local rule 5 action 'accept'
set firewall name router_local rule 5 protocol 'udp'
set firewall name router_local rule 5 source port '53'
set firewall name router_local rule 6 action 'accept'
set firewall name router_local rule 6 destination port '123'
set firewall name router_local rule 6 protocol 'udp'
set firewall name router_local rule 7 action 'accept'
set firewall name router_local rule 7 destination port '22'
set firewall name router_local rule 7 protocol 'tcp'
set firewall name router_local rule 7 source address '10.0.48.1-10.0.48.4'
set firewall name router_local rule 8 action 'accept'
set firewall name router_local rule 8 destination address '225.0.0.50'
set firewall name router_local rule 8 protocol 'udp'
set firewall receive-redirects 'disable'
set firewall send-redirects 'enable'
set firewall source-validation 'disable'
set firewall state-policy established action 'accept'
set firewall state-policy established log enable
set firewall state-policy invalid action 'accept'
set firewall state-policy invalid log enable
set firewall state-policy related action 'accept'
set firewall state-policy related log enable
set firewall syn-cookies 'disable'
set firewall twa-hazards-protection 'disable'
set high-availability vrrp group vlan0001 interface 'eth1'
set high-availability vrrp group vlan0001 priority '255'
set high-availability vrrp group vlan0001 virtual-address '10.0.31.254/20'
set high-availability vrrp group vlan0001 vrid '254'
set high-availability vrrp group vlan0002 interface 'eth1.2'
set high-availability vrrp group vlan0002 priority '254'
set high-availability vrrp group vlan0002 virtual-address '10.0.47.254/20'
set high-availability vrrp group vlan0002 vrid '254'
set high-availability vrrp group vlan0003 interface 'eth1.3'
set high-availability vrrp group vlan0003 priority '254'
set high-availability vrrp group vlan0003 virtual-address '10.0.63.254/20'
set high-availability vrrp group vlan0003 vrid '254'
set high-availability vrrp group vrid01 interface 'eth1'
set high-availability vrrp group vrid01 priority '255'
set high-availability vrrp group vrid01 virtual-address '10.0.0.1/20'
set high-availability vrrp group vrid01 virtual-address '3.3.3.1/27'
set high-availability vrrp group vrid01 virtual-address '2.2.186.137/29'
set high-availability vrrp group vrid01 vrid '1'
set high-availability vrrp group vrid02 interface 'eth1'
set high-availability vrrp group vrid02 priority '254'
set high-availability vrrp group vrid02 virtual-address '10.0.0.2/20'
set high-availability vrrp group vrid02 virtual-address '3.3.3.2/27'
set high-availability vrrp group vrid02 virtual-address '2.2.186.138/29'
set high-availability vrrp group vrid02 vrid '2'
set high-availability vrrp group vrid03 interface 'eth1'
set high-availability vrrp group vrid03 priority '254'
set high-availability vrrp group vrid03 virtual-address '10.0.0.3/20'
set high-availability vrrp group vrid03 virtual-address '3.3.3.3/27'
set high-availability vrrp group vrid03 vrid '3'
set high-availability vrrp group vrid04 interface 'eth1'
set high-availability vrrp group vrid04 priority '254'
set high-availability vrrp group vrid04 virtual-address '10.0.0.4/20'
set high-availability vrrp group vrid04 virtual-address '3.3.3.4/27'
set high-availability vrrp group vrid04 vrid '4'
set high-availability vrrp group vrid17 interface 'eth1'
set high-availability vrrp group vrid17 priority '255'
set high-availability vrrp group vrid17 virtual-address '2.2.158.137/27'
set high-availability vrrp group vrid17 virtual-address '3.3.3.241/28'
set high-availability vrrp group vrid17 vrid '17'
set high-availability vrrp group vrid18 interface 'eth1'
set high-availability vrrp group vrid18 priority '253'
set high-availability vrrp group vrid18 virtual-address '2.2.158.138/27'
set high-availability vrrp group vrid18 virtual-address '3.3.3.242/28'
set high-availability vrrp group vrid18 vrid '18'
set high-availability vrrp group vrid19 interface 'eth1'
set high-availability vrrp group vrid19 priority '253'
set high-availability vrrp group vrid19 virtual-address '3.3.3.243/28'
set high-availability vrrp group vrid19 vrid '19'
set high-availability vrrp group vrid20 interface 'eth1'
set high-availability vrrp group vrid20 priority '253'
set high-availability vrrp group vrid20 virtual-address '3.3.3.244/28'
set high-availability vrrp group vrid20 vrid '20'
set high-availability vrrp sync-group syncgrp member 'vlan0001'
set interfaces ethernet eth0 address '1.1.1.167/24'
set interfaces ethernet eth0 hw-id '02:00:00:c3:1f:29'
set interfaces ethernet eth0 mtu '1500'
set interfaces ethernet eth1 address '10.0.15.253/20'
set interfaces ethernet eth1 address '10.0.31.253/20'
set interfaces ethernet eth1 address '3.3.3.253/28'
set interfaces ethernet eth1 firewall local name 'router_local'
set interfaces ethernet eth1 hw-id 'c0:ff:ee:00:1f:fd'
set interfaces ethernet eth1 ip ospf cost '1'
set interfaces ethernet eth1 ip ospf dead-interval '40'
set interfaces ethernet eth1 ip ospf hello-interval '10'
set interfaces ethernet eth1 ip ospf priority '255'
set interfaces ethernet eth1 ip ospf retransmit-interval '5'
set interfaces ethernet eth1 ip ospf transmit-delay '1'
set interfaces ethernet eth1 mtu '9000'
set interfaces ethernet eth1 vif 2 address '10.0.47.253/20'
set interfaces ethernet eth1 vif 2 firewall local name 'router_local'
set interfaces ethernet eth1 vif 2 ip ospf cost '2'
set interfaces ethernet eth1 vif 2 ip ospf dead-interval '40'
set interfaces ethernet eth1 vif 2 ip ospf hello-interval '10'
set interfaces ethernet eth1 vif 2 ip ospf priority '254'
set interfaces ethernet eth1 vif 2 ip ospf retransmit-interval '5'
set interfaces ethernet eth1 vif 2 ip ospf transmit-delay '1'
set interfaces ethernet eth1 vif 2 mtu '9000'
set interfaces ethernet eth1 vif 3 address '10.0.63.253/20'
set interfaces ethernet eth1 vif 3 firewall local name 'router_local'
set interfaces ethernet eth1 vif 3 ip ospf cost '3'
set interfaces ethernet eth1 vif 3 ip ospf dead-interval '40'
set interfaces ethernet eth1 vif 3 ip ospf hello-interval '10'
set interfaces ethernet eth1 vif 3 ip ospf priority '254'
set interfaces ethernet eth1 vif 3 ip ospf retransmit-interval '5'
set interfaces ethernet eth1 vif 3 ip ospf transmit-delay '1'
set interfaces ethernet eth1 vif 3 mtu '9000'
set load-balancing wan disable-source-nat
set load-balancing wan flush-connections
set load-balancing wan interface-health eth0 failure-count '1'
set load-balancing wan interface-health eth0 nexthop '1.1.1.254'
set load-balancing wan interface-health eth0 success-count '1'
set load-balancing wan interface-health eth0 test 10 resp-time '1'
set load-balancing wan interface-health eth0 test 10 target '4.2.2.1'
set load-balancing wan interface-health eth0 test 10 ttl-limit '1'
set load-balancing wan interface-health eth0 test 10 type 'ping'
set load-balancing wan interface-health eth0 test 20 resp-time '1'
set load-balancing wan interface-health eth0 test 20 target '8.8.4.4'
set load-balancing wan interface-health eth0 test 20 ttl-limit '1'
set load-balancing wan interface-health eth0 test 20 type 'ping'
set load-balancing wan interface-health eth1 failure-count '1'
set load-balancing wan interface-health eth1 nexthop '3.3.3.254'
set load-balancing wan interface-health eth1 success-count '1'
set load-balancing wan interface-health eth1 test 10 resp-time '1'
set load-balancing wan interface-health eth1 test 10 target '8.8.8.8'
set load-balancing wan interface-health eth1 test 10 ttl-limit '1'
set load-balancing wan interface-health eth1 test 10 type 'ping'
set load-balancing wan interface-health eth1 test 20 resp-time '1'
set load-balancing wan interface-health eth1 test 20 target '4.2.2.2'
set load-balancing wan interface-health eth1 test 20 ttl-limit '1'
set load-balancing wan interface-health eth1 test 20 type 'ping'
set load-balancing wan rule 1 exclude
set load-balancing wan rule 1 inbound-interface 'eth1.2'
set load-balancing wan rule 1 protocol 'udp'
set load-balancing wan rule 1 source address '10.0.33.1-10.0.33.4'
set load-balancing wan rule 1 source port '53'
set load-balancing wan rule 2 exclude
set load-balancing wan rule 2 inbound-interface 'eth1.2'
set load-balancing wan rule 2 protocol 'udp'
set load-balancing wan rule 2 source address '10.0.34.1-10.0.34.4'
set load-balancing wan rule 2 source port '1194'
set load-balancing wan rule 3 exclude
set load-balancing wan rule 3 inbound-interface 'eth1.2'
set load-balancing wan rule 3 protocol 'tcp'
set load-balancing wan rule 3 source address '10.0.35.1-10.0.35.4'
set load-balancing wan rule 3 source port '443'
set load-balancing wan rule 9990 destination address '10.0.0.0/8'
set load-balancing wan rule 9990 exclude
set load-balancing wan rule 9990 inbound-interface 'eth+'
set load-balancing wan rule 9990 protocol 'all'
set load-balancing wan rule 9997 exclude
set load-balancing wan rule 9997 inbound-interface 'eth1'
set load-balancing wan rule 9997 protocol 'all'
set load-balancing wan rule 9998 exclude
set load-balancing wan rule 9998 inbound-interface 'eth0'
set load-balancing wan rule 9998 protocol 'all'
set load-balancing wan rule 9999 failover
set load-balancing wan rule 9999 inbound-interface 'eth+'
set load-balancing wan rule 9999 interface eth0 weight '10'
set load-balancing wan rule 9999 interface eth1 weight '1'
set load-balancing wan rule 9999 protocol 'all'
set nat destination rule 11 destination address '2.2.158.137'
set nat destination rule 11 destination port '53'
set nat destination rule 11 inbound-interface 'eth1'
set nat destination rule 11 log 'enable'
set nat destination rule 11 protocol 'udp'
set nat destination rule 11 translation address '10.0.33.1'
set nat destination rule 12 destination address '2.2.158.138'
set nat destination rule 12 destination port '53'
set nat destination rule 12 inbound-interface 'eth1'
set nat destination rule 12 log 'enable'
set nat destination rule 12 protocol 'udp'
set nat destination rule 12 translation address '10.0.33.2'
set nat destination rule 13 destination address '2.2.186.137'
set nat destination rule 13 destination port '53'
set nat destination rule 13 inbound-interface 'eth1'
set nat destination rule 13 log 'enable'
set nat destination rule 13 protocol 'udp'
set nat destination rule 13 translation address '10.0.33.3'
set nat destination rule 14 destination address '2.2.186.138'
set nat destination rule 14 destination port '53'
set nat destination rule 14 inbound-interface 'eth1'
set nat destination rule 14 log 'enable'
set nat destination rule 14 protocol 'udp'
set nat destination rule 14 translation address '10.0.33.4'
set nat destination rule 15 destination address '3.3.3.241'
set nat destination rule 15 destination port '1194'
set nat destination rule 15 inbound-interface 'eth1'
set nat destination rule 15 log 'enable'
set nat destination rule 15 protocol 'udp'
set nat destination rule 15 translation address '10.0.34.1'
set nat destination rule 16 destination address '3.3.3.242'
set nat destination rule 16 destination port '1194'
set nat destination rule 16 inbound-interface 'eth1'
set nat destination rule 16 log 'enable'
set nat destination rule 16 protocol 'udp'
set nat destination rule 16 translation address '10.0.34.2'
set nat destination rule 17 destination address '3.3.3.243'
set nat destination rule 17 destination port '1194'
set nat destination rule 17 inbound-interface 'eth1'
set nat destination rule 17 log 'enable'
set nat destination rule 17 protocol 'udp'
set nat destination rule 17 translation address '10.0.34.3'
set nat destination rule 18 destination address '3.3.3.244'
set nat destination rule 18 destination port '1194'
set nat destination rule 18 inbound-interface 'eth1'
set nat destination rule 18 log 'enable'
set nat destination rule 18 protocol 'udp'
set nat destination rule 18 translation address '10.0.34.4'
set nat destination rule 19 destination address '3.3.3.241'
set nat destination rule 19 destination port '443'
set nat destination rule 19 inbound-interface 'eth1'
set nat destination rule 19 log 'enable'
set nat destination rule 19 protocol 'tcp'
set nat destination rule 19 translation address '10.0.35.1'
set nat destination rule 20 destination address '3.3.3.242'
set nat destination rule 20 destination port '443'
set nat destination rule 20 inbound-interface 'eth1'
set nat destination rule 20 log 'enable'
set nat destination rule 20 protocol 'tcp'
set nat destination rule 20 translation address '10.0.35.2'
set nat destination rule 21 destination address '3.3.3.243'
set nat destination rule 21 destination port '443'
set nat destination rule 21 inbound-interface 'eth1'
set nat destination rule 21 log 'enable'
set nat destination rule 21 protocol 'tcp'
set nat destination rule 21 translation address '10.0.35.3'
set nat destination rule 22 destination address '3.3.3.244'
set nat destination rule 22 destination port '443'
set nat destination rule 22 inbound-interface 'eth1'
set nat destination rule 22 log 'enable'
set nat destination rule 22 protocol 'tcp'
set nat destination rule 22 translation address '10.0.35.4'
set nat source rule 1 destination address '10.0.0.0/19'
set nat source rule 1 exclude
set nat source rule 1 log 'enable'
set nat source rule 1 outbound-interface 'eth1'
set nat source rule 9998 outbound-interface 'eth1'
set nat source rule 9998 translation address '3.3.3.253'
set nat source rule 9999 outbound-interface 'eth0'
set nat source rule 9999 translation address 'masquerade'
set protocols ospf area 0 network '10.0.0.0/8'
set protocols ospf default-information originate metric '2'
set protocols ospf default-information originate metric-type '2'
set protocols ospf parameters abr-type 'cisco'
set protocols ospf parameters router-id '10.0.15.253'
set protocols static route 0.0.0.0/0 next-hop 1.1.1.254 distance '100'
set protocols static route 0.0.0.0/0 next-hop 3.3.3.254 distance '10'
set protocols static route 4.2.2.1/32 next-hop 1.1.1.254
set protocols static route 4.2.2.2/32 next-hop 3.3.3.254
set protocols static route 8.8.4.4/32 next-hop 1.1.1.254
set protocols static route 8.8.8.8/32 next-hop 3.3.3.254
set service conntrack-sync accept-protocol 'tcp,udp,icmp'
set service conntrack-sync disable-external-cache
set service conntrack-sync event-listen-queue-size '16'
set service conntrack-sync failover-mechanism vrrp sync-group 'syncgrp'
set service conntrack-sync interface eth1
set service conntrack-sync listen-address '10.0.15.253'
set service conntrack-sync mcast-group '225.0.0.50'
set service conntrack-sync sync-queue-size '16'
set system config-management commit-revisions '100'
set system conntrack expect-table-size '2048'
set system conntrack hash-size '32768'
set system conntrack ignore rule 1 destination address '224.0.0.0/4'
set system conntrack ignore rule 2 destination address '10.0.0.0/8'
set system conntrack ignore rule 2 source address '10.0.0.0/8'
set system conntrack table-size '262144'
set system conntrack tcp half-open-connections '512'
set system conntrack tcp loose 'disable'
set system conntrack tcp max-retrans '3'
set system console device hvc0 speed '115200'
set system host-name 'router1'

PS: I have tried with conntrack tcp loose both enabled and disabled… neither seemed to work.

Is @dmadole a user here? He helped me out big time on Vyatta about a decade ago!

i made a new diagram that is maybe better…

I am not very active, but yes, I am here. I am not sure I have time to understand your setup, but here are a few thoughts.

This seems like a very complex way to do this. It seems like it would be a lot simpler to just get two boxes that could handle the 10 gig throughput rather than trying to scale out.

Your ISPs can’t run BGP so that you can use the same address space through both connections?

I have never found WAN load balancing to work satisfactorily on VyOS, just as it never did on Vyatta. I think there are both bugs and design issues. Ubiquiti has made a lot of improvements; I have used EdgeRouter successfully for setups that need load balancing.

I’m not sure this is the right use case for conntrack sync. Even if it worked, it seems like you would always be susceptible to race conditions between your datastream and the conntrack updates. I have always looked at conntrack sync as a tool for managing failover, not for getting active/active to work, but I could be wrong on this.

Thanks for your reply dave! glad you’re still out there! No worries about understanding the whole thing…

Part of the limitation is that I am leasing the physical hardware in a datacenter from the ISP. Each physical box has 2 10G NICs, so in order to use all 80 Gbit/s I need to have at least 1 VyOS on each physical server to handle the public IPs. The public IPs are done using static IPs, static routes, and good old ARP. The ISP doesn’t offer any dynamic routing protocols; however, all public IPs are available to any physical server on eth1, so there isn’t really a need for dynamic routing between me and the ISP.

Whichever VyOS ARPs for the public IP is the one that the inbound traffic flows to. If I assign the same IP to all 4 VyOS routers, then they all reply to ARP and the traffic flows to whichever box responds the slowest, which obviously would lead to more problems. So I’m using VRRP to manage the public IP addresses on eth1 to make sure that every public IP stays active as long as 1 VyOS router on 1 physical host is online.

I am fine with doing 1-to-1 NAT to match up the public IPs to private IPs. However, due to OSPF and redundant default routes, I get asymmetric routing as expected, and this causes problems because a router will not NAT a TCP SYN-ACK when that exact router didn’t see the preceding TCP SYN. Is there maybe a way to disable stateful packet inspection entirely but still restrict protocols and ports and translate IPs?

Here’s what show conntrack-sync statistics shows on each router1-4

I’m using edgerouters as my VPN clients. I wanted a plug and play system where someone can plug an edgerouter into any DHCP network with internet access so it can “phone home” to these VPN servers in the datacenter. I also want to support roadwarrior vpn clients. I would like for everything to run on degraded bandwidth if 3 of the 4 physical servers are down.

so if conntrack sync is not the way to achieve active/active redundancy what is?

I’m not opposed to building more VyOS instances if needed… like if I want to separate routing functions from NAT functions from the load balancing… I chose to separate VPN from other functions due to the encryption overhead. If a VPN server is bogged down, I don’t want it to affect the servers reaching the internet through the core routers…

I’m trying some things now… I hacked /opt/vyatta/share/perl5/Vyatta/ConntrackSync.pm again, this time to make it stop writing to /etc/conntrackd/conntrackd.conf so that I can edit that file manually and experiment with some of the options that conntrackd offers that aren’t exposed in VyOS.

still no luck… i’ve been banging my head on this problem for about 7 days… any help is appreciated…

to anyone: if you had 4 physical servers to run vyos on and you wanted to make them a cluster, how would you do it?

Why don’t you just put public IPs directly on the VPN routers and not mess with the NAT in the middle?

Or, since you are doing 1:1 NAT, just use stateless NAT built with, perhaps, nftables (not using Vyos).
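As a rough illustration of what stateless 1:1 NAT means (in Python rather than nftables, and only as a sketch), each packet is rewritten purely from its own headers using a fixed table, so no router needs to have seen an earlier packet of the flow. The address pairs are taken from nat destination rules 19-22 in the config posted above; the function names and the client address are invented for the example.

```python
# Fixed public <-> private pairs, taken from nat destination rules 19-22.
PUBLIC_TO_PRIVATE = {
    "3.3.3.241": "10.0.35.1",
    "3.3.3.242": "10.0.35.2",
    "3.3.3.243": "10.0.35.3",
    "3.3.3.244": "10.0.35.4",
}
PRIVATE_TO_PUBLIC = {private: public for public, private in PUBLIC_TO_PRIVATE.items()}


def translate_inbound(src, dst):
    """Internet -> servers: rewrite the destination if it is a mapped public IP."""
    return src, PUBLIC_TO_PRIVATE.get(dst, dst)


def translate_outbound(src, dst):
    """Servers -> internet: rewrite the source if it is a mapped private IP."""
    return PRIVATE_TO_PUBLIC.get(src, src), dst


client = "198.51.100.7"  # hypothetical internet client

# The SYN can enter through one router...
print(translate_inbound(client, "3.3.3.241"))
# ...and the SYN-ACK can leave through another; the result is the same on
# every router because nothing here depends on per-connection state.
print(translate_outbound("10.0.35.1", client))
```

The trade-off is that nothing tracks connections any more, so filtering has to be done on addresses, protocols, and ports alone.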

You are right, there’s no reason not to put the public IPs right on the VPN routers. I think I will do that; however, that still doesn’t help me when I’m adding NAT rules for other services like web servers and things.

I have exclusive control of the VyOS routers, but not of the virtual servers. I currently have virtual servers directly connected with public IPs and I don’t like it, because they have the ability to mess things up for everyone else, like assigning IPs they aren’t supposed to.

I am fine with 1:1 NAT, and I’m fine with it being stateless too, so I’ll have to look into nftables.

For what it’s worth, I’ve completely disabled conntrack-sync now, and I’ve gone to using a back-to-back NAT solution, which I don’t really like… I think I need to redesign this whole thing.

If this looks right, then for my “load-balancing” needs I think I will use HAProxy separately to handle the inbound load balancing amongst servers. The WAN load balancing does seem to work as advertised using the configuration I posted above. I will also attach the VPN routers to eth0 and eth1 as you suggested and have VPN clients masquerade using vpn1-4’s local addresses. Thanks.

I built the system as shown in the diagram above and everything is now working OK… Trying to mix the inbound and outbound load balancing on one system was tough. In the end it works better with the two separated. So my 8 VyOS instances turned into 12! Thanks!