Unfortunately my ISP has given me a /29 subnet and I only have 1 IP address left from that pool. I wonder if I can configure a cluster or VRRP where the physical IPs are on one subnet and the VIP is on another?
1: Ask your ISP if they can provide a link-local or RFC1918 linknet for routing and route the full /29 towards your end of that linknet. That way you can use all 8 IP addresses of the /29 range for services and not just 6 of them (having that /29 on-link wastes the first IP as network address and the last IP as broadcast address).
For example:
ISP: 169.254.1.1/30
YOU: 169.254.1.2/30
You route default (0.0.0.0/0) to 169.254.1.1.
ISP route x.x.x.x/29 to 169.254.1.2.
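On the VyOS side, that linknet setup could look something like the sketch below (the interface name eth0 is my assumption; the addresses are from the example above):

```
# Your end of the linknet towards the ISP
set interfaces ethernet eth0 address '169.254.1.2/30'

# Default route via the ISP's end of the linknet
set protocols static route 0.0.0.0/0 next-hop '169.254.1.1'
```

The ISP then routes the full x.x.x.x/29 to 169.254.1.2, and you are free to assign all 8 addresses to services behind the router.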
2: I think this should work just fine.
Arista, for example, supports this: if you don't define a subnet mask for the VIP, it will use the same one as the physical interfaces.
What happens in the background for that 2nd case (which you ask about) is that selective proxy-ARP is used: when someone on that VLAN asks "who has 10.0.0.1?", the router replies with the VIP MAC address (either a virtual one or the same as the physical interface, depending on your settings). Either way, a gratuitous ARP (GARP) is sent when the passive box becomes active.
Another option in your case, instead of using VRRP/HA, is to do a simple DNAT. That will however only forward traffic to a single host on your DMZ/LAN.
Something like:
Just incoming DNAT:
set nat destination rule 10 destination address '20.20.20.20'
set nat destination rule 10 inbound-interface 'eth1'
set nat destination rule 10 translation address '192.168.0.1'
Or a 1:1 NAT (the device on the DMZ/LAN will then use 20.20.20.20 as SNAT as well for outbound traffic):
set nat static rule 20 destination address '20.20.20.20'
set nat static rule 20 inbound-interface 'eth1'
set nat static rule 20 translation address '192.168.0.1'
If you want load balancing rather than a single DNAT target, there is also the high-availability virtual-server feature:
set high-availability virtual-server 2.2.2.2 address '2.2.2.2'
set high-availability virtual-server 2.2.2.2 algorithm 'source-hashing'
set high-availability virtual-server 2.2.2.2 delay-loop '10'
set high-availability virtual-server 2.2.2.2 forward-method 'direct'
set high-availability virtual-server 2.2.2.2 persistence-timeout '3600'
set high-availability virtual-server 2.2.2.2 port '8080'
set high-availability virtual-server 2.2.2.2 protocol 'tcp'
set high-availability virtual-server 2.2.2.2 real-server 192.168.1.101 port '80'
set high-availability virtual-server 2.2.2.2 real-server 192.168.1.102 port '80'
The above should give you that traffic going to 2.2.2.2:8080 is load-balanced towards TCP/80 on 192.168.1.101 and 192.168.1.102.
The direct forward-method is used because you want to keep the source IP when hitting the internal servers.
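Note that with the direct forward-method (LVS direct routing), each real server must also accept traffic for the VIP while not answering ARP for it. A common sketch on Linux real servers looks like this (the VIP 2.2.2.2 is from the example above; run as root):

```
# Add the VIP on loopback so the real server accepts traffic addressed to it
ip addr add 2.2.2.2/32 dev lo

# Suppress ARP for the VIP so only the load balancer answers on the LAN
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
```

Without this, the real servers would drop packets destined for the VIP, or worse, fight the load balancer over ARP for it.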
If the virtual server address is within the interface's subnet but is not the interface IP itself, you probably need to enable proxy-ARP, local proxy-ARP, or selective proxy-ARP, depending on your choice of platform.
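On VyOS, plain proxy-ARP can be enabled per interface; a one-line sketch (eth1 here is my assumption, matching the inbound interface from the NAT example above):

```
set interfaces ethernet eth1 ip enable-proxy-arp
```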
By the way, how do I fail over the entire VyOS instance in case of an interface failure? As in the scenario below:
R1
eth0 : 10.1.1.2
eth1 : 10.30.30.2
R2
eth0 : 10.1.1.3
eth1 : 10.30.30.3
VIP
eth0 : 10.1.1.1
eth1 : 10.30.30.1
VRRP Group 10
VRRP Group 20
So in case eth0 fails, only eth0 on R2 becomes master, while eth1 stays master on R1, and hence my connectivity does not work. Am I missing anything?
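A sync-group should fix exactly that: it ties the VRRP groups together so that if one member loses mastership, the others follow. A sketch for the scenario above, applied on both routers (the sync-group name SYNC is arbitrary; '10' and '20' are the group names from the scenario):

```
set high-availability vrrp sync-group SYNC member '10'
set high-availability vrrp sync-group SYNC member '20'
```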
Also, using some kind of health script is probably wanted too (for example, if VYOS2 can reach the upstream gateway but VYOS1 cannot, it's time to fail over to VYOS2).
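In VyOS this can be wired up with a per-group health-check; a sketch for the scenario above (the group name '10' matches the scenario, while the script path and its contents are my assumptions - the script should exit non-zero when the check fails, e.g. when a ping to the upstream gateway gets no reply):

```
# Run the check every 5 seconds; 3 consecutive failures trigger a transition
set high-availability vrrp group 10 health-check script '/config/scripts/check-upstream.sh'
set high-availability vrrp group 10 health-check interval '5'
set high-availability vrrp group 10 health-check failure-count '3'
```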
Well, it worked in my scenario directly without much hassle. Can I really go into production with this setup? Can anyone please confirm, or was that really a fluke?
R1
eth3 and eth4 are configured as VRRP
Interface IP Address S/L Description
--------- ---------- --- -----------
eth0 192.168.5.16/24 u/u
eth1 10.10.10.16/24 u/u
eth2 10.10.20.16/24 u/u
eth3 169.254.10.1/24 u/u
10.10.40.20/24
eth4 192.168.41.17/24 u/u
192.168.41.20/24
R2
vyos@R2# run show interfaces
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface IP Address S/L Description
--------- ---------- --- -----------
eth0 10.10.10.17/24 u/u
eth1 10.10.20.17/24 u/u
eth2 169.254.10.2/24 u/u
eth3 192.168.41.16/24 u/u
eth3 on R1 and eth2 on R2 are configured with link-local addresses, and here is the VRRP config:
R1
vyos@R1# run show vrrp
Name Interface VRID State Priority Last Transition
------ ----------- ------ ------- ---------- -----------------
10 eth3 10 MASTER 100 3m21s
20 eth4 20 MASTER 100 3m21s
R2
vyos@R2# run show vrrp
Name Interface VRID State Priority Last Transition
------ ----------- ------ ------- ---------- -----------------
10 eth2 10 BACKUP 90 3m38s
20 eth3 20 BACKUP 90 3m38s
Configuration commands
R1
set high-availability vrrp group 10 interface 'eth3'
set high-availability vrrp group 10 priority '100'
set high-availability vrrp group 10 virtual-address 10.10.40.20/24
set high-availability vrrp group 10 vrid '10'
set high-availability vrrp group 20 interface 'eth4'
set high-availability vrrp group 20 priority '100'
set high-availability vrrp group 20 virtual-address 192.168.41.20/24
set high-availability vrrp group 20 vrid '20'
set high-availability vrrp sync-group UP member '10'
set high-availability vrrp sync-group UP member '20'
R2
set high-availability vrrp group 10 interface 'eth2'
set high-availability vrrp group 10 priority '90'
set high-availability vrrp group 10 virtual-address 10.10.40.20/24
set high-availability vrrp group 10 vrid '10'
set high-availability vrrp group 20 interface 'eth3'
set high-availability vrrp group 20 priority '90'
set high-availability vrrp group 20 virtual-address 192.168.41.20/24
set high-availability vrrp group 20 vrid '20'
set high-availability vrrp sync-group UP member '10'
set high-availability vrrp sync-group UP member '20'
I don't find myself having enough experience with VRRP to be able to answer all the details (such as whether rfc3768-compatibility should be enabled or not), but below is the baseline I would use.
R1 has priority 200 (higher) and R2 has priority 100 (lower).
Below config is for R1:
set high-availability vrrp global-parameters version '3'
set high-availability vrrp group WAN address 10.100.102.254/24 interface 'eth2'
set high-availability vrrp group WAN authentication password 'CHANGEME'
set high-availability vrrp group WAN authentication type 'plaintext-password'
set high-availability vrrp group WAN hello-source-address '192.168.1.1'
set high-availability vrrp group WAN interface 'eth1'
set high-availability vrrp group WAN peer-address '192.168.1.2'
set high-availability vrrp group WAN priority '200'
set high-availability vrrp group WAN vrid '101'
set high-availability vrrp group DMZ address 10.100.103.254/24 interface 'eth3'
set high-availability vrrp group DMZ authentication password 'CHANGEME'
set high-availability vrrp group DMZ authentication type 'plaintext-password'
set high-availability vrrp group DMZ hello-source-address '192.168.1.1'
set high-availability vrrp group DMZ interface 'eth1'
set high-availability vrrp group DMZ peer-address '192.168.1.2'
set high-availability vrrp group DMZ priority '200'
set high-availability vrrp group DMZ vrid '102'
set high-availability vrrp group LAN address 10.100.104.254/24 interface 'eth4'
set high-availability vrrp group LAN authentication password 'CHANGEME'
set high-availability vrrp group LAN authentication type 'plaintext-password'
set high-availability vrrp group LAN hello-source-address '192.168.1.1'
set high-availability vrrp group LAN interface 'eth1'
set high-availability vrrp group LAN peer-address '192.168.1.2'
set high-availability vrrp group LAN priority '200'
set high-availability vrrp group LAN vrid '103'
set high-availability vrrp sync-group VYOS member 'WAN'
set high-availability vrrp sync-group VYOS member 'DMZ'
set high-availability vrrp sync-group VYOS member 'LAN'
Use a dedicated SYNC interface that all VRRP traffic goes through; it can be shared with conntrack-sync (which you probably want as well). If you can't spare a dedicated SYNC interface (or a LAG), then push the VRRP sync over MGMT. Avoid pushing it anywhere upstream/downstream hosts can reach the VRRP traffic.
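A conntrack-sync sketch to pair with this, so established connections survive a failover (eth1 as the SYNC interface and the VYOS sync-group name are taken from the config example above):

```
# Replicate connection state over the SYNC link, tied to the VRRP sync-group
set service conntrack-sync accept-protocol 'tcp,udp,icmp'
set service conntrack-sync interface 'eth1'
set service conntrack-sync failover-mechanism vrrp sync-group 'VYOS'
```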
Use authentication because why not?
Use hello-source-address and peer-address, which makes VRRP use unicast instead of multicast. (Multicast sent over an L2 switch that doesn't run IGMP snooping/IGMP proxy, either because it is disabled or because the switch is unmanaged, is handled as broadcast, i.e. sent to all interfaces in the same VLAN.)
Enable a sync-group containing all VRRP groups so that failover occurs for all interfaces at once. Otherwise you can end up in a situation where R1 is active for WAN but R2 is active for LAN. The drawback of a sync-group (if I recall correctly) is that if WAN on R1 has failed and LAN on R2 has failed, VRRP won't fail over (there is no point in failing over if the partner doesn't have 100% availability and cannot reach all networks).
Bonus 1: Don't forget to go through how to properly configure tracking and health checks. For example, if an interface goes down on R1:LAN you might want to fail over. Or, for the case where R1:LAN is up but cannot reach the gateway while R2:LAN can, it's time to perform a failover.
Bonus 2: By default preemption is enabled. This means the router with the highest priority will always be active (until VRRP decides it shouldn't). So if R1 fails over to R2 for whatever reason and then recovers, traffic will fail back to R1. With preemption disabled (my recommendation), after a failover to R2 the traffic remains at R2 until VRRP decides that R2 is no good (and fails over to R1).
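Disabling preemption is a one-liner per group; a sketch for the WAN group from the config above (repeat for DMZ and LAN, on both routers):

```
set high-availability vrrp group WAN no-preempt
```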
Yes, my impression is that it should work fine to use a different IP/mask for the physical interfaces vs the VIP. For example R1: 192.168.0.1/24, R2: 192.168.0.2/24, and then the VIP: 10.0.0.1/24.
However you normally don't do that: since you have a linknet between two sets of devices, say a firewall and a router, you put the VIP within the same range as the physical interfaces (example: 169.254.1.0/29).
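One way that /29 could be laid out, with the downstream pair taking the lowest addresses and the upstream pair the highest (this table is my illustration of the idea, not the original post's exact assignment):

```
169.254.1.1   VIP of the downstream pair (e.g. firewall)
169.254.1.2   downstream unit 1 (physical)
169.254.1.3   downstream unit 2 (physical)
169.254.1.4   upstream unit 1 (physical)
169.254.1.5   upstream unit 2 (physical)
169.254.1.6   VIP of the upstream pair (e.g. router)
```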
The design is that the downstream devices get the lowest IP addresses and the upstream devices the highest. So in the above example, a downstream device with, say, a default route would point to 169.254.1.6 as next-hop, while the upstream device, routing whatever network the downstream has behind it, would use 169.254.1.1 as next-hop.
The upstream/downstream hosts should normally only interact with the VIP address, which means the physical addresses can very well be link-local, i.e. 169.254.1.0 - 169.254.254.255 (169.254.0.0/24 and 169.254.255.0/24 are reserved by RFC 3927).
For example, the physical addresses in my previous example could be link-local (don't forget to change hello-source-address and peer-address if you change the physical IP addresses).
And then you use whatever IP-addresses you like as VIP in each VRRP-group.
I have been using the latest 1.5-rolling, currently VyOS 1.5-rolling-202309130022. But the syntax should be the same for any 1.4-rolling from this year (or newer).
When it comes to rolling, I would recommend always using the latest (if you are on the rolling train) due to all the optimizations and fixes being made to it. Or call it a day and compile 1.3.3 yourself, but then some newer features and fixes will not exist in it.