Unfortunately my ISP has given me a /29 subnet and I only have 1 IP address left from that pool. I wonder if I can configure a cluster or VRRP where the physical IPs are on one subnet and the VIP is on another?
1: Ask your ISP if they can provide a link-local or RFC1918 linknet for routing and route the full /29 towards your end of that linknet. That way you can use all 8 IP addresses of the /29 range for services and not just 6 of them (having that /29 on-link wastes the first IP as network address and the last IP as broadcast address).
For example:
ISP: 169.254.1.1/30
YOU: 169.254.1.2/30
You route default (0.0.0.0/0) to 169.254.1.1.
ISP route x.x.x.x/29 to 169.254.1.2.
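On the VyOS side, that linknet setup could look something like the sketch below (the interface name eth0 is my assumption; the addresses are from the example above):

```
# Your end of the linknet towards the ISP
set interfaces ethernet eth0 address '169.254.1.2/30'

# Default route via the ISP's end of the linknet
set protocols static route 0.0.0.0/0 next-hop '169.254.1.1'
```

The ISP then routes the full x.x.x.x/29 to 169.254.1.2, and you are free to assign all 8 addresses to services behind the router.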
2: I think this should work just fine.
Arista, for example, supports this: if you don't define a subnet mask for the VIP, it will use the same one as the physical interfaces.
What happens in the background for that 2nd case (which you ask about) is that selective proxy-ARP is used: when someone on that VLAN asks "who has 10.0.0.1?", the router replies with the VIP MAC address (either a virtual one or the same as the physical interface, depending on your settings). Either way, a gratuitous ARP (GARP) is sent when the passive box becomes active.
Another option in your case, instead of using VRRP/HA, is to do a simple DNAT. That will however only forward traffic to a single host on your DMZ/LAN.
Something like:
Just incoming DNAT:
set nat destination rule 10 destination address '20.20.20.20'
set nat destination rule 10 inbound-interface 'eth1'
set nat destination rule 10 translation address '192.168.0.1'
Or a 1:1 NAT (the device on the DMZ/LAN will then use 20.20.20.20 as SNAT as well for outbound traffic):
set nat static rule 20 destination address '20.20.20.20'
set nat static rule 20 inbound-interface 'eth1'
set nat static rule 20 translation address '192.168.0.1'
If you want load balancing rather than a single DNAT target, there is also the high-availability virtual-server feature:
set high-availability virtual-server 2.2.2.2 address '2.2.2.2'
set high-availability virtual-server 2.2.2.2 algorithm 'source-hashing'
set high-availability virtual-server 2.2.2.2 delay-loop '10'
set high-availability virtual-server 2.2.2.2 forward-method 'direct'
set high-availability virtual-server 2.2.2.2 persistence-timeout '3600'
set high-availability virtual-server 2.2.2.2 port '8080'
set high-availability virtual-server 2.2.2.2 protocol 'tcp'
set high-availability virtual-server 2.2.2.2 real-server 192.168.1.101 port '80'
set high-availability virtual-server 2.2.2.2 real-server 192.168.1.102 port '80'
The above should give you that traffic going to 2.2.2.2:8080 is load-balanced towards TCP/80 on 192.168.1.101 and 192.168.1.102.
The direct forward-method is used because you want to keep the source IP when hitting the internal servers.
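Note that with the direct forward-method (LVS direct routing), each real server must also accept traffic for the VIP while not answering ARP for it. A common sketch on Linux real servers looks like this (the VIP 2.2.2.2 is from the example above; run as root):

```
# Add the VIP on loopback so the real server accepts traffic addressed to it
ip addr add 2.2.2.2/32 dev lo

# Suppress ARP for the VIP so only the load balancer answers on the LAN
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
```

Without this, the real servers would drop packets destined for the VIP, or worse, fight the load balancer over ARP for it.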
If the virtual server address is within the interface's subnet but is not the interface IP itself, you probably need to enable proxy-ARP, local proxy-ARP, or selective proxy-ARP, depending on your choice of platform.
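On VyOS, plain proxy-ARP can be enabled per interface; a one-line sketch (eth1 here is my assumption, matching the inbound interface from the NAT example above):

```
set interfaces ethernet eth1 ip enable-proxy-arp
```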
By the way, how do I fail over the entire VyOS instance in case of an interface failure? As in the scenario below:
R1
eth0 : 10.1.1.2
eth1 : 10.30.30.2
R2
eth0 : 10.1.1.3
eth1 : 10.30.30.3
VIP
eth0 : 10.1.1.1
eth1 : 10.30.30.1
VRRP Group 10
VRRP Group 20
So in case eth0 fails, only eth0 on R2 becomes master, while eth1 stays master on R1, and hence my connectivity does not work. Am I missing anything?
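A sync-group should fix exactly that: it ties the VRRP groups together so that if one member loses mastership, the others follow. A sketch for the scenario above, applied on both routers (the sync-group name SYNC is arbitrary; '10' and '20' are the group names from the scenario):

```
set high-availability vrrp sync-group SYNC member '10'
set high-availability vrrp sync-group SYNC member '20'
```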
Also, using some kind of health script is probably wanted too (for example, if VYOS2 can reach the upstream gateway but VYOS1 cannot, it's time to fail over to VYOS2).
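In VyOS this can be wired up with a per-group health-check; a sketch for the scenario above (the group name '10' matches the scenario, while the script path and its contents are my assumptions - the script should exit non-zero when the check fails, e.g. when a ping to the upstream gateway gets no reply):

```
# Run the check every 5 seconds; 3 consecutive failures trigger a transition
set high-availability vrrp group 10 health-check script '/config/scripts/check-upstream.sh'
set high-availability vrrp group 10 health-check interval '5'
set high-availability vrrp group 10 health-check failure-count '3'
```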
Well, it worked in my scenario directly without much hassle. Can I really go into production with this setup? Can anyone please confirm, or was that really a fluke?
R1
eth3 and eth4 are configured as VRRP
Interface IP Address S/L Description
--------- ---------- --- -----------
eth0 192.168.5.16/24 u/u
eth1 10.10.10.16/24 u/u
eth2 10.10.20.16/24 u/u
eth3 169.254.10.1/24 u/u
10.10.40.20/24
eth4 192.168.41.17/24 u/u
192.168.41.20/24
R2
vyos@R2# run show interfaces
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface IP Address S/L Description
--------- ---------- --- -----------
eth0 10.10.10.17/24 u/u
eth1 10.10.20.17/24 u/u
eth2 169.254.10.2/24 u/u
eth3 192.168.41.16/24 u/u
eth3 on R1 and eth2 on R2 are configured with link-local addresses, and here is the VRRP config:
R1
vyos@R1# run show vrrp
Name Interface VRID State Priority Last Transition
------ ----------- ------ ------- ---------- -----------------
10 eth3 10 MASTER 100 3m21s
20 eth4 20 MASTER 100 3m21s
R2
vyos@R2# run show vrrp
Name Interface VRID State Priority Last Transition
------ ----------- ------ ------- ---------- -----------------
10 eth2 10 BACKUP 90 3m38s
20 eth3 20 BACKUP 90 3m38s
Configuration commands
R1
set high-availability vrrp group 10 interface 'eth3'
set high-availability vrrp group 10 priority '100'
set high-availability vrrp group 10 virtual-address 10.10.40.20/24
set high-availability vrrp group 10 vrid '10'
set high-availability vrrp group 20 interface 'eth4'
set high-availability vrrp group 20 priority '100'
set high-availability vrrp group 20 virtual-address 192.168.41.20/24
set high-availability vrrp group 20 vrid '20'
set high-availability vrrp sync-group UP member '10'
set high-availability vrrp sync-group UP member '20'
R2
set high-availability vrrp group 10 interface 'eth2'
set high-availability vrrp group 10 priority '90'
set high-availability vrrp group 10 virtual-address 10.10.40.20/24
set high-availability vrrp group 10 vrid '10'
set high-availability vrrp group 20 interface 'eth3'
set high-availability vrrp group 20 priority '90'
set high-availability vrrp group 20 virtual-address 192.168.41.20/24
set high-availability vrrp group 20 vrid '20'
set high-availability vrrp sync-group UP member '10'
set high-availability vrrp sync-group UP member '20'
I don't find myself having enough experience with VRRP to be able to answer all the details (such as whether rfc3768-compatibility should be enabled or not), but below is the baseline I would use.
R1 has priority 200 (higher) and R2 has priority 100 (lower).
Below config is for R1:
set high-availability vrrp global-parameters version '3'
set high-availability vrrp group WAN address 10.100.102.254/24 interface 'eth2'
set high-availability vrrp group WAN authentication password 'CHANGEME'
set high-availability vrrp group WAN authentication type 'plaintext-password'
set high-availability vrrp group WAN hello-source-address '192.168.1.1'
set high-availability vrrp group WAN interface 'eth1'
set high-availability vrrp group WAN peer-address '192.168.1.2'
set high-availability vrrp group WAN priority '200'
set high-availability vrrp group WAN vrid '101'
set high-availability vrrp group DMZ address 10.100.103.254/24 interface 'eth3'
set high-availability vrrp group DMZ authentication password 'CHANGEME'
set high-availability vrrp group DMZ authentication type 'plaintext-password'
set high-availability vrrp group DMZ hello-source-address '192.168.1.1'
set high-availability vrrp group DMZ interface 'eth1'
set high-availability vrrp group DMZ peer-address '192.168.1.2'
set high-availability vrrp group DMZ priority '200'
set high-availability vrrp group DMZ vrid '102'
set high-availability vrrp group LAN address 10.100.104.254/24 interface 'eth4'
set high-availability vrrp group LAN authentication password 'CHANGEME'
set high-availability vrrp group LAN authentication type 'plaintext-password'
set high-availability vrrp group LAN hello-source-address '192.168.1.1'
set high-availability vrrp group LAN interface 'eth1'
set high-availability vrrp group LAN peer-address '192.168.1.2'
set high-availability vrrp group LAN priority '200'
set high-availability vrrp group LAN vrid '103'
set high-availability vrrp sync-group VYOS member 'WAN'
set high-availability vrrp sync-group VYOS member 'DMZ'
set high-availability vrrp sync-group VYOS member 'LAN'
Use a dedicated SYNC interface that all VRRP traffic goes through; it can be shared with conntrack-sync (which you probably want as well). If you can't spare a dedicated SYNC interface (or a LAG), then push the VRRP sync over MGMT. Avoid pushing it anywhere upstream/downstream hosts can reach the VRRP traffic.
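A conntrack-sync sketch to pair with this, so established connections survive a failover (eth1 as the SYNC interface and the VYOS sync-group name are taken from the config example above):

```
# Replicate connection state over the SYNC link, tied to the VRRP sync-group
set service conntrack-sync accept-protocol 'tcp,udp,icmp'
set service conntrack-sync interface 'eth1'
set service conntrack-sync failover-mechanism vrrp sync-group 'VYOS'
```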
Use authentication because why not?
Use hello-source-address and peer-address, which makes VRRP use unicast instead of multicast. (Multicast sent over an L2 switch that doesn't run IGMP snooping/IGMP proxy, either because it is disabled or because the switch is unmanaged, is handled as broadcast, i.e. sent to all interfaces in the same VLAN.)
Enable a sync-group containing all VRRP groups so that failover occurs for all interfaces at once. Otherwise you can end up in a situation where R1 is active for WAN but R2 is active for LAN. The drawback of a sync-group (if I recall correctly) is that if WAN on R1 has failed and LAN on R2 has failed, VRRP won't fail over (there is no point in failing over if the partner doesn't have 100% availability and cannot reach all networks).
Bonus 1: Don't forget to go through how to properly configure tracking and health checks. For example, if an interface goes down on R1:LAN you might want to fail over. Or, for the case where R1:LAN is up but cannot reach the gateway while R2:LAN can, it's time to perform a failover.
Bonus 2: By default preemption is enabled. This means the router with the highest priority will always be active (until VRRP decides it shouldn't). So if R1 fails over to R2 for whatever reason and then recovers, traffic will fail back to R1. With preemption disabled (my recommendation), after a failover to R2 the traffic remains at R2 until VRRP decides that R2 is no good (and fails over to R1).
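Disabling preemption is a one-liner per group; a sketch for the WAN group from the config above (repeat for DMZ and LAN, on both routers):

```
set high-availability vrrp group WAN no-preempt
```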
Yes, my impression is that it should work fine to use a different IP/mask for the physical interfaces vs the VIP. For example R1: 192.168.0.1/24, R2: 192.168.0.2/24, and then the VIP: 10.0.0.1/24.
However you normally don't do that: since you have a linknet between two sets of devices, say a firewall and a router, you put the VIP within the same range as the physical interfaces (example: 169.254.1.0/29).
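One way that /29 could be laid out, with the downstream pair taking the lowest addresses and the upstream pair the highest (this table is my illustration of the idea, not the original post's exact assignment):

```
169.254.1.1   VIP of the downstream pair (e.g. firewall)
169.254.1.2   downstream unit 1 (physical)
169.254.1.3   downstream unit 2 (physical)
169.254.1.4   upstream unit 1 (physical)
169.254.1.5   upstream unit 2 (physical)
169.254.1.6   VIP of the upstream pair (e.g. router)
```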
The design is that the downstream devices get the lowest IP addresses and the upstream devices the highest. So in the above example, a downstream device with, say, a default route would point to 169.254.1.6 as next-hop, while the upstream device, routing whatever network the downstream has behind it, would use 169.254.1.1 as next-hop.
The upstream/downstream hosts should normally only interact with the VIP address, which means the physical addresses can very well be link-local, i.e. 169.254.1.0 - 169.254.254.255 (169.254.0.0/24 and 169.254.255.0/24 are reserved by RFC 3927).
For example, the physical addresses in my previous example could be link-local (don't forget to change hello-source-address and peer-address if you change the physical IP addresses).
And then you use whatever IP-addresses you like as VIP in each VRRP-group.
I have been using the latest 1.5-rolling, currently VyOS 1.5-rolling-202309130022. But the syntax should be the same for any 1.4-rolling from this year (or newer).
When it comes to rolling, I would recommend always using the latest (if you are on the rolling train) due to all the optimizations and fixes being made to it. Or call it a day and compile 1.3.3 yourself, but then some newer features and fixes will not exist in it.