VRRP HA setup without physical IP addresses

I’ve been reading up on high availability. I’m experienced with Linux clusters, and with HSRP in the past, but VRRP, and its VyOS implementation, is new to me.

I have a setup with two VyOS instances. Both have one ISP-facing interface (connecting to the ISP’s layer 2), a config/state sync interface (a cross-cable directly between the two instances), and a list of VLANs on several LAN-facing 10G interfaces.

As I only need to connect to these instances via the out-of-band management interface, I would like to configure only that interface with a physical IP address (plus the cross-link, so state/config can be synced), and configure all other interfaces with nothing but the VIP address.

The docs state “Every VRRP router has a physical IP/IPv6 address, and a virtual address.” but I don’t want all those redundant physical IP addresses.

Does VRRP support that?

If I understand the RFC correctly, it uses multicast for every interface VRRP is active on to check state, which means the answer is no, it needs a physical IP on every interface participating.

Which means I have to think about a plan B. I have 4 subnets on the ISP side, and I don’t want to lose 2 IPs per subnet, even if I had those available (which is not the case).

The current setup, a Sophos UTM cluster, does not have any active interfaces on the standby node, apart from the interlink/sync interface between the two nodes.

Am I correct in assuming that VRRP health-check scripts only run on the master? And if not, what is the best way to determine if the router is MASTER or BACKUP?

I dunno about the details of VRRP run through keepalived (or whatever is being used in VyOS), but carpd, for example (from the FreeBSD world), has this thing where both the primary and the secondary device monitor connectivity, and it will only fail over if the other box has 100% connectivity.

Like, if you have 10 interfaces configured and int5 goes down on box1 while int7 goes down on box2, then it won’t fail over.

However, CARP works by just copying the single IP, so there is no need for physical IP addresses except on the mgmt interface (or whatever you prefer) where the config is exchanged. I think it’s link state that’s being monitored for the regular interfaces, and for the mgmt interfaces (which get one IP each) you can have a health script running that, say, pings the default gateway on the mgmt network to figure out which box has proper connectivity and whether it’s a viable option to fail over onto (because, again, it won’t fail over if the partner does not have 100% connectivity on the monitored interfaces).

I get it that VRRP requires IPs on an interface to be able to multicast the state of that specific interface, and I understand that it won’t fail over if there is a failure on an interface that isn’t monitored.

But I’m not that worried about that, as these are all VMware ESX instances, so the network availability is handled at the ESX level.

The main issue I want to guard against is a failure of the VM, or any component of the ESX server the VM runs on. Also, it takes about 3 minutes after a (re)boot for everything to be back online, so I also want HA to be able to do upgrades without downtime (by a forced fail-over to the other node).

Just to be sure I understand: you want to have one or more VIPs on an interface that does not have an IP address?

Not sure if VyOS can, but MikroTik can use IPv6 link-local addresses for handling the IPv4 VRRP group. No more wasted address space; this way VRRP can even handle a /31 subnet.

Yes, or in other words: the only IP address I want on an interface is the VIP address. Except for the management interface (so I can access each instance) and the sync interface.

There is no need for IP addresses on any of the remaining interfaces of the standby node.

In the current setup, with Sophos UTM, only the sync interface has an IP, and it doesn’t have a VIP. None of the other interfaces of the standby node have an IP, so you need to access the standby node from the active node by SSH to the sync interface IP.

Because of this, no IPs needed to be reserved, and because of that, I don’t have any spares on the public side.

Sorry for the slow reply, but, yes you can do that.

set high-availability vrrp group vrrp-group interface 'eth1'
set high-availability vrrp group vrrp-group address 192.0.2.1/32 interface eth0
set high-availability vrrp group vrrp-group preempt-delay '300'
set high-availability vrrp group vrrp-group priority '100'
set high-availability vrrp group vrrp-group vrid '12'

With this VRRP will run on eth1 but assign the IP 192.0.2.1 to eth0. I’m not sure if you can use rfc3768-compatibility with this. I do not have this setup running anywhere at the moment. I thought I did…

The only problem here is that if eth0 goes down for some reason, VRRP will not fail over.
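If that ever becomes a concern, a health check is one way to cover it. A rough, untested sketch: the script path below is hypothetical (you would have to write a script that exits non-zero when eth0 is down), and the interval and failure-count values are just placeholders.

```
# Hypothetical script path; the script itself is not shown here
set high-availability vrrp group vrrp-group health-check script '/config/scripts/check-eth0.sh'
# Placeholder timings: run the check every 10 seconds, give up after 3 failures
set high-availability vrrp group vrrp-group health-check interval '10'
set high-availability vrrp group vrrp-group health-check failure-count '3'
```

When the script keeps failing, the group should drop to fault state on that router, so the peer can take over.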

Thanks. I’ll do some testing in a virtual setup once I’ve finished all migrations, which have top priority at the moment.

I can live with that restriction. All interfaces are virtual, so the only possible reason for one going down is a change to the virtual machine definition (interface to port group mapping) or to the port group definition itself. The physical layer is handled by VMware, and is fully redundant.

The main problems I want to guard against are a crash of the VM and a crash of the entire ESX server. And I want to be able to manually trigger a failover, so I can upgrade without downtime.
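For the manual failover part, one option that might work is to temporarily disable the VRRP group on the active node, so that it releases the VIPs and the peer takes over, and then re-enable it after the upgrade. This is an untested assumption, not something confirmed in this thread; the group name is the one from the example above.

```
# On the active node, before the upgrade (assumed approach, untested)
set high-availability vrrp group vrrp-group disable
commit

# On the same node, after the upgrade
delete high-availability vrrp group vrrp-group disable
commit
```

Whether the node takes the VIPs back afterwards depends on the priority and preempt settings of both routers.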

If I interpret the xml.in correctly, then this isn’t possible?

set high-availability vrrp group vrrp-group address 192.0.2.1/32 interface eth0
set high-availability vrrp group vrrp-group address 192.0.2.2/32 interface eth0
set high-availability vrrp group vrrp-group address 192.0.2.3/32 interface eth0

i.e. multiple IPs on an interface?

I’m also confused by the remark that you can’t have both IPv4 and IPv6 addresses, or more than 20 IPv4 addresses, but have to use excluded-address instead?

Our setup is completely dual stack, so I have an eth0 with dozens of IPv4 and IPv6 addresses which all need to fail over.

You can add multiple IPs on an interface just like you did in your example.

You cannot have IPv4 and IPv6 in the same VRRP group. This is because a VRRP group is either VRRPv2 or VRRPv3. Keepalived, the software doing VRRP, does v2 for IPv4 (by default) and v3 for IPv6. You can just run two VRRP groups on the same interface, like:

set high-availability vrrp group vrrp-group-v4 address 192.0.2.1/32 interface eth0
set high-availability vrrp group vrrp-group-v4 address 192.0.2.2/32 interface eth0
set high-availability vrrp group vrrp-group-v4 address 192.0.2.3/32 interface eth0
set high-availability vrrp group vrrp-group-v6 address 2001:db8::1/128 interface eth0
set high-availability vrrp group vrrp-group-v6 address 2001:db8::2/128 interface eth0
set high-availability vrrp group vrrp-group-v6 address 2001:db8::3/128 interface eth0

Now, you can also use just one group. Like:

set high-availability vrrp group vrrp-group address 192.0.2.1/32 interface eth0
set high-availability vrrp group vrrp-group excluded-address 192.0.2.2/32 interface eth0
set high-availability vrrp group vrrp-group excluded-address 192.0.2.3/32 interface eth0
set high-availability vrrp group vrrp-group excluded-address 2001:db8::1/128 interface eth0
set high-availability vrrp group vrrp-group excluded-address 2001:db8::2/128 interface eth0
set high-availability vrrp group vrrp-group excluded-address 2001:db8::3/128 interface eth0

If you have a lot of addresses in VRRP, the second example might even be the better solution. VRRP packets will stay small and require less processing.

I’ve read that in the docs, but I don’t get what it does.

I have 4 subnets on eth0, two /28s and two /29s. Most of the addresses are used, but not all (for example, the provider runs HSRP on their end to provide the default gateway, so I already lose 3 (both routers + VIP) in that subnet).

The idea is that I have none of them on the interface, but have VRRP add them on the active node.

So using your example, say from the subnet 192.0.2.0/29 I have in use 192.0.2.1/29, 192.0.2.2/29, 192.0.2.4/29 and 192.0.2.6/29. What would the config look like? What do I include and exclude, and how would VRRP know what to include if I only specify excludes?

Ah, I had to read this a couple of times…

The excluded-address doesn’t mean the address is excluded from use; it is excluded from the VRRP process itself. Normally VRRP sends out multicast packets (advertisements) with all the VRRP addresses in them. If you have a lot of addresses, these become big and use resources.

Now, instead of making them part of the VRRP process, you can add them as excluded-address. They will still be added to the interfaces and moved to the other router, but they are outside of the VRRP packet processing. Because the IPs are kept outside VRRP processing, you can also add IPv6 to the mix. What happens is keepalived just runs ip address add $ip dev $dev for every IP that is in excluded-address.

It’s a feature which only Keepalived has and AFAIK is not part of any RFC. So, if you want to talk VRRP with a Juniper, you cannot do this. But if you have two machines running Keepalived, you can use this.

I think sometimes it’s easier to use it, since adding and removing an IP from VRRP becomes a non-problem. If you add all IPs as address and you add or remove one, VRRP will start to complain that the addresses are not the same on both machines, and one or both VRRP instances will go to FAILED state. If you use excluded-address this will not happen, because the IPs will not be sent.

Your config could be:

set high-availability vrrp group vrrp-group address 192.0.2.1/29 interface eth0
set high-availability vrrp group vrrp-group excluded-address 192.0.2.2/29 interface eth0
set high-availability vrrp group vrrp-group excluded-address 192.0.2.4/29 interface eth0
set high-availability vrrp group vrrp-group excluded-address 192.0.2.6/29 interface eth0

Or, because you do not throw IPv6 in the mix:

set high-availability vrrp group vrrp-group address 192.0.2.1/29 interface eth0
set high-availability vrrp group vrrp-group address 192.0.2.2/29 interface eth0
set high-availability vrrp group vrrp-group address 192.0.2.4/29 interface eth0
set high-availability vrrp group vrrp-group address 192.0.2.6/29 interface eth0

Hope I’m being clear. If not, just keep asking :wink:

Thanks, very much appreciated !

I’ve been reading your reply for a while now, but it still doesn’t make much sense.

Am I correct in saying that from a fail-over point of view, address and excluded-address do exactly the same, the only difference being that the excluded address isn’t part of the VRRP chat between the two nodes?

So in the example you’ve given, when the node fails over, eth0 will be assigned all 4 IP addresses?

p.s. I may not have mentioned IPv6 explicitly, but we’re fully dual stack, so every IPv4 has a corresponding IPv6 address. So in total eth0 has 30+ IP addresses that need to fail over.

Am I correct in saying that from a fail-over point of view, address and excluded-address do exactly the same, the only difference being that the excluded address isn’t part of the VRRP chat between the two nodes?

Yes, you are. It’s exactly that.

So in the example you’ve given, when the node fails over, eth0 will be assigned all 4 IP addresses?

Yes

p.s. I may not have mentioned IPv6 explicitly, but we’re fully dual stack, so every IPv4 has a corresponding IPv6 address. So in total eth0 has 30+ IP addresses that need to fail over.

Yes, so you put one IPv4 address in the address section and all other IPv4 and IPv6 addresses in the excluded-address section, OR you place one IPv6 address in the address section and all others in excluded-address.
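As a sketch of what that could look like for the 192.0.2.0/29 example: the 2001:db8:: addresses and their /64 prefix length are made-up placeholders for the matching IPv6 addresses, so substitute your own.

```
# One IPv4 address is carried in the VRRP advertisements
set high-availability vrrp group vrrp-group address 192.0.2.1/29 interface eth0
# The remaining IPv4 addresses are only added and removed on state changes
set high-availability vrrp group vrrp-group excluded-address 192.0.2.2/29 interface eth0
set high-availability vrrp group vrrp-group excluded-address 192.0.2.4/29 interface eth0
set high-availability vrrp group vrrp-group excluded-address 192.0.2.6/29 interface eth0
# Placeholder IPv6 counterparts, also kept out of the advertisements
set high-availability vrrp group vrrp-group excluded-address 2001:db8::1/64 interface eth0
set high-availability vrrp group vrrp-group excluded-address 2001:db8::2/64 interface eth0
set high-availability vrrp group vrrp-group excluded-address 2001:db8::4/64 interface eth0
set high-availability vrrp group vrrp-group excluded-address 2001:db8::6/64 interface eth0
```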

To be complete the Keepalived docs say:

    # VRRP IP excluded from VRRP optional.
    # For cases with large numbers (eg 200) of IPs
    # on the same interface. To decrease the number
    # of addresses sent in adverts, you can exclude
    # most IPs from adverts.
    # The IPs are add|del as for virtual_ipaddress.
    # Can also be used if you want to be able to add
    # a mixture of IPv4 and IPv6 addresses, since all
    # addresses in virtual_ipaddress must be of the
    # same family.

And of course, you can still run 2 instances of VRRP. One with IPv4 and one with IPv6, like in my first example. Works just as well.
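If you go for two instances, you probably want them to fail over together, which is what a sync-group is for. A minimal sketch, assuming VRRP runs over eth1 as in the earlier example (eth1 would then need IPv6 enabled for the v6 group); the vrid values, the sync-group name and the IPv6 address are placeholders.

```
# IPv4 group (placeholder vrid)
set high-availability vrrp group vrrp-group-v4 interface eth1
set high-availability vrrp group vrrp-group-v4 vrid 12
set high-availability vrrp group vrrp-group-v4 address 192.0.2.1/29 interface eth0
# IPv6 group (placeholder vrid and address)
set high-availability vrrp group vrrp-group-v6 interface eth1
set high-availability vrrp group vrrp-group-v6 vrid 13
set high-availability vrrp group vrrp-group-v6 address 2001:db8::1/64 interface eth0
# Tie both groups together so they change state as one unit
set high-availability vrrp sync-group dual-stack member vrrp-group-v4
set high-availability vrrp sync-group dual-stack member vrrp-group-v6
```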

Brilliant, thanks so much for your help.