Could there be a bug with WAN load balancing? Failed connection still shows active, not failed

Hi All,

I have a possible issue, although it could be incorrect configuration.

I’m trying to fail over from my main internet connection on eth0 to eth1.8.

I test this by shutting down the port on the upstream MikroTik switch that eth0 connects to. I see the tests fail, but the interface never goes into a failed state, so traffic is not re-routed via eth1.8, which is active and passing its test.

Version:          VyOS 1.5-rolling-202311160736
Release train:    current

This is my configuration:

vyos@vyo1-zima1.lsk1# show | commands
set wan disable-source-nat
set wan enable-local-traffic
set wan flush-connections
set wan interface-health eth0 failure-count '5'
set wan interface-health eth0 nexthop 'dhcp'
set wan interface-health eth0 success-count '2'
set wan interface-health eth0 test 10 target '1.1.2.2'
set wan interface-health eth0 test 11 target '8.8.8.8'
set wan interface-health eth1.8 nexthop '192.168.78.1'
set wan rule 2 destination address '10.0.0.0/8'
set wan rule 2 exclude
set wan rule 2 inbound-interface 'eth1'
set wan rule 2 protocol 'all'
set wan rule 9997 inbound-interface 'eth1'
set wan rule 9997 interface eth0
set wan rule 9998 failover
set wan rule 9998 inbound-interface 'eth1'
set wan rule 9998 interface eth1.8
set wan sticky-connections inbound

This is the result during a failover test:

vyos@vyo1-zima1.lsk1# run show wan-load-balance
Interface:  eth0
  Status:  active
  Last Status Change:  Sun Nov 19 10:23:03 2023
  -Test:  ping  Target: 1.1.2.2
  -Test:  ping  Target: 8.8.8.8
    Last Interface Success:  17s
    Last Interface Failure:  0s
    # Interface Failure(s):  1

Interface:  eth1.8
  Status:  active
  Last Status Change:  Sun Nov 19 10:23:03 2023
  +Test:  ping  Target: 192.168.78.1
    Last Interface Success:  6s
    Last Interface Failure:  n/a
    # Interface Failure(s):  0

Under standard operation, with both connections up:

vyos@vyo1-zima1.lsk1# run show wan-load-balance
Interface:  eth0
  Status:  active
  Last Status Change:  Sun Nov 19 10:23:03 2023
  -Test:  ping  Target: 1.1.2.2
  +Test:  ping  Target: 8.8.8.8
    Last Interface Success:  0s
    Last Interface Failure:  1h53m51s
    # Interface Failure(s):  0

Interface:  eth1.8
  Status:  active
  Last Status Change:  Sun Nov 19 10:23:03 2023
  +Test:  ping  Target: 192.168.78.1
    Last Interface Success:  0s
    Last Interface Failure:  n/a
    # Interface Failure(s):  0

Any ideas or help appreciated.

Thanks!
Jon.

Ignore the above. It turns out I wasn’t waiting long enough for the failure count to reach 5!

It has now failed over successfully.
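
For anyone who hits the same thing, here is a rough sketch of how the counters appear to behave. It is only a model, not VyOS's actual implementation. The class and method names are made up for illustration, and it assumes the counts are consecutive test rounds. The failure-count of 5 and success-count of 2 come from my config above.

```python
from dataclasses import dataclass


@dataclass
class InterfaceHealth:
    """Hypothetical model of an interface-health entry (not VyOS code)."""
    failure_count: int   # consecutive failed rounds needed to mark "failed"
    success_count: int   # consecutive passing rounds needed to mark "active"
    status: str = "active"
    failures: int = 0
    successes: int = 0

    def record(self, round_passed: bool) -> str:
        """Feed in the result of one test round and return the new status."""
        if round_passed:
            self.successes += 1
            self.failures = 0
            if self.status == "failed" and self.successes >= self.success_count:
                self.status = "active"
        else:
            self.failures += 1
            self.successes = 0
            if self.status == "active" and self.failures >= self.failure_count:
                self.status = "failed"
        return self.status


# Values taken from the eth0 config above.
eth0 = InterfaceHealth(failure_count=5, success_count=2)

# Five failed rounds in a row: status only flips on the fifth.
history = [eth0.record(False) for _ in range(5)]
print(history)  # ['active', 'active', 'active', 'active', 'failed']
```

So with "# Interface Failure(s): 1" in the output above, eth0 was still four failed rounds short of being marked failed. Lowering failure-count should make it fail over sooner, at the cost of being more sensitive to a few dropped pings.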

Cheers,
Jon.