VyOS 1.4 not forwarding over all BGP routes

I have been observing odd behavior on a VyOS 1.4 installation. I have an external router that is peering via BGP with VyOS. Advertised routes show up in VyOS and traffic is forwarded via these routes appropriately. I have recently added a second external router that is also peering via BGP. Advertised routes from this second router show up as well, but VyOS is not forwarding any traffic via them. From VyOS, the next hop on the new router can be pinged, just not from anywhere else.

It gets more odd in that if I disable BGP peering between the first router and VyOS, and then re-establish it with the second router, VyOS will start forwarding traffic via the learned routes from the second router. If I reestablish BGP peering with the first router, the behavior observed with routes learned from the second router is now observed with routes learned from the first. I can then reverse this again such that the issue is affecting routes learned from the second router.

It seems as if VyOS is only forwarding traffic over routes learned from whichever BGP peer connects first. There is no route filtering implemented on VyOS, but even if there were, it would not appear to be working consistently. Has anyone seen behavior like this before or have any thoughts on what might be causing this behavior? Thanks.

The route that you mentioned, is it the same prefix? What version are you using? We need more information about the BGP announcements and the prefix. Could you share the output of the following commands?

show ip bgp 

show ip bgp ipv4 unicast x.x.x.x/x  (prefix with issues)

show ip route

Thank you for the quick response. The routes do not share the same prefix.

The currently working route is 10.40.14.0/24 via 192.168.140.3 as next hop. The currently not-working route is 10.40.15.0/24 via 192.168.240.3 as next hop.

I was also able to dig into each external router a little bit and I found that whichever one is not being forwarded to by VyOS is also not receiving any routes from VyOS via BGP.

get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
            DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.168.240.3  Local AS: 65240

Neighbor                            AS          State Up/DownTime  BFD InMsgs  OutMsgs InPfx  OutPfx

192.168.240.1                       65002       Estab 00:02:37     NC  1836    1440    0      2

And the same from the currently working router:

get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
            DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.168.140.3  Local AS: 65140

Neighbor                            AS          State Up/DownTime  BFD InMsgs  OutMsgs InPfx  OutPfx

192.168.140.1                       65002       Estab 01:56:24     NC  897     3166    29     2

The following is the requested information from VyOS:

Version:          VyOS 1.4-rolling-202305030317
show ip bgp
BGP table version is 891, local router ID is 192.168.250.1, vrf id 0
Default local pref 100, local AS 65002
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
 *> 10.40.14.0/24    192.168.140.3                          0 65140 ?
 *> 10.40.15.0/24    192.168.240.3                          0 65240 ?
 *> 192.168.0.0/24   0.0.0.0                  0         32768 ?
 *> 192.168.100.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.110.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.111.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.112.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.113.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.120.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.121.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.122.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.123.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.124.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.130.0/24 0.0.0.0                  0         32768 ?
 *  192.168.140.0/24 192.168.140.3            0             0 65140 ?
 *>                  0.0.0.0                  0         32768 ?
 *> 192.168.150.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.200.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.210.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.211.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.212.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.213.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.220.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.221.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.222.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.223.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.224.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.230.0/24 0.0.0.0                  0         32768 ?
 *  192.168.240.0/24 192.168.240.3            0             0 65240 ?
 *>                  0.0.0.0                  0         32768 ?
 *> 192.168.250.0/24 0.0.0.0                  0         32768 ?

Displayed  29 routes and 31 total paths
show ip bgp ipv4 unicast 10.40.15.0/24
BGP routing table entry for 10.40.15.0/24, version 891
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  192.168.140.3 192.168.240.3
  65240, (aggregated by 65240 192.168.240.3)
    192.168.240.3 from 192.168.240.3 (192.168.240.3)
      Origin incomplete, valid, external, atomic-aggregate, best (First path received)
      Last update: Mon Jul 24 14:58:19 2023
show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

S>* 0.0.0.0/0 [1/0] via 192.168.0.1, eth0, weight 1, 20:39:38
B>* 10.40.14.0/24 [20/0] via 192.168.140.3, eth1.140, weight 1, 01:59:39
B>* 10.40.15.0/24 [20/0] via 192.168.240.3, eth1.240, weight 1, 00:00:03
C>* 192.168.0.0/24 is directly connected, eth0, 20:40:00
C>* 192.168.100.0/24 is directly connected, eth1.100, 20:39:55
C>* 192.168.110.0/24 is directly connected, eth1.110, 20:39:55
C>* 192.168.111.0/24 is directly connected, eth1.111, 20:39:55
C>* 192.168.112.0/24 is directly connected, eth1.112, 20:39:54
C>* 192.168.113.0/24 is directly connected, eth1.113, 20:39:54
C>* 192.168.120.0/24 is directly connected, eth1.120, 20:39:54
C>* 192.168.121.0/24 is directly connected, eth1.121, 20:39:53
C>* 192.168.122.0/24 is directly connected, eth1.122, 20:39:53
C>* 192.168.123.0/24 is directly connected, eth1.123, 20:39:52
C>* 192.168.124.0/24 is directly connected, eth1.124, 20:39:52
C>* 192.168.130.0/24 is directly connected, eth1.130, 20:39:51
C>* 192.168.140.0/24 is directly connected, eth1.140, 20:39:51
C>* 192.168.150.0/24 is directly connected, eth1.150, 20:39:51
C>* 192.168.200.0/24 is directly connected, eth1.200, 20:39:50
C>* 192.168.210.0/24 is directly connected, eth1.210, 20:39:50
C>* 192.168.211.0/24 is directly connected, eth1.211, 20:39:49
C>* 192.168.212.0/24 is directly connected, eth1.212, 20:39:49
C>* 192.168.213.0/24 is directly connected, eth1.213, 20:39:49
C>* 192.168.220.0/24 is directly connected, eth1.220, 20:39:48
C>* 192.168.221.0/24 is directly connected, eth1.221, 20:39:48
C>* 192.168.222.0/24 is directly connected, eth1.222, 20:39:48
C>* 192.168.223.0/24 is directly connected, eth1.223, 20:39:47
C>* 192.168.224.0/24 is directly connected, eth1.224, 20:39:47
C>* 192.168.230.0/24 is directly connected, eth1.230, 20:39:46
C>* 192.168.240.0/24 is directly connected, eth1.240, 20:39:46
C>* 192.168.250.0/24 is directly connected, eth1.250, 20:39:45

It looks like the route is advertised to both peers.

  Advertised to non peer-group peers:
  192.168.140.3 192.168.240.3

Does the router that’s not showing any routes have an equivalent of the “show bgp ipv4 neighbors 192.168.240.1 received-routes” command you can run to see if it’s receiving the routes but not accepting them?

Yes, something similar, and it looks like it is not receiving any routes:

get bgp neighbor 192.168.240.1 routes
BGP IPv4 table version is 439
Local router ID is 192.168.240.3
Status flags: > - best, I - internal
Origin flags: i - IGP, e - EGP, ? - incomplete

   Network                             Next Hop                            Metric       LocPrf  Weight   Path

Mon Jul 24 2023 UTC 15:56:58.301

vs. what is seen on the functional router:

get bgp neighbor 192.168.140.1 routes
BGP IPv4 table version is 91
Local router ID is 192.168.140.3
Status flags: > - best, I - internal
Origin flags: i - IGP, e - EGP, ? - incomplete

   Network                             Next Hop                            Metric       LocPrf  Weight   Path
 > 0.0.0.0/0                           192.168.140.1                       0            100     0        65002 i
 > 10.40.15.0/24                       192.168.140.1                       0            100     0        65002 65240 ?
 > 192.168.0.0/24                      192.168.140.1                       0            100     0        65002 ?
 > 192.168.100.0/24                    192.168.140.1                       0            100     0        65002 ?
 > 192.168.110.0/24                    192.168.140.1                       0            100     0        65002 ?
 > 192.168.111.0/24                    192.168.140.1                       0            100     0        65002 ?
 > 192.168.112.0/24                    192.168.140.1                       0            100     0        65002 ?

(output truncated)

What OS are the other routers running?

These are NSX-T 4.1 edges

Could you run a “show bgp ipv4 neighbors XXX.XXX.XXX.XXX advertised-routes” where the IP is that of the BGP peer that’s not seeing any of the routes?

Sure, here you go…and it looks like it “thinks” it’s advertising routes at least.

show bgp ipv4 neighbors 192.168.240.3 advertised-routes
BGP table version is 41, local router ID is 192.168.250.1, vrf id 0
Default local pref 100, local AS 65002
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

Originating default network 0.0.0.0/0

    Network          Next Hop            Metric LocPrf Weight Path
 *> 10.40.14.0/24    0.0.0.0                                0 65140 ?
 *> 10.40.15.0/24    0.0.0.0                                0 65240 ?
 *> 192.168.0.0/24   0.0.0.0                  0         32768 ?
 *> 192.168.100.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.110.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.111.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.112.0/24 0.0.0.0                  0         32768 ?
 *> 192.168.113.0/24 0.0.0.0                  0         32768 ?

(output truncated)

Here’s my speculation for the moment, just to explain my line of thinking. That output shows that VyOS is sending the routes to the peer. In your second post you showed the two outputs below. What I found interesting was that the session on the router not seeing the routes had a large number of InMsgs even though it had only been up for a couple of minutes. This, combined with the advertised-routes output, leads me to believe that it is being sent the routes.

I’m not at all familiar with VMware NSX-T, but looking at their documentation, “get bgp neighbor 192.168.240.1 routes” appears to be the equivalent of “show bgp ipv4 neighbors 192.168.240.1 routes” rather than “show bgp ipv4 neighbors 192.168.240.1 received-routes”. The difference is that routes only shows routes that were accepted, whereas received-routes shows everything, accepted or not. I don’t quite see an equivalent to “received-routes” in their docs.

Most commonly, routes are not accepted if the next hop for the routes is not in the table of the receiving router. You could try overriding the next hop on the session to see if that makes a difference. You also mentioned that when you disable the functioning session the non functioning one starts working, does the nexthop change on the advertised routes when you do that?

get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
            DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.168.240.3  Local AS: 65240

Neighbor                            AS          State Up/DownTime  BFD InMsgs  OutMsgs InPfx  OutPfx

192.168.240.1                       65002       Estab 00:02:37     NC  1836    1440    0      2

get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
            DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 192.168.140.3  Local AS: 65140

Neighbor                            AS          State Up/DownTime  BFD InMsgs  OutMsgs InPfx  OutPfx

192.168.140.1                       65002       Estab 01:56:24     NC  897     3166    29     2
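
The InMsgs comparison above can be put into numbers. This is only a rough sketch using the figures from the two summaries. The helper names are made up for illustration, and it assumes the counters cover the uptime shown, which may not hold if NSX-T keeps counting across session resets.

```python
# Rough sketch: compare average inbound BGP message rates for the two sessions.
# All figures are copied from the "get bgp neighbor summary" outputs in this thread.
# Assumption: InMsgs was accumulated during the uptime shown (it may not be).

def uptime_to_seconds(uptime: str) -> int:
    """Convert an HH:MM:SS Up/DownTime string to seconds."""
    hours, minutes, seconds = (int(part) for part in uptime.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def in_msgs_per_minute(in_msgs: int, uptime: str) -> float:
    """Average inbound messages per minute over the session uptime."""
    return in_msgs / (uptime_to_seconds(uptime) / 60)


# (neighbor as seen from the NSX edge, uptime, InMsgs, InPfx)
sessions = [
    ("192.168.240.1", "00:02:37", 1836, 0),   # edge that accepts no prefixes
    ("192.168.140.1", "01:56:24", 897, 29),   # edge that works
]

for neighbor, uptime, in_msgs, in_pfx in sessions:
    rate = in_msgs_per_minute(in_msgs, uptime)
    print(f"{neighbor}: {rate:.1f} InMsgs/min, {in_pfx} prefixes accepted")
```

If the assumption holds, the session with zero accepted prefixes is receiving messages at a far higher rate than the healthy one. That fits the idea that something is being sent to it repeatedly.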

Thanks for this analysis Kyle. Regarding your question,

You also mentioned that when you disable the functioning session the non functioning one starts working, does the nexthop change on the advertised routes when you do that?

The nexthop does not change in this scenario (or any scenario really). The routes received at the VyOS end are always the same. It looks like it is only at the external router side that the received routes are different based on which external router peered with VyOS first.

Routers can be affected by best-path selection: by default, a router installs only the best route. If it learns multiple paths, or alternative routes with a lower preference, it simply adds the best one to its RIB. To improve network performance and flexibility, consider disabling this default behavior or enabling ECMP.

Thanks Fernando. ECMP is enabled but there are not multiple paths to any destination regardless.

I’ve spent a lot more time today digging into this but still don’t feel like I’m closer to figuring it out. I was able to do a packet capture on each NSX edge device and I can see where the routes are coming in from VyOS on the working system but I see no such traffic on the non-working system (I can share these packet captures if it would help). I’ve also enabled debug logging for all bgp-related functions on VyOS but don’t see anything amiss in the messages log…it would appear that VyOS thinks the routes were sent, at least from a bgp perspective.

There are no firewalls involved (disabled at all possible locations) and nothing is between VyOS and either of the NSX edge devices.

A little interesting but I’m not sure what to make of it yet.

Packet capture on VyOS over the interface used for the “good” external router, showing when advertised routes are sent and the ack from the external router:

22:15:43.892366 IP 192.168.140.1.bgp > 192.168.140.3.39283: Flags [P.], seq 116:1742, ack 245, win 969, options [nop,nop,TS val 4089993107 ecr 325682161], length 1626: BGP
22:15:43.904630 IP 192.168.140.3.39283 > 192.168.140.1.bgp: Flags [.], ack 1742, win 237, options [nop,nop,TS val 325682306 ecr 4089993107], length 0

Similar point in time for the “bad” external router but there is never an ack and then there is some arp activity…

22:13:18.838945 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685676948 ecr 104182102], length 1692: BGP
22:13:19.066114 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685677175 ecr 104182102], length 1692: BGP
22:13:19.290131 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685677399 ecr 104182102], length 1692: BGP
22:13:19.738136 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685677847 ecr 104182102], length 1692: BGP
22:13:20.658158 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685678767 ecr 104182102], length 1692: BGP
22:13:22.450247 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685680559 ecr 104182102], length 1692: BGP
22:13:22.887370 ARP, Request who-has 192.168.240.1 tell 192.168.240.3, length 46
22:13:22.887392 ARP, Reply 192.168.240.1 is-at 00:50:56:2a:c2:9f (oui Unknown), length 28
22:13:26.034139 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685684143 ecr 104182102], length 1692: BGP
22:13:33.202179 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685691311 ecr 104182102], length 1692: BGP
22:13:47.538223 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685705647 ecr 104182102], length 1692: BGP
22:13:52.658125 ARP, Request who-has 192.168.240.3 tell 192.168.240.1, length 28
22:13:52.659017 ARP, Reply 192.168.240.3 is-at 00:50:56:b9:2c:41 (oui Unknown), length 46
22:14:16.210186 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685734319 ecr 104182102], length 1692: BGP
22:14:17.685797 IP 192.168.240.3.41699 > 192.168.240.1.bgp: Flags [P.], seq 245:264, ack 116, win 243, options [nop,nop,TS val 104241092 ecr 3685676806], length 19: BGP
22:14:17.685865 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [.], ack 264, win 969, options [nop,nop,TS val 3685735795 ecr 104241092], length 0
22:14:22.792066 ARP, Request who-has 192.168.240.1 tell 192.168.240.3, length 46
22:14:22.792097 ARP, Reply 192.168.240.1 is-at 00:50:56:2a:c2:9f (oui Unknown), length 28

For the “bad” external router, it looks like it’s retrying to send the routes since there was no ack and then doing a lookup (and getting the correct address). Then it retries sending again, followed by another lookup.

A packet capture at the “bad” router during this only shows the lookup traffic.
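
The retransmission pattern described above can also be confirmed mechanically rather than by eye. This is a minimal sketch, assuming the tcpdump text format shown in the capture. The regular expression and function name are my own, and only a few lines of the capture are embedded to keep it short.

```python
import re
from collections import Counter

# A few lines copied from the capture on the "bad" interface (eth1.240).
CAPTURE = """\
22:13:18.838945 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685676948 ecr 104182102], length 1692: BGP
22:13:19.066114 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685677175 ecr 104182102], length 1692: BGP
22:13:19.290131 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [P.], seq 116:1808, ack 245, win 969, options [nop,nop,TS val 3685677399 ecr 104182102], length 1692: BGP
22:13:22.887370 ARP, Request who-has 192.168.240.1 tell 192.168.240.3, length 46
22:14:17.685797 IP 192.168.240.3.41699 > 192.168.240.1.bgp: Flags [P.], seq 245:264, ack 116, win 243, options [nop,nop,TS val 104241092 ecr 3685676806], length 19: BGP
22:14:17.685865 IP 192.168.240.1.bgp > 192.168.240.3.41699: Flags [.], ack 264, win 969, options [nop,nop,TS val 3685735795 ecr 104241092], length 0
"""

# Matches TCP segments that carry data, i.e. lines with a "seq start:end" range.
SEGMENT = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]+)\], seq (?P<start>\d+):(?P<end>\d+),"
    r".* length (?P<length>\d+)"
)


def count_segments(capture: str) -> Counter:
    """Count how often each (src, dst, seq range) appears in the capture."""
    seen = Counter()
    for line in capture.splitlines():
        match = SEGMENT.match(line)
        if match is None:
            continue  # ARP lines and pure ACKs have no seq range
        key = (match["src"], match["dst"], int(match["start"]), int(match["end"]))
        seen[key] += 1
    return seen


segments = count_segments(CAPTURE)
for (src, dst, start, end), count in segments.items():
    if count > 1:
        print(f"{src} -> {dst} seq {start}:{end} sent {count} times ({end - start} bytes)")
```

Any seq range that shows up more than once is a retransmission. In the full capture above, the 1692-byte segment 116:1808 from VyOS appears ten times without ever being acknowledged, while the 19-byte message from the edge (seq 245:264) is sent once and acknowledged right away.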

Off the top of my head, I would verify that the BGP password and route-maps are as expected. But as you said previously, when VMware box 1 goes down everything works fine with VMware box 2, so we can probably rule out any password or route-map related issues?

My best guess based on your latest post is that the VRRP, or whatever is being used on the VMware side, needs some tweaking.

Since they work as a pair(?), perhaps they reuse each other’s MAC addresses or something like that?

If I remember correctly, with VRRP for example, one needs some additional config to make the passive unit reply properly.

No BGP password or route maps in use, so nothing to check there (it’s a lab, so it’s a very simple setup). The NSX edge devices are also not paired. They are part of separate installations and have nothing to do with each other.

And from the VyOS box you can ping both NSX devices and they do show up with different mac-addresses according to the arp table?

Could it be some VLAN tagging thingy going on?

Because as you said if you shutdown NSX1 then NSX2 will happily communicate with your VyOS box and accept the BGP data (or am I wrong)?

Years ago Abit, or maybe it was Asus, had an issue with built-in NICs using the same MAC address across different motherboards. That was fun to encounter in the wild :)

Yes, can ping both NSX Edge devices and they have different MAC addresses:

vyos@vyos:~$ ping 192.168.140.3
PING 192.168.140.3 (192.168.140.3) 56(84) bytes of data.
64 bytes from 192.168.140.3: icmp_seq=1 ttl=64 time=1.25 ms
64 bytes from 192.168.140.3: icmp_seq=2 ttl=64 time=0.953 ms
64 bytes from 192.168.140.3: icmp_seq=3 ttl=64 time=0.933 ms
^C
--- 192.168.140.3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 0.933/1.044/1.247/0.143 ms

vyos@vyos:~$ ping 192.168.240.3
PING 192.168.240.3 (192.168.240.3) 56(84) bytes of data.
64 bytes from 192.168.240.3: icmp_seq=1 ttl=64 time=0.659 ms
64 bytes from 192.168.240.3: icmp_seq=2 ttl=64 time=0.520 ms
64 bytes from 192.168.240.3: icmp_seq=3 ttl=64 time=0.519 ms
^C
--- 192.168.240.3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2029ms
rtt min/avg/max/mdev = 0.519/0.566/0.659/0.065 ms

vyos@vyos:~$ arp -a |egrep '140.3|240.3'
? (192.168.140.3) at 00:50:56:9d:9e:db [ether] on eth1.140
? (192.168.240.3) at 00:50:56:b9:2c:41 [ether] on eth1.240
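
With only two entries this is easy to eyeball, but on a larger ARP table a small script can flag any MAC address claimed by more than one IP. This is a minimal sketch, assuming the Linux `arp -a` output format shown above. The function names and the regular expression are my own.

```python
import re
from collections import defaultdict

# Output copied from the "arp -a" command above.
ARP_OUTPUT = """\
? (192.168.140.3) at 00:50:56:9d:9e:db [ether] on eth1.140
? (192.168.240.3) at 00:50:56:b9:2c:41 [ether] on eth1.240
"""

# Matches entries of the form "? (IP) at MAC [ether] on IFACE".
ENTRY = re.compile(
    r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9a-f:]{17}) \[ether\] on (?P<iface>\S+)",
    re.IGNORECASE,
)


def ips_by_mac(arp_output: str) -> dict:
    """Map each MAC address to the (IP, interface) pairs that use it."""
    owners = defaultdict(list)
    for entry in ENTRY.finditer(arp_output):
        owners[entry["mac"].lower()].append((entry["ip"], entry["iface"]))
    return dict(owners)


def duplicate_macs(arp_output: str) -> dict:
    """Return only the MAC addresses that appear for more than one IP."""
    return {mac: ips for mac, ips in ips_by_mac(arp_output).items() if len(ips) > 1}


print(duplicate_macs(ARP_OUTPUT) or "no duplicate MAC addresses found")
```

For the two entries here it reports no duplicates, which matches what the ARP table shows.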

Something I’ve recently picked up on in the logs on VyOS might be a clue but I don’t know how to interpret it. I’m seeing the following repeatedly for whichever NSX Edge is currently not receiving routes:

Jul 25 13:23:24 vyos bgpd[940]: [H27WP-RKYEA] create subgroup u16:s150
Jul 25 13:23:24 vyos bgpd[940]: [J17A0-KD5QR] peer 192.168.240.3 added to subgroup s150
Jul 25 13:23:24 vyos bgpd[940]: [WRRRF-XQPX7] u16:s150 add peer 192.168.240.3
Jul 25 13:23:25 vyos bgpd[940]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 192.168.240.3 in vrf default
Jul 25 13:23:25 vyos bgpd[940]: [HJ5WV-GHWVE] peer 192.168.240.3 deleted from subgroup s150 peer cnt 0
Jul 25 13:23:25 vyos bgpd[940]: [J17A0-KD5QR] peer 192.168.240.3 added to subgroup s147
Jul 25 13:23:25 vyos bgpd[940]: [SJ78S-X8R31] u16:s150 (1 peers) merged into u16:s147, trigger: advanced peer in queue
Jul 25 13:23:25 vyos bgpd[940]: [SMH5H-KG5HW] delete subgroup u16:s150

For the NSX Edge that is receiving routes, I only see messages like these when I disable BGP on the NSX Edge. The above message was seen in the messages log on VyOS right as it was trying to send routes to the NSX Edge that peered second. This block of messages repeats every three minutes when VyOS again tries to push routes and gets no ack from the NSX Edge.

I think I know what the logs are saying: just that the peer connection to 240.3 is being reestablished. I can see in the show bgp summary output that the connection to 240.3 is never up for more than 3 minutes before it gets reestablished. Is this expected behavior on the part of BGP when it doesn’t get an ack back from a peer when trying to advertise routes, or could there be something else going on here?