Reboot causes BGP route map to drop before BGP session, tripping peer prefix filters

Hello,

There has been a bug in VyOS for some time now. It occurs when the router is rebooted. What seems to happen is that the route maps applied to each peer are dropped before the sessions are killed. The end result is that the session trips the max-prefix filters on the peer’s router, which usually requires the peer to clear the session before it can be re-established.

One workaround I’ve found is to disable the interface before the reboot: sudo ifconfig eth6 down, then reboot.

Another possible workaround may be to manually shut down the peer before the reboot.

vyos@vyos# set protocols bgp 65001 neighbor 10.10.1.1 shutdown 

vyos@vyos:~$ show ip bgp neighbors 10.10.1.1 
BGP neighbor is 10.10.1.1, remote AS 65001, local AS 65001, internal link
Administratively shut down

You are correct, that is a workaround. However, we have 163 peers on one router alone, and we run 8 routers in our confederation, so it is not practical to shut each one down individually.
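
With that many peers, the per-neighbor shutdown commands could at least be generated rather than typed. A minimal sketch in Python, assuming the same command syntax as shown above; the function name and the peer list are hypothetical, and the output would still need to be pasted into (or scripted against) a configuration session and committed:

```python
def shutdown_commands(asn, neighbors, undo=False):
    """Build one VyOS config command per BGP neighbor.

    With undo=True the matching "delete" commands are produced, which
    remove the shutdown flag again after the reboot.
    """
    verb = "delete" if undo else "set"
    return [
        f"{verb} protocols bgp {asn} neighbor {ip} shutdown"
        for ip in neighbors
    ]


# Hypothetical peer list; in practice it would come from the running config.
peers = ["10.10.1.1", "10.10.1.2", "10.10.1.3"]

before_reboot = shutdown_commands(65001, peers)
after_reboot = shutdown_commands(65001, peers, undo=True)
```

Printing "\n".join(before_reboot) gives one line per peer that can be applied in configure mode before the reboot.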

Is there a Phabricator issue for this? I can’t seem to find it, and I have run into this issue more than once myself. I have also seen it in 1.2.1.

https://phabricator.vyos.net/T944

even with a reference to this thread.

Thanks @rob - I’ll see if I can get a repo 1.2.1-S2 and report back.

We ran into this issue as well. We tried to shut down all sessions before the reboot, but even then some routes were leaked. Has anyone found a successful workaround for this yet?

@andri can you check this option in a test env?

set protocols bgp xxxx parameters default no-ipv4-unicast
set protocols bgp xxxx neighbor x.x.x.x address-family ipv4-unicast export RMAP-XXX-OUT

You need to reboot the router after this command.

I sold the company so I don’t have this issue any more :slight_smile:

But the workaround I ended up using was to shut down those interfaces and then have a script bring them back up after reboot (with a delay).
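
For anyone who wants to try the same approach, the delayed bring-up might look roughly like the sketch below. This is not the original script: the function name, the interface list and the delay value are assumptions for illustration, and it would have to be run as root from whatever post-boot hook is available on the system.

```python
import subprocess
import time


def bring_up_later(interfaces, delay_seconds, run=subprocess.run, sleep=time.sleep):
    """Wait delay_seconds, then bring each interface up with ifconfig.

    run and sleep are injectable so the logic can be exercised without
    touching real interfaces. Returns the commands that were executed.
    """
    sleep(delay_seconds)  # give the routing daemon time to load its route maps
    executed = []
    for name in interfaces:
        cmd = ["ifconfig", name, "up"]
        run(cmd, check=True)  # raises CalledProcessError if ifconfig fails
        executed.append(cmd)
    return executed


# Example call (hypothetical interface and delay):
# bring_up_later(["eth6"], delay_seconds=120)
```

The delay is a guess that has to be tuned per router; the point is only that the peer-facing interfaces stay down until the BGP configuration is fully loaded.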

Surprised this hasn’t been fixed already.

I’m unable to reproduce this problem in my test environment as of now. Maybe it only occurs when redistributing routes learned through iBGP, but I tried to emulate my internet routing table with some static routes generated in a loop.

Also, I don’t have the set protocols bgp xxxx neighbor x.x.x.x address-family ipv4-unicast export command (VyOS 1.2.5).

I will try to adapt my test environment with another instance which will inject all routes into my IGP now.

If you use the “parameters default no-ipv4-unicast” parameter, put the neighbor in “address-family ipv4-unicast”.

Not sure if I understood correctly, but there is no neighbor in set protocols bgp 65534 address-family ipv4-unicast either :thinking:

How do you guys test the BGP implementation against a full table? I tried to simulate an upstream with another VyOS instance, but adding thousands of static routes proved to be slow going, so I think there has to be a better tool to mock those BGP messages. However, my internet search came up with decade-old scientific papers only.

@andri

set protocols bgp 65534 neighbor 10.0.0.1 address-family ipv4-unicast

To test a full view, we establish a real session with a neighbor. If you don’t have a direct link, you can use GRE tunnel interfaces.
Of course, in a test lab you can originate 40-500 prefixes and set the “maximum prefix limit” parameter on the other side.

Reboot R2 and check on R3 whether the BGP session to the neighbor gets shut down by the “max prefix limit”.

Thanks. I emulated 2.5k test routes in my lab which was not enough to trigger the bug. Will connect to a real neighbor next and conduct further tests then.
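
For reference, a batch of test routes like this can be generated in a loop along the following lines. This is only a sketch: the function name is made up, the supernet and prefix length are arbitrary, and the blackhole static route form is an assumption about the VyOS syntax in use, so it should be checked against the installed version first.

```python
import ipaddress
from itertools import islice


def static_route_commands(supernet, new_prefix, count):
    """Return up to `count` VyOS commands, one blackhole route per subnet.

    The subnets are carved out of `supernet` in ascending order.
    """
    network = ipaddress.ip_network(supernet)
    subnets = islice(network.subnets(new_prefix=new_prefix), count)
    return [f"set protocols static route {subnet} blackhole" for subnet in subnets]


# 2500 /24 routes carved out of 10.0.0.0/8 (arbitrary lab values).
commands = static_route_commands("10.0.0.0/8", 24, 2500)
```

The generated commands only create the static routes; announcing them to the peer is left to the existing lab configuration.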

@andri Try adding some neighbors. They don’t have to be real, just present in the configuration.
8-10 neighbors, with a real session to only one of them.

I finally took the time to catch up with this issue again.

All my peers have address-family ipv4-unicast set already, because I use export and import route maps like this:

andri@core01# show protocols bgp XXX neighbor XXXX
 address-family {
     ipv4-unicast {
         route-map {
             import ImportRouteMap
             export ExportRouteMap
         }
     }
 }

I tried this, and also built the setup you outlined in this post with r1, r2 and r3, but was not able to trigger the problem.

I’ll connect one of my real routes to the test environment now.

I’m no longer able to reproduce this, even with a real neighbor connected to the BGP core with all routes. I will leave it as it is for now and hope it does not happen again.