It seems that all my peers are resetting the BGP session after 3 minutes, as if my VyOS router were not sending any keepalives.
Any idea what I can check to debug this issue?
I didn’t change the default timers for any of my neighbors, and even setting the keepalive timer to 60 seconds for some neighbors doesn’t change anything…
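For reference, a 3-minute reset would be consistent with a 180-second hold time expiring. The sketch below shows the timer arithmetic from RFC 4271: the session uses the smaller of the two hold times proposed in the OPEN messages, and one third of that value is a reasonable keepalive interval. The 180-second figure is an assumption based on the common 60/180 defaults, and the function names are illustrative only.

```python
def negotiated_hold_time(local_hold, remote_hold):
    """Hold time used by the session: the smaller of the two proposed values."""
    return min(local_hold, remote_hold)


def suggested_keepalive(hold_time):
    """Keepalive interval of one third of the hold time (0 disables keepalives)."""
    if hold_time == 0:
        return 0
    return hold_time // 3


# Assumed values: both sides propose a 180-second hold time.
hold = negotiated_hold_time(180, 180)
print(hold, suggested_keepalive(hold))
```

If the peer stops hearing KEEPALIVE or UPDATE messages for the whole hold time, it closes the session, which would explain a reset at the 3-minute mark.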
All my peers are connected via WireGuard (it’s a DN42 network) and BGP is configured to use link-local IPv6 addresses with multiprotocol enabled.
I can consistently ping all my peers at any moment (before, during and after the BGP session goes up and prefixes are exchanged).
I ran tcpdump on some interfaces and I can see that just after the OPEN message, A (my peer) sends a KEEPALIVE packet to B (me), immediately followed by another KEEPALIVE message sent from B to A.
After that I can only see UPDATE messages and a KEEPALIVE message sent about every 60 seconds from A to B, which B only ACKs at the TCP level (B never sends any further KEEPALIVE messages of its own; it only ACKs the ones received from A).
After a while, A closes the connection and another cycle is started, with another exchange of OPEN messages.
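The pattern described above can be checked against the hold timer with a small sketch. The timeline below is hypothetical (it is not the real capture), and the 180-second hold time is an assumption. The peer's hold timer is restarted only by KEEPALIVE or UPDATE messages it receives, not by TCP ACKs.

```python
HOLD_TIME = 180  # seconds; assumed negotiated hold time

# Hypothetical timeline: (seconds since start, sender, BGP message type)
capture = [
    (0, "A", "OPEN"),
    (0, "B", "OPEN"),
    (1, "A", "KEEPALIVE"),
    (1, "B", "KEEPALIVE"),
    (61, "A", "KEEPALIVE"),
    (121, "A", "KEEPALIVE"),
    (181, "A", "KEEPALIVE"),
]


def hold_expiry(messages, sender, hold_time=HOLD_TIME):
    """Time at which the peer of `sender` would see its hold timer expire.

    Only KEEPALIVE and UPDATE messages from `sender` restart that timer.
    Returns None if `sender` never sent either message type.
    """
    resets = [t for t, who, kind in messages
              if who == sender and kind in ("KEEPALIVE", "UPDATE")]
    if not resets:
        return None
    return max(resets) + hold_time


print("A gives up on B at t =", hold_expiry(capture, "B"))
print("B gives up on A at t =", hold_expiry(capture, "A"))
```

With this made-up timeline, B's only KEEPALIVE is at t = 1, so A would close the session about three minutes later, matching the behavior described.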
I think it’s worth noting that I can see “clusters” of TCP errors in between the valid packets. Errors like “TCP previous segment not captured”, “TCP Dup ACK”, “TCP Out-Of-Order”, “TCP Retransmission”.
I cannot say whether they are actually the cause of my issue, because I never ran tcpdump when everything was working fine, but doesn’t it seem strange that my router never sends KEEPALIVE messages and only ACKs the ones it receives?
Try checking the routes that the router receives.
Is it possible that the router receives a more specific prefix that covers the neighbor address?
For example, the route to peer x.x.x.x is known via some multihop path (or another protocol), and once the session is established the router receives the same x.x.x.x via BGP with a higher preference, so the session goes down.
It’s worth checking.
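One way to run this check offline is sketched below: given a list of prefixes learned via BGP, it lists those that cover the neighbor address, most specific first. The addresses are documentation examples, not real peers, and `covering_prefixes` is a hypothetical helper name.

```python
import ipaddress


def covering_prefixes(neighbor, received):
    """Return the received prefixes that contain `neighbor`, longest first."""
    addr = ipaddress.ip_address(neighbor)
    nets = [ipaddress.ip_network(p) for p in received]
    matches = [n for n in nets if n.version == addr.version and addr in n]
    return sorted(matches, key=lambda n: n.prefixlen, reverse=True)


# Hypothetical neighbor address and prefixes received via BGP.
received = ["192.0.2.0/24", "192.0.2.10/32", "198.51.100.0/24"]
for net in covering_prefixes("192.0.2.10", received):
    print(net)
```

Any match that is more specific than the route currently used to reach the neighbor would be a candidate for the problem described.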
I don’t think that’s the case, for two reasons: 1) I’m using link-local IPv6 addresses to connect with my peers, so BGP knows exactly which IP and which interface to use to reach the peer. 2) Prefixes are accepted only if correctly assigned to a specific ASN, and link-local addresses are forbidden by the registry.
In any case I checked, just to be sure, and I can confirm that no fe80:: prefix is received via BGP.
They are only directly connected routes.
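The same check can be scripted over an exported list of received prefixes. This is only a sketch: the prefixes below are made up, and `link_local_prefixes` is a hypothetical helper that flags anything overlapping fe80::/10.

```python
import ipaddress

LINK_LOCAL_V6 = ipaddress.ip_network("fe80::/10")


def link_local_prefixes(received):
    """Return the received IPv6 prefixes that overlap fe80::/10."""
    flagged = []
    for prefix in received:
        net = ipaddress.ip_network(prefix)
        if net.version == 6 and net.overlaps(LINK_LOCAL_V6):
            flagged.append(str(net))
    return flagged


# Hypothetical list of prefixes received via BGP.
print(link_local_prefixes(["fd42:1234::/48", "172.20.0.0/24"]))
```

An empty result corresponds to what was observed here: no link-local prefix learned via BGP.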
I just tried that, but unfortunately nothing changed…
Here is an extract of what I see when analyzing the packets.
fe80::ade0 is the remote peer.
fe80::42:688:42:3914 is my VyOS router.