VyOS won't establish BFD connection to bird

I have configured BFD in bird and it works well between bird instances:

protocol bfd
{
        interface "local-ibgp" {
                min rx interval 100 ms;
                min tx interval 100 ms;
                idle tx interval 500 ms;
                multiplier 10;
                # disabled because VyOS doesn't seem to support?
                #password "dsfgsdfg";
        };

        neighbor 172.20.215.130 local 172.20.215.129 multihop; # another bird box
        neighbor 172.20.215.131 local 172.20.215.129 multihop; # VyOS
}

However, the session with VyOS is stuck in “Init” on the bird side, and the VyOS side even reports it as down:

vyos@SunGate1# show protocols bfd
 peer 172.20.215.129 {
     interval {
         multiplier 10
         receive 100
         transmit 100
     }
     multihop
     source {
         address 172.20.215.131
     }
 }
[edit]
vyos@SunGate1# run show bfd peers 
BFD Peers:
        peer 172.20.215.129 multihop local-address 172.20.215.131 vrf default
                ID: 3123627273
                Remote ID: 0
                Active mode
                Minimum TTL: 254
                Status: down
                Downtime: 9 minute(s), 50 second(s)
                Diagnostics: ok
                Remote diagnostics: ok
                Peer Type: configured
                RTT min/avg/max: 0/0/0 usec
                Local timers:
                        Detect-multiplier: 10
                        Receive interval: 100ms
                        Transmission interval: 100ms
                        Echo receive interval: 50ms
                        Echo transmission interval: disabled
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 1000ms
                        Transmission interval: 1000ms
                        Echo receive interval: disabled

[edit]

Any advice?

Sorry, checking in on this topic again. Still not resolved. Any advice or tips on how to even debug this?

From the description in the BIRD User's Guide: Protocols:

BIRD implements basic BFD behavior as defined in RFC 5880 (some advanced features like the echo mode or authentication are not implemented), IP transport for BFD as defined in RFC 5881 and RFC 5883 and interaction with client protocols as defined in RFC 5882.

Note that BFD implementation in BIRD is currently a new feature in development, expect some rough edges and possible UI and configuration changes in the future. Also note that we currently support at most one protocol instance.

BFD packets are sent with a dynamic source port number. Linux systems use by default a bit different dynamic port range than the IANA approved one (49152-65535). If you experience problems with compatibility, please adjust /proc/sys/net/ipv4/ip_local_port_range.

The BFD destination port should be UDP 3784, and it needs to be allowed inbound. BFD timers should match on both sides, but if they don’t, BFD will adjust to the slower side. At least on Juniper it does…
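
To make "adjusts to the slower side" concrete, here is a minimal sketch of the asynchronous-mode timer arithmetic as described in RFC 5880 (sections 6.8.4 and 6.8.7). The function names and the sample values are illustrative assumptions, not taken from bird, FRR, or Juniper code. Note also that RFC 5880 requires a Desired Min TX of at least one second while a session is not Up, so a session that never comes up keeps advertising slow timers regardless of what is configured.

```python
def negotiated_tx_interval_ms(local_desired_min_tx: int,
                              remote_required_min_rx: int) -> int:
    """Interval at which the local side may send control packets.

    The sender must not transmit faster than the peer is willing to
    receive, so the slower of the two values wins (jitter is ignored here).
    """
    return max(local_desired_min_tx, remote_required_min_rx)


def detection_time_ms(remote_detect_mult: int,
                      local_required_min_rx: int,
                      remote_desired_min_tx: int) -> int:
    """Time without received packets after which the local side declares
    the session down (asynchronous mode)."""
    return remote_detect_mult * max(local_required_min_rx, remote_desired_min_tx)


if __name__ == "__main__":
    # Hypothetical mismatch: local side configured for 100 ms,
    # peer advertises 300 ms with multiplier 3.
    print(negotiated_tx_interval_ms(100, 300))   # 300
    print(detection_time_ms(3, 100, 300))        # 900
```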

@Viacheslav thanks … indeed, I had already stumbled across this port range and adjusted it, but there was no difference:

# cat /proc/sys/net/ipv4/ip_local_port_range
49152   65535

@Borgermeister I disabled firewall (ufw) temporarily. No change.

172.20.215.129 is the IP of the bird box (Linux) and 172.20.215.131 of VyOS:

$ sudo tcpdump -i wg-bgate1 'udp and host 172.20.215.131' -n
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on wg-bgate1, link-type RAW (Raw IP), snapshot length 262144 bytes
09:50:44.160554 IP 172.20.215.129.61943 > 172.20.215.131.4784: BFDv1, Multihop, State Init, Flags: [none], length: 24
09:50:44.332434 IP 172.20.215.131.49152 > 172.20.215.129.4784: BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype single VLAN tag, li-id 2584
09:50:44.981350 IP 172.20.215.129.61943 > 172.20.215.131.4784: BFDv1, Multihop, State Init, Flags: [none], length: 24
09:50:45.292670 IP 172.20.215.131.49152 > 172.20.215.129.4784: BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype single VLAN tag, li-id 2584
09:50:45.874918 IP 172.20.215.129.61943 > 172.20.215.131.4784: BFDv1, Multihop, State Init, Flags: [none], length: 24
09:50:46.212687 IP 172.20.215.131.49152 > 172.20.215.129.4784: BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype single VLAN tag, li-id 2584
09:50:46.651933 IP 172.20.215.129.61943 > 172.20.215.131.4784: BFDv1, Multihop, State Init, Flags: [none], length: 24
09:50:47.082455 IP 172.20.215.131.49152 > 172.20.215.129.4784: BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype single VLAN tag, li-id 2584
09:50:47.538323 IP 172.20.215.129.61943 > 172.20.215.131.4784: BFDv1, Multihop, State Init, Flags: [none], length: 24
09:50:47.993083 IP 172.20.215.131.49152 > 172.20.215.129.4784: BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype single VLAN tag, li-id 2584
09:50:48.316095 IP 172.20.215.129.61943 > 172.20.215.131.4784: BFDv1, Multihop, State Init, Flags: [none], length: 24
09:50:48.893146 IP 172.20.215.131.49152 > 172.20.215.129.4784: BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype single VLAN tag, li-id 2584
09:50:49.130808 IP 172.20.215.129.61943 > 172.20.215.131.4784: BFDv1, Multihop, State Init, Flags: [none], length: 24
09:50:49.823787 IP 172.20.215.131.49152 > 172.20.215.129.4784: BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype single VLAN tag, li-id 2584
09:50:49.966059 IP 172.20.215.129.61943 > 172.20.215.131.4784: BFDv1, Multihop, State Init, Flags: [none], length: 24

Does this give any clue?
It seems bird box sends “BFDv1” packets (Source port 61943, Dest Port 4784 – different from 3784 as you said – typo??) … but no such packets come from VyOS.

Instead, VyOS sends “BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype single VLAN tag, li-id 2584” packets…

Sorry, I did not notice the multihop.

(From Juniper web site)
Single-hop BFD control packets use UDP port 3784.
Multihop BFD—One desirable application of BFD is to detect connectivity to routing devices that span multiple network hops and follow unpredictable paths. This is known as a multihop session. Multihop BFD control packets use UDP port 4784.
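
As a quick reference for reading packet captures, the port rules can be sketched as below. The destination ports come from the text above; the source port range 49152-65535 is the one RFC 5881 specifies and the BIRD guide mentions. The helper names are made up for illustration.

```python
SINGLE_HOP_PORT = 3784   # BFD control, single hop (RFC 5881)
MULTIHOP_PORT = 4784     # BFD control, multihop (RFC 5883)
SOURCE_PORT_RANGE = range(49152, 65536)  # 49152..65535 inclusive


def bfd_control_port(multihop: bool) -> int:
    """Destination UDP port a BFD control packet should be sent to."""
    return MULTIHOP_PORT if multihop else SINGLE_HOP_PORT


def is_valid_source_port(port: int) -> bool:
    """True if the UDP source port lies in the range BFD requires."""
    return port in SOURCE_PORT_RANGE


if __name__ == "__main__":
    print(bfd_control_port(multihop=True))   # 4784
    print(is_valid_source_port(61943))       # True
    print(is_valid_source_port(32768))       # False
```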

Ah got it.

OK, I am a little bit further: VyOS connects successfully if the session is direct. But when it’s multihop (this is for iBGP, where I use dummy interfaces), it’s still stuck in Init…

# show protocols bfd | strip-private 
 peer xx.xx.56.177 {
     interval {
         multiplier 10
         receive 100
         transmit 100
     }
     source {
         interface wg1
     }
 }
 peer xx.xx.56.185 {
     interval {
         multiplier 10
         receive 100
         transmit 100
     }
     source {
         interface wg0
     }
 }
 peer 172.20.215.129 {
     interval {
         multiplier 10
         receive 100
         transmit 100
     }
     multihop
     source {
         address 172.20.215.131
     }
 }
# show interfaces dummy 
 /* Dummy interface for OSPF traffic */
 dummy dum0 {
     address 172.20.215.131/32
 }

On the bird box:

# birdc show bfd sess
BIRD 2.0.8 ready.
bfd1:
IP address                Interface  State      Since         Interval  Timeout
172.20.215.131            ---        Init       10:29:43.808    1.000   10.000
xx.xx.56.186              wg-bgate1  Up         10:29:37.443    0.100    1.000

I’ve just tried to configure multihop BFD between two VyOS gateways over internet without any luck.

The firewall log shows that both gateways try to send UDP packets to port 4784:

[ipv4-INP-filter-999-A]IN=eth0.2000 OUT= MAC=bc:24:xx:6e:fb:d2:70:61:7b:dd:e8:f4:xx:00:45:00:00:34 SRC=xxx.xxx.202.164 DST=xxx.xxx.106.157 LEN=52 TOS=0x00 PREC=0x00 TTL=243 ID=36086 DF PROTO=UDP SPT=49155 DPT=4784 LEN=32
BFD Peers:
        peer xxx.xxx.202.164 multihop local-address xxx.xxx.106.157 vrf default
                ID: 861362063
                Remote ID: 0
                Active mode
                Minimum TTL: 254
                Status: down
                Downtime: 3 minute(s), 22 second(s)
                Diagnostics: ok
                Remote diagnostics: ok
                Peer Type: configured
                RTT min/avg/max: 0/0/0 usec
                Local timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms
                        Echo transmission interval: disabled
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 1000ms
                        Transmission interval: 1000ms
                        Echo receive interval: disabled

@Borgermeister so are you saying multihop does not work for you either, or that it does work and the firewall was the reason it didn’t at first?

What happens if you disable the firewall / create an accept for port 4784?

Again, in my case I tried turning it off completely (ufw disable). Also, in my case packets are exchanged (see above) and the state is not “Down” but stuck at “Init” … that suggests something more intricate is going on…

This is working for me (just replace your IP):

vyos@vyos# show | commands | match bfd
set protocols bfd peer 10.0.0.2 multihop
set protocols bfd peer 10.0.0.2 profile 'TEST'
set protocols bfd peer 10.0.0.2 source address '10.0.0.1'
set protocols bfd profile TEST interval multiplier '3'
set protocols bfd profile TEST interval receive '300'
set protocols bfd profile TEST interval transmit '300'

vyos@vyos# run show bfd peers brief 
Session count: 1
SessionId  LocalAddress                             PeerAddress                             Status         
=========  ============                             ===========                             ======         
1337796944 10.0.0.1                                 10.0.0.2                                up     

Still not working, sadly…
I tried exactly that…

Debugged a little more, running tcpdump on both the bird and the VyOS side. What happens is that both do send packets. However, VyOS seems to ignore the received packets, because it always sends “Your Discriminator: 0x00000000”.

Bird, on the other hand, does receive the packets from VyOS: it learns VyOS’s discriminator and echoes it back in its own packets: “My Discriminator: 0x5e92c12f, Your Discriminator: 0x9c259fb1”.

Note: 172.20.215.131=VyOS, 172.20.215.130=Bird

How in the world??

# tcpdump -i wg1 'udp and port 4784' -v
tcpdump: listening on wg1, link-type RAW (Raw IP), snapshot length 262144 bytes
22:17:47.555681 IP (tos 0xc0, ttl 255, id 45521, offset 0, flags [DF], proto UDP (17), length 52)
    172.20.215.131.49156 > 172.20.215.130.4784: BFDv1, length: 24
        Multihop, State Down, Flags: [none], Diagnostic: No Diagnostic (0x00)
        Detection Timer Multiplier: 10 (10000 ms Detection time), BFD Length: 24
        My Discriminator: 0x9c259fb1, Your Discriminator: 0x00000000
          Desired min Tx Interval:    1000 ms
          Required min Rx Interval:   1000 ms
          Required min Echo Interval:   50 ms
22:17:48.115742 IP (tos 0xc0, ttl 64, id 53492, offset 0, flags [none], proto UDP (17), length 52)
    172.20.215.130.61119 > 172.20.215.131.4784: BFDv1, length: 24
        Multihop, State Init, Flags: [none], Diagnostic: No Diagnostic (0x00)
        Detection Timer Multiplier: 5 (5000 ms Detection time), BFD Length: 24
        My Discriminator: 0x5e92c12f, Your Discriminator: 0x9c259fb1
          Desired min Tx Interval:    1000 ms
          Required min Rx Interval:     10 ms
          Required min Echo Interval:    0 ms
22:17:48.516166 IP (tos 0xc0, ttl 255, id 45982, offset 0, flags [DF], proto UDP (17), length 52)
    172.20.215.131.49156 > 172.20.215.130.4784: BFDv1, length: 24
        Multihop, State Down, Flags: [none], Diagnostic: No Diagnostic (0x00)
        Detection Timer Multiplier: 10 (10000 ms Detection time), BFD Length: 24
        My Discriminator: 0x9c259fb1, Your Discriminator: 0x00000000
          Desired min Tx Interval:    1000 ms
          Required min Rx Interval:   1000 ms
          Required min Echo Interval:   50 ms
22:17:48.930006 IP (tos 0xc0, ttl 64, id 53601, offset 0, flags [none], proto UDP (17), length 52)
    172.20.215.130.61119 > 172.20.215.131.4784: BFDv1, length: 24
        Multihop, State Init, Flags: [none], Diagnostic: No Diagnostic (0x00)
        Detection Timer Multiplier: 5 (5000 ms Detection time), BFD Length: 24
        My Discriminator: 0x5e92c12f, Your Discriminator: 0x9c259fb1
          Desired min Tx Interval:    1000 ms
          Required min Rx Interval:     10 ms
          Required min Echo Interval:    0 ms
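
For anyone who wants to check these fields without relying on tcpdump's decoder, here is a minimal sketch of a parser for the 24-byte mandatory section of a BFD control packet, following the layout in RFC 5880 section 4.1. The class and function names are illustrative. A peer that keeps sending "Your Discriminator" of zero has not accepted any packet from the other side yet, which is the symptom in the capture above.

```python
import struct
from typing import NamedTuple

STATES = {0: "AdminDown", 1: "Down", 2: "Init", 3: "Up"}


class BfdControl(NamedTuple):
    version: int
    diag: int
    state: str
    detect_mult: int
    length: int
    my_disc: int
    your_disc: int
    desired_min_tx_us: int       # intervals are carried in microseconds
    required_min_rx_us: int
    required_min_echo_rx_us: int


def parse_bfd_control(payload: bytes) -> BfdControl:
    """Decode the mandatory section of a BFD control packet (UDP payload)."""
    if len(payload) < 24:
        raise ValueError("BFD control packet must be at least 24 bytes")
    b0, b1, mult, length, my, your, tx, rx, echo = struct.unpack(
        "!BBBBIIIII", payload[:24]
    )
    return BfdControl(
        version=b0 >> 5,        # top 3 bits
        diag=b0 & 0x1F,         # low 5 bits
        state=STATES[b1 >> 6],  # top 2 bits; the remaining 6 are flags
        detect_mult=mult,
        length=length,
        my_disc=my,
        your_disc=your,
        desired_min_tx_us=tx,
        required_min_rx_us=rx,
        required_min_echo_rx_us=echo,
    )


def peer_has_heard_us(pkt: BfdControl) -> bool:
    """A non-zero Your Discriminator means the sender accepted one of our packets."""
    return pkt.your_disc != 0
```

Feeding it the UDP payload of a packet from each side should show a zero `your_disc` for the side that is discarding what it receives.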

I GOT IT!!!

Whooooah! It’s the TTL! Freakin’ TTL, who would have thought!

It seems Linux/bird uses TTL 64 by default (that was the only difference I saw in the tcpdump).

And the key is Section 5 of RFC 5881 - Bidirectional Forwarding Detection (BFD) for IPv4 and IPv6 (Single Hop).

A temporary fix is sysctl -w net.ipv4.ip_default_ttl="255", but it seems I need to report this as a bug in bird …
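
The sysctl changes the default TTL for every socket on the box. For comparison, a sender can also set the TTL on a single socket. The sketch below only shows that standard socket option; it is not how bird is implemented, and the function name is an assumption made for illustration.

```python
import socket


def open_udp_socket_with_ttl(ttl: int = 255) -> socket.socket:
    """Create an IPv4 UDP socket whose outgoing packets carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Per-socket override of net.ipv4.ip_default_ttl
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return sock


if __name__ == "__main__":
    s = open_udp_socket_with_ttl(255)
    print(s.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))
    s.close()
```

A per-socket setting like this leaves the TTL of all other traffic from the host untouched, which is why it is preferable to the system-wide sysctl as a long-term fix.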

Multihop is not working. Perhaps I was a bit unclear.
Both VyOS gateways send and receive BFD on UDP 4784. I had to create a firewall rule in the input filter.
I haven’t tried tcpdump yet to see what is happening.

@Borgermeister I think you may be seeing the same issue as I am. I believe this is a bug.

@L0crian In your test, are the machines directly connected or is there a hop in between? Could you confirm what the TTL is?

Here is why I think this is a bug in VyOS and not bird (and I am surprised to be the only person using BFD in multihop…):

  • Section 5 of the RFC I mentioned applies to directly connected peers only. VyOS does not seem to enforce this.
  • However, in a multihop config, the TTL can be any value.
  • VyOS discards packets if the TTL is not 254 or 255, even in a multihop config!
  • The Linux default TTL is 64, and this is also the IETF recommendation. So any packet will use TTL 64 if not otherwise specified, and after 4 hops this would be down to 60. This also applies to the packets generated by bird.
  • I confirmed with show bfd peers that it shows up as multihop: peer 172.20.215.129 multihop local-address 172.20.215.131 vrf default
  • FRR has the option minimum-ttl, but it is not exposed in VyOS. It always shows Minimum TTL: 254. This makes it more or less impossible to get a multihop config working properly.
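
The arithmetic behind these points can be sketched as follows. This assumes the receiver accepts a packet only when its TTL on arrival is at least the configured minimum, and that each router on the path decrements the TTL by one. The function names and sample values are illustrative, not FRR code.

```python
def ttl_on_arrival(initial_ttl: int, routers_on_path: int) -> int:
    """TTL seen by the receiver after each router on the path decrements it."""
    return initial_ttl - routers_on_path


def passes_minimum_ttl(initial_ttl: int, routers_on_path: int,
                       minimum_ttl: int = 254) -> bool:
    """True if the packet would survive a minimum-TTL check at the receiver."""
    return ttl_on_arrival(initial_ttl, routers_on_path) >= minimum_ttl


if __name__ == "__main__":
    # Linux default TTL 64, peers adjacent: rejected with minimum 254
    print(passes_minimum_ttl(64, 0))        # False
    # Sender forced to TTL 255, peers adjacent: accepted
    print(passes_minimum_ttl(255, 0))       # True
    # TTL 255 but six routers in between: 249 < 254, rejected
    print(passes_minimum_ttl(255, 6))       # False
    # Same path with the minimum lowered to 240: accepted
    print(passes_minimum_ttl(255, 6, 240))  # True
```

Under these assumptions, a workable minimum for a given path is 255 minus the largest number of routers expected between the peers, and the sender still has to start at TTL 255.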

Please file a bug report/feature request with a simple set of “set” commands to reproduce.
Also provide a working FRR configuration and we’ll implement this.

They were directly connected, and built to a loopback. Adding 6 hops between definitely broke it as you said:

vyos@vyos# run traceroute 10.0.0.2 source-address 10.0.0.1
traceroute to 10.0.0.2 (10.0.0.2), 30 hops max, 60 byte packets
 1  10.1.3.3 (10.1.3.3)  1.601 ms  1.224 ms  0.715 ms
 2  10.3.4.4 (10.3.4.4)  2.604 ms  2.191 ms  1.848 ms
 3  10.4.5.5 (10.4.5.5)  7.804 ms  12.549 ms  12.234 ms
 4  10.5.7.7 (10.5.7.7)  11.839 ms  11.665 ms  11.357 ms
 5  10.6.7.6 (10.6.7.6)  11.074 ms  10.919 ms  10.706 ms
 6  10.6.8.8 (10.6.8.8)  10.657 ms  8.252 ms  8.036 ms
 7  10.0.0.2 (10.0.0.2)  7.859 ms  7.685 ms  7.488 ms

vyos@vyos# run show bfd peers brief 
Session count: 1
SessionId  LocalAddress                             PeerAddress                             Status         
=========  ============                             ===========                             ======         
1846334680 10.0.0.2                                 10.0.0.1                                down 

Dropping into FRR and adjusting the minimum ttl fixed it:

vyos@vyos# run show bfd peers brief 
Session count: 1
SessionId  LocalAddress                             PeerAddress                             Status         
=========  ============                             ===========                             ======         
1846334680 10.0.0.2                                 10.0.0.1                                down           

vyos@vyos# sudo vtysh
vyos(config)# bfd
vyos(config-bfd)# profile TEST
vyos(config-bfd-profile)# minimum-ttl 240
vyos(config-bfd-profile)# end
vyos# exit

vyos@vyos# run show bfd peers brief 
Session count: 1
SessionId  LocalAddress                             PeerAddress                             Status         
=========  ============                             ===========                             ======         
1846334680 10.0.0.2                                 10.0.0.1                                up   

This is from FRR’s documentation. They seem to state that the inclusion of the “multihop” switch should allow for TTLs less than 254 without any additional configuration.

multihop tells the BFD daemon that we should expect packets with TTL less than 254 (because it will take more than one hop) and to listen on the multihop port (4784). When using multi-hop mode echo-mode will not work (see RFC 5883 section 3).

The minimum ttl command seems to be intended as a security feature to prevent rogue BFD sessions from across the world when enabling multihop.

minimum-ttl (1-254)
For multi hop sessions only: configure the minimum expected TTL for an incoming BFD control packet.

This feature serves the purpose of tightening the packet validation requirements to avoid receiving BFD control packets from other sessions.

The default value is 254 (which means we only expect one hop between this system and the peer).

It seems like this may be a bug in FRR, though it may just be inconsistent documentation, with minimum-ttl always expected to be configured. I think exposing minimum-ttl in the VyOS CLI is a good idea either way.