OSPF w/ BFD interop issues with Mikrotik?

Hi,

I have been experiencing issues with OSPF with BFD on VyOS when peering with Mikrotik devices. Scenario below:

  • Tried the latest rolling release (RR) and LTS versions; no difference.

  • VyOS peers with a Juniper MX80 and with Mikrotik devices running ROS6 and ROS7.

  • The identical config is applied to all BFD and OSPF neighbors.

  • Peers right away with the Juniper MX80, no problem. I tried flapping the interfaces on one side and the session keeps coming back without any issue. The BFD and OSPF sessions are stable.

  • Peering with ROS6 and ROS7 behaves erratically. Sometimes the peering forms right away. ROS6 issue: once the peering has formed, flapping the interfaces makes the BFD session drop, and it won’t form again. Rebooting one device gives the same result. Removing and re-adding the config on VyOS sometimes fixes it. My workaround for now is to remove BFD and lower the OSPF timers. The OSPF session here seems stable compared to the ROS7 issue below. I also ran a pcap on both devices, and I can see BFD traffic coming from the other end. The Mikrotik ROS6 device reports “Control Detection Time Expired” while VyOS reports “Neighbor Signaled Session Down”. I am not using BFD echo mode in VyOS, but I noticed that the BFD packets it sends carry a “Required Min Echo Interval” of 50ms, while Mikrotik sends 0. I wonder if this is the issue, but since I don’t use echo mode, why does VyOS do that?

  • ROS7 issue: the session won’t form and it reports seqnumbermismatch. Switching the network type from point-to-point to broadcast fixes the problem. ROS7 doesn’t have BFD, so I am using fast hello and dead timers: 1 second hello, 2 second dead. OSPF also drops randomly for no apparent reason. This is the log: default-v2 { version: 2 router-id: backbone-v2 { 0.0.0.0 } interface { broadcast 10.221.3.78%ether1.3074 } neighbor { router-id: 10.221.2.131 state: Full } state change to Init

  • Devices are interconnected through a switch via 10G DACs.
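
For anyone comparing the two pcaps, here is a minimal Python sketch that decodes the mandatory 24-byte section of a BFD control packet, so the diagnostic code and the three interval fields can be read side by side. The field layout and code points follow RFC 5880 section 4.1; the function name `parse_bfd_control` and the sample packet are made up for illustration and are not taken from the captures in this thread. Per the RFC, a Required Min Echo RX Interval of 0 means the sender does not support receiving echo packets, while a non-zero value only advertises the rate at which it is willing to loop them back; whether that difference has anything to do with the drops described above is not established here.

```python
import struct

# Diagnostic and state code points from RFC 5880 section 4.1.
DIAG = {
    0: "No Diagnostic",
    1: "Control Detection Time Expired",
    2: "Echo Function Failed",
    3: "Neighbor Signaled Session Down",
    4: "Forwarding Plane Reset",
    5: "Path Down",
    6: "Concatenated Path Down",
    7: "Administratively Down",
    8: "Reverse Concatenated Path Down",
}
STATE = {0: "AdminDown", 1: "Down", 2: "Init", 3: "Up"}


def parse_bfd_control(payload: bytes) -> dict:
    """Decode the mandatory section of a BFD control packet (UDP payload)."""
    if len(payload) < 24:
        raise ValueError("a BFD control packet is at least 24 bytes long")
    (vers_diag, sta_flags, detect_mult, length,
     my_disc, your_disc, min_tx_us, min_rx_us, min_echo_rx_us) = struct.unpack(
        "!BBBBIIIII", payload[:24])
    return {
        "version": vers_diag >> 5,                  # top 3 bits of byte 0
        "diag": DIAG.get(vers_diag & 0x1F, "Reserved"),
        "state": STATE[sta_flags >> 6],             # top 2 bits of byte 1
        "detect_mult": detect_mult,
        "length": length,
        "my_discriminator": my_disc,
        "your_discriminator": your_disc,
        # Intervals are carried in microseconds; convert to milliseconds.
        "desired_min_tx_ms": min_tx_us / 1000,
        "required_min_rx_ms": min_rx_us / 1000,
        "required_min_echo_rx_ms": min_echo_rx_us / 1000,
    }


# Synthetic packet (hypothetical values): version 1, diag 3, state Down,
# multiplier 5, 100 ms tx/rx, 50 ms Required Min Echo RX Interval.
sample = struct.pack("!BBBBIIIII",
                     (1 << 5) | 3, 1 << 6, 5, 24,
                     1, 0, 100_000, 100_000, 50_000)
fields = parse_bfd_control(sample)
print(fields["state"], "/", fields["diag"], "/",
      fields["required_min_echo_rx_ms"], "ms echo rx")
```

The payload to feed in is the UDP payload of a packet sent to port 3784 (single-hop BFD control), taken from whichever capture tool you use.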

VYOS OSPF

set protocols ospf area 0 network '10.221.3.76/30'
set protocols ospf area 0 network '10.221.3.68/30'
set protocols ospf area 0 network '10.221.0.28/30'
set protocols ospf area 0 network '10.221.0.32/30'
set protocols ospf area 0 network '10.221.3.72/30'
set protocols ospf auto-cost reference-bandwidth '10000'
set protocols ospf log-adjacency-changes
set protocols ospf parameters abr-type 'cisco'
set protocols ospf parameters router-id '10.221.2.131'
set protocols ospf redistribute connected metric-type '2'


Config facing MX80

set interfaces ethernet eth4 vif 3042 ip ospf bfd
set interfaces ethernet eth4 vif 3042 ip ospf cost '1'
set interfaces ethernet eth4 vif 3042 ip ospf dead-interval '40'
set interfaces ethernet eth4 vif 3042 ip ospf hello-interval '10'
set interfaces ethernet eth4 vif 3042 ip ospf network 'point-to-point'
set interfaces ethernet eth4 vif 3042 ip ospf priority '1'
set interfaces ethernet eth4 vif 3042 ip ospf retransmit-interval '5'
set interfaces ethernet eth4 vif 3042 ip ospf transmit-delay '1'
set protocols bfd peer 10.221.0.30 interval multiplier '5'
set protocols bfd peer 10.221.0.30 interval receive '100'
set protocols bfd peer 10.221.0.30 interval transmit '100'


Config facing Mikrotik ROS6

set interfaces ethernet eth4 vif 3072 ip ospf bfd
set interfaces ethernet eth4 vif 3072 ip ospf cost '1'
set interfaces ethernet eth4 vif 3072 ip ospf dead-interval '40'
set interfaces ethernet eth4 vif 3072 ip ospf hello-interval '10'
set interfaces ethernet eth4 vif 3072 ip ospf network 'point-to-point'
set interfaces ethernet eth4 vif 3072 ip ospf priority '1'
set interfaces ethernet eth4 vif 3072 ip ospf retransmit-interval '5'
set interfaces ethernet eth4 vif 3072 ip ospf transmit-delay '1'
set protocols bfd peer 10.221.3.70 interval multiplier '5'
set protocols bfd peer 10.221.3.70 interval receive '100'
set protocols bfd peer 10.221.3.70 interval transmit '100'
set protocols bfd peer 10.221.3.70 source interface 'eth4.3072' → optional; added because OSPF wouldn’t form at first.

Mikrotik config:

routing ospf interface add cost=65000 interface=sfp1.3072 network-type=point-to-point use-bfd=yes
routing bfd interface add interface=sfp1.3072 interval=0.1s min-rx=0.1s multiplier=5

Config facing Mikrotik ROS7

set interfaces ethernet eth4 vif 3074 ip ospf cost '1'
set interfaces ethernet eth4 vif 3074 ip ospf dead-interval '2'
set interfaces ethernet eth4 vif 3074 ip ospf hello-interval '1'
set interfaces ethernet eth4 vif 3074 ip ospf network 'broadcast'
set interfaces ethernet eth4 vif 3074 ip ospf priority '1'
set interfaces ethernet eth4 vif 3074 ip ospf retransmit-interval '5'
set interfaces ethernet eth4 vif 3074 ip ospf transmit-delay '1'
set interfaces ethernet eth4 vif 3074 mtu '9000'

Mikrotik ROS7 config

/routing ospf interface-template add area=backbone-v2 dead-interval=2s disabled=no hello-interval=1s interfaces=ether1.3074 networks=10.221.3.76/30

I see no errors so far. I’m testing in a virtual lab with two VyOS instances (LTS and rolling 1.4) and a Mikrotik CHR 6.49.2.

All devices are connected to a switch on network 172.16.20.0/24.

  • vyos 1.3.2 → 172.16.20.103/24
  • vyos rolling 1.4 → 172.16.20.102/24
  • Mikrotik → 172.16.20.101/24

Relevant config for vyos 1.3.2:

vyos@vyso-132# run show int
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address                        S/L  Description
---------        ----------                        ---  -----------
dum0             192.168.0.33/32                   u/u  
eth0             172.16.20.103/24                  u/u  
eth1             203.0.113.1/24                    u/u  
eth2             -                                 u/u  
eth3             -                                 u/u  
lo               127.0.0.1/8                       u/u  
                 ::1/128                                
[edit]
vyos@vyso-132# run show config comm | grep "ospf\|bfd"
set interfaces ethernet eth0 ip ospf bfd
set protocols bfd peer 172.16.20.101 interval multiplier '5'
set protocols bfd peer 172.16.20.101 interval receive '100'
set protocols bfd peer 172.16.20.101 interval transmit '100'
set protocols bfd peer 172.16.20.102 interval multiplier '5'
set protocols bfd peer 172.16.20.102 interval receive '100'
set protocols bfd peer 172.16.20.102 interval transmit '100'
set protocols ospf area 0 network '172.16.20.0/24'
set protocols ospf parameters router-id '192.168.0.33'
set protocols ospf redistribute connected metric-type '2'
[edit]

### BFD status: just filtering output to show status of both peers:
vyos@vyso-132# run show bfd peers | grep "Status\|Uptime"
                Status: up
                Uptime: 37 minute(s), 15 second(s)
                Status: up
                Uptime: 9 minute(s), 8 second(s)

The peer with the shorter uptime is the Mikrotik, because I disabled and re-enabled the Mikrotik interface several times, and the peer always went down and came back up as expected.

On the Mikrotik side, you can check the BFD neighbour status; some changes may be required in your setup:

[admin@MikroTik] > /routing bfd neighbor print detail 
Flags: U - up 
 0 U interface=ether2 address=172.16.20.103 protocols=ospf multihop=no state=up state-changes=4 uptime=12m13s desired-tx-interval=0.2s actual-tx-interval=0.2s required-min-rx=0.2s remote-min-rx=0.1s multiplier=5 hold-time=1s packets-rx=4194 
     packets-tx=4543 

 1 U interface=ether2 address=172.16.20.102 protocols=ospf multihop=no state=up state-changes=1 uptime=12m10s desired-tx-interval=0.2s actual-tx-interval=0.2s required-min-rx=0.2s remote-min-rx=0.1s multiplier=5 hold-time=1s packets-rx=4164 
     packets-tx=4523 
[admin@MikroTik] > 
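
As a side note on the numbers in that output, here is a small Python sketch of how the detection time works out in asynchronous mode under RFC 5880 section 6.8.4: the remote multiplier times the larger of the local Required Min RX and the remote Desired Min TX. The function name is made up for illustration, and the inputs are taken from the configs and output in this post (VyOS at 100 ms tx/rx, Mikrotik at 0.2s, multiplier 5 on both sides). The result appears consistent with the `hold-time=1s` reported by the Mikrotik, though I have not checked how RouterOS computes that field internally.

```python
def detection_time_ms(remote_multiplier: int,
                      local_required_min_rx_ms: float,
                      remote_desired_min_tx_ms: float) -> float:
    """Detection time in asynchronous mode (RFC 5880 section 6.8.4)."""
    # The peer must not send faster than our Required Min RX, so the
    # effective receive interval is the slower (larger) of the two values.
    effective_interval_ms = max(local_required_min_rx_ms,
                                remote_desired_min_tx_ms)
    return remote_multiplier * effective_interval_ms


# Mikrotik's view: VyOS multiplier 5, own min-rx 200 ms, VyOS tx 100 ms.
mikrotik_view = detection_time_ms(5, 200, 100)
# VyOS's view: Mikrotik multiplier 5, own receive 100 ms, Mikrotik tx 200 ms.
vyos_view = detection_time_ms(5, 100, 200)
# For comparison: both sides at 100 ms with multiplier 5.
both_100 = detection_time_ms(5, 100, 100)

print(mikrotik_view, vyos_view, both_100)
```

In other words, with one side at 200 ms the session effectively runs at 200 ms in both directions, so each side waits about a second before declaring the peer down, versus half a second when both sides use 100 ms.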

Hi Nick,

Thanks for your response.

Our setups are a bit different. I use network type point-to-point to bypass DR/BDR election. Also, I noticed that the timers on your CHR are set to 200ms, while those on VyOS are set to 100ms. Is there a reason you intentionally did not make them the same?

My BFD with OSPF session was stable for 5 days between VyOS LTS and Mikrotik ROS 6.49.3; then I flapped the interface today and got the same problem again. See logs.


[image: logs]

I flapped the interfaces multiple times, and the session did not come back. I then set the timers on the Mikrotik to 200ms, and it came back. I attempted to reproduce the issue with the 100ms setting on VyOS and the 200ms setting on Mikrotik by flapping interfaces and removing the VLAN on the L2 switch.

I then adjusted VyOS to use 200ms timers as well, attempted to reproduce the issue again, and can no longer reproduce it. So it seems the aggressive timers are the cause of the issue.

Have you done any testing with ROS7 too?