Odd routing behaviour with DMVPN + BGP RR + OSPF redistribution

Hi there,

I’m testing the DMVPN setup described at the link below:
https://whiskeyalpharomeo.com/2016/09/16/vyos-testing-dmvpn/

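I’m seeing a routing problem with routes redistributed from OSPF into BGP. On the hub, the 10.5.x.x prefixes resolve recursively via the eth0 gateway (10.170.191.254) instead of via tun0, as you can see below: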

vyos@wrw-rtr-vyos1:~$ show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
       I - ISIS, B - BGP, > - selected route, * - FIB route

S>* 0.0.0.0/0 [1/0] via 10.170.191.254, eth0
C>* 4.0.0.0/32 is directly connected, lo
O   5.8.128.0/24 [110/10] is directly connected, eth1, 02w1d20h
C>* 5.8.128.0/24 is directly connected, eth1
O   5.144.48.0/24 [110/10] is directly connected, eth1, 02w1d20h
C>* 5.144.48.0/24 is directly connected, eth1
O   5.149.96.0/24 [110/10] is directly connected, eth2, 02w1d20h
C>* 5.149.96.0/24 is directly connected, eth2
O   5.200.96.0/24 [110/10] is directly connected, eth2, 02w1d20h
C>* 5.200.96.0/24 is directly connected, eth2
B>* 10.5.10.0/24 [200/110] via 24.31.140.242 (recursive via 10.170.191.254), 4d04h53m
B>* 10.5.10.100/32 [200/210] via 24.31.140.242 (recursive via 10.170.191.254), 01w4d23h
B>* 10.5.20.0/24 [200/210] via 24.31.140.242 (recursive via 10.170.191.254), 01w4d23h
S>* 10.154.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 10.170.0.0/16 [1/0] via 10.170.191.254, eth0
C>* 10.170.128.0/18 is directly connected, eth0
S>* 10.200.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 10.208.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 10.232.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 10.248.0.0/16 [1/0] via 10.170.191.254, eth0
B>* 24.30.1.0/24 [200/1] via 192.168.199.5, tun0, 02w1d20h
B>* 24.31.140.0/24 [200/1] via 192.168.199.5, tun0, 02w1d20h
C>* 127.0.0.0/8 is directly connected, lo
S>* 172.18.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 172.24.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 192.168.163.0/24 [1/0] via 10.170.191.254, eth0
C>* 192.168.199.0/24 is directly connected, tun0

vyos@wrw-rtr-vyos1:~$ show interfaces
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address                        S/L  Description
---------        ----------                        ---  -----------
eth0             10.170.167.100/18                 u/u  MGMT
eth1             5.8.128.254/24                    u/u  Lebanon/Israel - AS(ME) range
                 5.144.48.254/24
eth2             5.149.96.254/24                   u/u  Iraq/Iran - AS(ME) range
                 5.200.96.254/24
lo               127.0.0.1/8                       u/u
                 4.0.0.0/32
                 ::1/128
tun0             192.168.199.4/24                  u/u

Traffic towards 10.5.x.x should flow via tun0, but it leaves via eth0:
vyos@wrw-rtr-vyos1:~$ traceroute 10.5.10.100
traceroute to 10.5.10.100 (10.5.10.100), 30 hops max, 60 byte packets
 1  10.170.191.254 (10.170.191.254)  0.271 ms  0.356 ms  0.338 ms
 2  <scrubbed>  1.275 ms  1.499 ms  1.258 ms
 3  10.225.3.45 (10.225.3.45)  1.599 ms  1.701 ms  1.801 ms
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  *^C

Here is the HUB BGP/OSPF config:

vyos@wrw-rtr-vyos1# show protocols bgp
 bgp 65000 {
     neighbor 192.168.199.1 {
         peer-group DMVPNPEERS
     }
     neighbor 192.168.199.2 {
         peer-group DMVPNPEERS
     }
     neighbor 192.168.199.3 {
         peer-group DMVPNPEERS
     }
     neighbor 192.168.199.5 {
         peer-group DMVPNPEERS
     }
     network 5.8.128.0/24 {
     }
     network 5.144.48.0/24 {
     }
     network 5.149.96.0/24 {
     }
     network 5.200.96.0/24 {
     }
     parameters {
         router-id 4.0.0.0
     }
     peer-group DMVPNPEERS {
         passive
         remote-as 65000
         route-reflector-client
         soft-reconfiguration {
             inbound
         }
         update-source 192.168.199.4
     }
     redistribute {
         ospf {
         }
     }
 }
[edit]
vyos@wrw-rtr-vyos1# show protocols ospf
 area 0.0.0.0 {
     network 5.0.0.0/8
 }
 parameters {
     abr-type cisco
     router-id 4.0.0.0
 }
 passive-interface eth0
[edit]

Here is the SPOKE1 BGP/OSPF config:

vyos@spk-rtr-vyos1# show protocols bgp
 bgp 65000 {
     neighbor 192.168.199.4 {
         remote-as 65000
         update-source 192.168.199.5
     }
     network 24.30.1.0/24 {
     }
     network 24.31.140.0/24 {
     }
     parameters {
         router-id 192.168.199.5
     }
     redistribute {
         ospf {
         }
     }
 }
[edit]
vyos@spk-rtr-vyos1# show protocols ospf
 area 0.0.0.0 {
     network 24.30.0.0/15
 }
 parameters {
     abr-type cisco
     router-id 5.0.0.0
 }
 passive-interface eth0
[edit]

Can someone help me understand whether this is a bug in VyOS or a bad configuration somewhere?

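Update: adding nexthop-self to the spoke's neighbor statement for the hub resolves this problem. The hub then installs the 10.5.x.x routes with next hop 192.168.199.5 (the spoke's tunnel address) on tun0, instead of 24.31.140.242 resolved recursively via eth0: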

vyos@spk-rtr-vyos1# show protocols bgp
 bgp 65000 {
     neighbor 192.168.199.4 {
         nexthop-self
         remote-as 65000
         update-source 192.168.199.5
     }
     network 24.30.1.0/24 {
     }
     network 24.31.140.0/24 {
     }
     parameters {
         router-id 192.168.199.5
     }
     redistribute {
         ospf {
         }
     }
 }
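
For reference, the change above can be applied on the spoke with something like the following. This is only a sketch derived from the config output shown: it assumes the same AS number (65000) and hub neighbor address (192.168.199.4), and the command path simply mirrors the config tree above, so it may differ on other VyOS releases.

```
# On the spoke (spk-rtr-vyos1), from operational mode
configure
# Advertise the spoke itself as next hop to the hub
set protocols bgp 65000 neighbor 192.168.199.4 nexthop-self
commit
save
exit
```

After the commit, the BGP session may need to re-advertise its routes before the hub picks up the new next hop.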

vyos@wrw-rtr-vyos1:~$ show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
       I - ISIS, B - BGP, > - selected route, * - FIB route

S>* 0.0.0.0/0 [1/0] via 10.170.191.254, eth0
C>* 4.0.0.0/32 is directly connected, lo
O   5.8.128.0/24 [110/10] is directly connected, eth1, 02w1d23h
C>* 5.8.128.0/24 is directly connected, eth1
O   5.144.48.0/24 [110/10] is directly connected, eth1, 02w1d23h
C>* 5.144.48.0/24 is directly connected, eth1
O   5.149.96.0/24 [110/10] is directly connected, eth2, 02w1d23h
C>* 5.149.96.0/24 is directly connected, eth2
O   5.200.96.0/24 [110/10] is directly connected, eth2, 02w1d23h
C>* 5.200.96.0/24 is directly connected, eth2
B>* 10.5.10.0/24 [200/110] via 192.168.199.5, tun0, 00:02:25
B>* 10.5.10.100/32 [200/210] via 192.168.199.5, tun0, 00:02:25
B>* 10.5.20.0/24 [200/210] via 192.168.199.5, tun0, 00:02:25
S>* 10.154.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 10.170.0.0/16 [1/0] via 10.170.191.254, eth0
C>* 10.170.128.0/18 is directly connected, eth0
S>* 10.200.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 10.208.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 10.232.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 10.248.0.0/16 [1/0] via 10.170.191.254, eth0
B>* 24.30.1.0/24 [200/1] via 192.168.199.5, tun0, 00:02:25
B>* 24.31.140.0/24 [200/1] via 192.168.199.5, tun0, 00:02:25
C>* 127.0.0.0/8 is directly connected, lo
S>* 172.18.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 172.24.0.0/16 [1/0] via 10.170.191.254, eth0
S>* 192.168.163.0/24 [1/0] via 10.170.191.254, eth0
C>* 192.168.199.0/24 is directly connected, tun0
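
To check the result from the hub side, the BGP next hop for the affected prefixes can be inspected directly. A minimal sketch, assuming these op-mode commands are available on the VyOS release in use (the received-routes view relies on the soft-reconfiguration inbound setting already present in the DMVPNPEERS peer-group):

```
# On the hub (wrw-rtr-vyos1), operational mode
# BGP table entry for one of the redistributed prefixes, including its next hop
show ip bgp 10.5.10.0/24
# Routes as received from the spoke, before any inbound policy
show ip bgp neighbors 192.168.199.5 received-routes
# Repeat the original test; the first hop should now be across tun0
traceroute 10.5.10.100
```

With nexthop-self in place, the next hop reported for the 10.5.x.x prefixes should be 192.168.199.5, matching the route table above.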