Latest rolling is broken - 1.4-rolling-202105020417

Booted fine, no errors in the console but just not working.

Couldn’t ping or ssh to the machine. Had to revert to 1.4-rolling-202104300710 to get things back up and running again.

Hello @phillipmcmahon, did VyOS receive the default route via DHCP?
Do you have a chance to get the routing table?
show ip route

Being totally honest I didn’t take the time, I needed to get online so selected a known working version on reboot. Deleted 1.4-rolling-202104300710 once in.

Considering there was no WAN access and SSH access was also broken it seems something pretty fundamental.

It looks like a known bug where the default route is deleted when some static routes are configured.

Yep, confirmed as still broken in the latest rolling release as well.

Here is my routing table. Apologies, it’s a screenshot from VMRC :confused:

I think something similar was seen before and linked back to an issue with FRR.

I see no difference with my routing table on VyOS 1.4-rolling-202104300710 that is working.

    phillipmcmahon@myrouter:~$ show ip route
    Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup

    S>* 0.0.0.0/0 [1/0] via 172.31.255.5, eth0, weight 1, 00:06:56
    S>* 10.0.0.0/8 [1/0] unreachable (blackhole), weight 1, 00:06:56
    C>* 10.0.10.0/24 is directly connected, wg8, 00:07:01
    C>* 10.67.134.85/32 is directly connected, wg0, 00:06:59
    C>* 10.67.136.97/32 is directly connected, wg1, 00:06:58
    S>* 172.16.0.0/12 [1/0] unreachable (blackhole), weight 1, 00:06:56
    C>* 172.31.255.4/30 is directly connected, eth0, 00:07:04
    S>* 192.168.0.0/16 [1/0] unreachable (blackhole), weight 1, 00:06:56
    C>* 192.168.32.0/24 is directly connected, wg7, 00:06:57
    C>* 192.168.68.0/24 is directly connected, eth1, 00:07:04
    C>* 192.168.100.0/24 is directly connected, eth2, 00:07:05
    C>* 192.168.110.0/24 is directly connected, eth3, 00:07:04
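
As a quick sanity check on the table above, here is a minimal Python sketch that confirms the default route's next hop sits inside one of the connected prefixes. The prefixes and the next hop are copied from the output; the helper name `egress_interface` is made up for illustration. If this check passes, the routing table itself is probably not where the fault lies.

```python
import ipaddress

# Connected ethernet prefixes and the default route's next hop,
# copied from the `show ip route` output above.
connected = {
    "172.31.255.4/30": "eth0",
    "192.168.68.0/24": "eth1",
    "192.168.100.0/24": "eth2",
    "192.168.110.0/24": "eth3",
}
next_hop = ipaddress.ip_address("172.31.255.5")


def egress_interface(hop, prefixes):
    """Return the interface whose connected prefix contains hop, or None."""
    for prefix, ifname in prefixes.items():
        if hop in ipaddress.ip_network(prefix):
            return ifname
    return None


print(egress_interface(next_hop, connected))
```

The next hop resolves to the same interface the static route names, so at layer 3 the table is self-consistent.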

Can you ping the default gateway?

Pinging gateways or external IPs does not work. It’s my main router at home, so I am back to a working version for now. Happy to flip a broken rolling release on and do some more diagnostics for you, just let me know what.

It seems the interface names got mixed up.
eth0 ==> eth1
eth1 ==> eth4
eth2 ==> eth0

Check the MAC addresses before the update.
Then enter configuration mode and check the MAC addresses again:

show interface

sudo tcpdump -ni eth0

Doesn’t look like the ethernet interfaces are being mixed up. MAC addresses are the same in the working and the broken rolling release versions.

tcpdump on eth0 shows repeated ARP requests asking who has 172.31.255.5, which is the next hop to my fiber modem. The same shows on eth1 as well. eth2, which is actually an interface to a VMware-mapped VLAN and should be silent, shows a whole host of what looks to be LAN traffic. So while the config file shows the same MAC address per interface, it does look like the interfaces are being mixed up under the covers.
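
For anyone wanting to check this symptom from a saved capture, here is a rough Python sketch that counts ARP requests that never get a reply. The sample lines are invented: they assume the usual `tcpdump -n` text format, and the requester address 172.31.255.6 is a guess, not taken from my router. The function name `unanswered_arp` is also just illustrative.

```python
import re

# Hypothetical `tcpdump -ni eth0` lines; the "tell" address is an assumption.
sample = """\
12:00:01.000000 ARP, Request who-has 172.31.255.5 tell 172.31.255.6, length 28
12:00:02.000000 ARP, Request who-has 172.31.255.5 tell 172.31.255.6, length 28
12:00:03.000000 ARP, Request who-has 172.31.255.5 tell 172.31.255.6, length 28
"""

REQUEST = re.compile(r"ARP, Request who-has (\S+)")
REPLY = re.compile(r"ARP, Reply (\S+) is-at ([0-9a-fA-F:]+)")


def unanswered_arp(text):
    """Map each requested IP to its request count if no reply was seen for it."""
    requested = {}
    answered = set()
    for line in text.splitlines():
        m = REQUEST.search(line)
        if m:
            requested[m.group(1)] = requested.get(m.group(1), 0) + 1
            continue
        m = REPLY.search(line)
        if m:
            answered.add(m.group(1))
    return {ip: count for ip, count in requested.items() if ip not in answered}


print(unanswered_arp(sample))
```

A gateway that shows up here with a growing count and no reply is consistent with the requests leaving on the wrong physical port.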

The latest rolling image still has the same issue. Do you know the cause yet, or can I help by providing some more info?

Is there progress on this issue, and/or do you need any more information to help?

Still no joy with the latest batch of rolling releases, “stuck” on VyOS 1.4-rolling-202104300710.

Anything you need in terms of data/diagnostics then please let me know.

What platform and network devices are you running?

I’m seeing the same issue and the only thing I am noting is during startup, my devices are not being renamed, in my case from eth1 > eth4. The devices/mac addr have not changed but for whatever reason they don’t get renamed during boot and thus configs fail to apply.

I am using ESXi running on a Protecli FW device. Nothing exotic and has been working fine for a long time.

I checked a working build and a new non-working build and didn’t see the device renaming that you mentioned. I did this by checking the config.boot file, and all looked identical. Is that how you saw this renaming happen?

Hello @phillipmcmahon, you have the correct interface binding and I see you also have a default route in your routing table, so this behavior looks very strange. Maybe WireGuard is doing something odd. In any case, this is not possible to reproduce without your configuration.

See attached.

Working is 1.4-rolling-202104300710
Broken is 1.4-rolling-202105101243

No differences at all in config.

Edit: For some reason when trying to add the multiple different files, it won’t do it correctly and keeps adding the same file multiple times.

config-working: firewall { all-ping enable broadcast-ping enable config-trap disa - Pastebin.com
config-broken: firewall { all-ping enable broadcast-ping enable config-trap disa - Pastebin.com

@phillipmcmahon could you provide the output of the following command when booted from the broken image?

sudo ip -d link show

There you go, executed on 1.4-rolling-202105101243.

iplinkshow.txt (2.7 KB) iproute.txt (1020 Bytes)
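
For comparing that output against the `hw-id` entries in config.boot, something like the following Python sketch may help. Everything in the sample is invented for illustration (the MAC addresses, the `ip link show` excerpt, and the `config_hw_id` values); only the general line format of `ip link show` is assumed. The helper names `kernel_macs` and `mismatches` are hypothetical.

```python
import re

# Hypothetical `ip link show` excerpt; names and MACs are made up.
sample = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0c:29:00:00:02 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0c:29:00:00:01 brd ff:ff:ff:ff:ff:ff
"""

# Hypothetical hw-id values as they might be written in config.boot.
config_hw_id = {"eth0": "00:0c:29:00:00:01", "eth1": "00:0c:29:00:00:02"}

HEADER = re.compile(r"^\d+:\s+([^:@\s]+)")
ETHER = re.compile(r"^\s+link/ether\s+([0-9a-fA-F:]{17})")


def kernel_macs(text):
    """Parse `ip link show` text into {interface name: MAC address}."""
    macs = {}
    current = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            current = m.group(1)
            continue
        m = ETHER.match(line)
        if m and current:
            macs[current] = m.group(1).lower()
    return macs


def mismatches(config, kernel):
    """Return {name: (configured MAC, kernel MAC)} where the two differ."""
    return {
        name: (mac, kernel.get(name))
        for name, mac in config.items()
        if kernel.get(name) != mac.lower()
    }


print(mismatches(config_hw_id, kernel_macs(sample)))
```

An empty result would mean the kernel names match the configured `hw-id` values; any entry points at an interface that was not renamed as the config expects.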

Yes, I see the problem. It looks like something is going wrong with the hw-id entries in the config.
Try deleting them and rebooting the router:

configure
delete interfaces ethernet eth0 hw-id
delete interfaces ethernet eth1 hw-id
delete interfaces ethernet eth2 hw-id
delete interfaces ethernet eth3 hw-id
commit
save
run reboot now
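
If the router has a different set of ethernet interfaces, the same command sequence can be generated rather than typed by hand. This is only a sketch: the helper `hw_id_delete_commands` is hypothetical, and it simply prints the commands shown above for whatever interface names are passed in. It does not talk to the router.

```python
def hw_id_delete_commands(interfaces, reboot=True):
    """Build the configure-mode command list that removes hw-id per interface."""
    commands = ["configure"]
    commands += [f"delete interfaces ethernet {name} hw-id" for name in interfaces]
    commands += ["commit", "save"]
    if reboot:
        commands.append("run reboot now")
    return commands


# Interface names taken from the commands above.
for line in hw_id_delete_commands(["eth0", "eth1", "eth2", "eth3"]):
    print(line)
```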