LAN connection fails when restarting switch

Hi,

I just replaced my main switch, and during testing I noticed that after a switch restart the connection to my router is lost. This behaviour is consistent across multiple restarts; the only way to re-establish the connection is to hard reset the router.

I use VyOS version 1.5-rolling-202409250007 on bare metal with an AMD Ryzen Embedded V1500B CPU. The switch is connected to the router with a DAC (the cable is new).

A bit more context: this is a home environment and I have limited networking background. On top of that, I only recently replaced my router software with VyOS. It went live a few days ago, after about a week spent migrating the old router settings, so I don't have much experience with it either.

I don’t know if the problem is really on the router, or on the switch/cable, and I need some help to troubleshoot it. Or could this be caused by a missing configuration on the router’s LAN interface?

What vendor and model are the NICs in your VyOS router, and what vendor/model is the switch?

Also try the latest nightly, just to rule things out.

The NIC is an AMD XGMAC 10GbE Controller (from lspci; I think the kernel module is amd_xgbe) and the switch is a UniFi USW Enterprise 8 POE.

I will try nightly, but it will take some time…

Anything standing out in the kernel logs for amd_xgbe?
Have you tried fixed speed/autoneg settings?

I performed another test, this time logged in on the console, and it looks like the link keeps switching between Up and Down on this interface.

Oct  7 12:58:27 vyos kernel: [  348.532418] amd-xgbe 0000:06:00.1 eth3: Link is Up - 10Gbps/Full - flow control rx/tx
Oct  7 12:58:27 vyos netplugd[983]: eth3: state INSANE flags 0x00001003 UP,BROADCAST,MULTICAST -> 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000
Oct  7 12:58:38 vyos kernel: [  359.797536] amd-xgbe 0000:06:00.1 eth3: Link is Down
Oct  7 12:58:38 vyos netplugd[983]: eth3: state INSANE flags 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000 -> 0x00001003 UP,BROADCAST,MULTICAST
Oct  7 12:58:43 vyos kernel: [  363.893999] amd-xgbe 0000:06:00.1 eth3: Link is Up - 10Gbps/Full - flow control rx/tx
Oct  7 12:58:43 vyos netplugd[983]: eth3: state INSANE flags 0x00001003 UP,BROADCAST,MULTICAST -> 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000
Oct  7 12:58:54 vyos kernel: [  375.158940] amd-xgbe 0000:06:00.1 eth3: Link is Down
Oct  7 12:58:54 vyos netplugd[983]: eth3: state INSANE flags 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000 -> 0x00001003 UP,BROADCAST,MULTICAST
Oct  7 12:58:58 vyos kernel: [  379.255486] amd-xgbe 0000:06:00.1 eth3: Link is Up - 10Gbps/Full - flow control rx/tx
[...]
Oct  7 13:01:24 vyos systemd[1]: opt-vyatta-config-tmp-new_config_4028.mount: Deactivated successfully.
Oct  7 13:01:27 vyos systemd-logind[900]: The system will reboot now!
Oct  7 13:01:27 vyos systemd-logind[900]: System is rebooting.

I looked into auto-negotiation before my initial post, but it seems fibre does not allow it, from what I found online at least.
On the switch it is set to the default, Automatically Negotiate; the other options are 10Gbps FDX and 1Gbps FDX. I already tried setting 10Gbps FDX manually on the switch and got the same result.
On the router it is set to off. I did not change it. In fact, the only settings I applied on this interface are the description and the address.
The listing from ethtool is:

Settings for eth3:
        Supported ports: [ TP ]
        Supported link modes:   10000baseCR/Full
        Supported pause frame use: No
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  10000baseCR/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 10000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Port: None
        PHYAD: 0
        Transceiver: internal
        Current message level: 0x00000034 (52)
                               link ifdown ifup
        Link detected: yes

After a switch restart, Link detected keeps alternating between yes and no.
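If it helps with watching this over time, below is a small Python sketch that turns ethtool's text output into a dictionary so fields such as Link detected can be checked from a script. The parse_ethtool helper is hypothetical (my own name), and it only assumes the "Key: value" layout shown in the listing above; it is not tied to any VyOS tooling.

```python
def parse_ethtool(text):
    """Parse 'Key: value' lines from ethtool output into a dict.

    Lines without a value (e.g. the 'Settings for eth3:' header) and
    continuation lines without a colon are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields

# Abbreviated copy of the listing from this post.
sample = """\
Settings for eth3:
        Supported link modes:   10000baseCR/Full
        Supports auto-negotiation: No
        Speed: 10000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Current message level: 0x00000034 (52)
                               link ifdown ifup
        Link detected: yes
"""

fields = parse_ethtool(sample)
print(fields["Speed"], fields["Auto-negotiation"], fields["Link detected"])

# To read live data instead of the sample (requires ethtool on the box):
#   import subprocess
#   text = subprocess.run(["ethtool", "eth3"], capture_output=True, text=True).stdout
```

Calling this in a loop with a short sleep and logging each change of Link detected would give a timeline to compare against the kernel log.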

Edit: Disconnecting and reconnecting the cable on either end after a switch restart also has no effect.

I did more research and it seems this INSANE state is triggered by a bug in netplugd that was apparently fixed in 1.2.9.2-4, but if I am reading the Debian package tracker correctly, that version is not in stable yet.

Could this be the case here?

The issue was fixed after I changed the cable.
