NIC Intel X520-DA (dual 10G NIC) Admin/Down after VyOS upgrade

I was running rolling version 1.4-rolling-202108210117 and tried to upgrade to the latest, 1.4-rolling-202109041050. After VyOS rebooted, no 10G interfaces were available.

With sh interfaces they appear with status A/D.
On the Cisco switch, the ports connected to the NIC ports appear up.

At the command prompt, ifconfig doesn’t list them.

I tried powering off the server and rebooting it, but the symptom was the same.

Starting the Dell R610 with the previous version (1.4-rolling-202108210117), the interfaces came back.

I do not know how to debug this further, find driver complaints, or find a command to force the interfaces to start up.
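One way to look for driver complaints is to filter the kernel log for lines that mention the NIC driver or the interface names. A minimal Python sketch, assuming the X520 is bound to the in-kernel ixgbe driver (the usual driver for this card family); the helper name `driver_lines` and the default search strings are made up for illustration:

```python
def driver_lines(log_text, needles=("ixgbe", "eth4", "eth5")):
    """Return the kernel log lines that mention any of the given substrings."""
    return [
        line
        for line in log_text.splitlines()
        # Assumption: the driver and interface names appear literally in the log.
        if any(needle in line for needle in needles)
    ]
```

The text to filter could come from running `dmesg` as root, for example via `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`.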

Issuing: delete interfaces ethernet eth4 disable
was not accepted.

In my configuration I needed to disable flow control (set interfaces ethernet eth4 disable-flow-control), because VyOS complained when configuring the interface and I found topics saying to do so.

Try adding "set interfaces ethernet eth4", then commit.

admin@border.router# set interfaces ethernet eth4
admin@border.router# set interfaces ethernet eth5
admin@border.router# commit
Could not set flowcontrol for eth4
[ interfaces ethernet eth4 ]
Could not set flowcontrol for eth4
VyOS had an issue completing a command.

We are sorry that you encountered a problem while using VyOS.
There are a few things you can do to help us (and yourself):

When reporting problems, please include as much information as possible:

  • do not obfuscate any data (feel free to contact us privately if your
    business policy requires it)
  • and include all the information presented below

Report Time: 2021-09-05 21:43:33
Image Version: VyOS 1.4-rolling-202109050613
Release Train: sagitta

Built by: autobuild@vyos.net
Built on: Sun 05 Sep 2021 06:13 UTC
Build UUID: 47262bc3-52a1-42d6-a52c-87c017a72ba6
Build Commit ID: 8b8a3ff535b347

Architecture: x86_64
Boot via: installed image
System type: bare metal

Hardware vendor: Dell Inc.
Hardware model: PowerEdge R610
Hardware S/N: 83KX9R1
Hardware UUID: 4c4c4544-0033-4b10-8058-b8c04f395231

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/interfaces-ethernet.py", line 201, in <module>
    apply(c)
  File "/usr/libexec/vyos/conf_mode/interfaces-ethernet.py", line 190, in apply
    e.update(ethernet)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/ethernet.py", line 351, in update
    self.set_speed_duplex(speed, duplex)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/ethernet.py", line 169, in set_speed_duplex
    cur_speed = read_file(f'/sys/class/net/{ifname}/speed')
  File "/usr/lib/python3/dist-packages/vyos/util.py", line 198, in read_file
    raise e
  File "/usr/lib/python3/dist-packages/vyos/util.py", line 193, in read_file
    data = f.read().strip()
OSError: [Errno 22] Invalid argument

noteworthy:
cmd 'ethtool --pause eth4 autoneg on tx on rx on'
returned (out):

returned (err):
rx unmodified, ignoring
tx unmodified, ignoring
Cannot set device pause parameters: Invalid argument

[[interfaces ethernet eth4]] failed
Could not set flowcontrol for eth5
[ interfaces ethernet eth5 ]
Could not set flowcontrol for eth5
VyOS had an issue completing a command.

We are sorry that you encountered a problem while using VyOS.
There are a few things you can do to help us (and yourself):

When reporting problems, please include as much information as possible:

  • do not obfuscate any data (feel free to contact us privately if your
    business policy requires it)
  • and include all the information presented below

Report Time: 2021-09-05 21:43:34
Image Version: VyOS 1.4-rolling-202109050613
Release Train: sagitta

Built by: autobuild@vyos.net
Built on: Sun 05 Sep 2021 06:13 UTC
Build UUID: 47262bc3-52a1-42d6-a52c-87c017a72ba6
Build Commit ID: 8b8a3ff535b347

Architecture: x86_64
Boot via: installed image
System type: bare metal

Hardware vendor: Dell Inc.
Hardware model: PowerEdge R610
Hardware S/N: 83KX9R1
Hardware UUID: 4c4c4544-0033-4b10-8058-b8c04f395231

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/interfaces-ethernet.py", line 201, in <module>
    apply(c)
  File "/usr/libexec/vyos/conf_mode/interfaces-ethernet.py", line 190, in apply
    e.update(ethernet)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/ethernet.py", line 351, in update
    self.set_speed_duplex(speed, duplex)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/ethernet.py", line 169, in set_speed_duplex
    cur_speed = read_file(f'/sys/class/net/{ifname}/speed')
  File "/usr/lib/python3/dist-packages/vyos/util.py", line 198, in read_file
    raise e
  File "/usr/lib/python3/dist-packages/vyos/util.py", line 193, in read_file
    data = f.read().strip()
OSError: [Errno 22] Invalid argument

noteworthy:
cmd 'ethtool --pause eth5 autoneg on tx on rx on'
returned (out):

returned (err):
rx unmodified, ignoring
tx unmodified, ignoring
Cannot set device pause parameters: Invalid argument

[[interfaces ethernet eth5]] failed
Commit failed
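
Both tracebacks end in the same place: reading /sys/class/net/<ifname>/speed raises OSError [Errno 22]. On Linux that read typically fails with "Invalid argument" while the interface is not running, which fits the A/D state reported here. The sketch below shows a read that tolerates this case instead of aborting. It is only an illustration, not the code VyOS uses or the fix that was eventually applied; the function name `read_link_speed` and the `sysfs_root` parameter are hypothetical.

```python
import os

def read_link_speed(ifname, sysfs_root="/sys/class/net"):
    """Return the link speed in Mb/s, or None if the kernel cannot report it."""
    path = os.path.join(sysfs_root, ifname, "speed")
    try:
        with open(path) as f:
            value = int(f.read().strip())
    except (OSError, ValueError):
        # OSError covers EINVAL (as in the tracebacks) and a missing file;
        # ValueError covers unexpected, non-numeric content.
        return None
    # Some drivers report -1 for "unknown" instead of failing the read.
    return value if value > 0 else None
```

A caller could treat None as "speed unknown, skip the comparison" rather than letting the exception abort the whole commit.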

After booting vyos-1.4-rolling-20210905 the same issue happens.

The Intel X520-DA NICs were in A/D state.

But I could issue

sudo -s
# ifconfig eth4 up
# ifconfig eth5 up
# exit
$ configure
# load config.boot
# commit

and my two NICs are working again.

So, for some reason I do not know, the NICs couldn’t be activated (ifconfig ethX up) during the boot process. But if I do that manually and reload config.boot, things look almost normal.

During boot I saw a message:
Waiting for NICs to settle down: settled in 1s

Maybe I need to introduce a delay at boot (I do not know how) to make the NICs come up before config.boot is executed.
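
A rough idea of what such a delay could look like, as a Python sketch: poll until the interfaces exist in sysfs, then build the commands that bring them up (the iproute2 equivalent of `ifconfig ethX up`). Whether a delay actually helps in this case is not known, and where to hook it into the VyOS boot sequence is not covered here. The function names, the timeout values and the `sysfs_root` parameter are all assumptions for illustration.

```python
import os
import time

def wait_for_interfaces(names, timeout=30.0, interval=0.5,
                        sysfs_root="/sys/class/net"):
    """Poll until every interface has a sysfs entry; return the names still missing."""
    deadline = time.monotonic() + timeout
    while True:
        missing = [n for n in names
                   if not os.path.isdir(os.path.join(sysfs_root, n))]
        if not missing or time.monotonic() >= deadline:
            return missing
        time.sleep(interval)

def link_up_commands(names):
    """Build the iproute2 commands equivalent to `ifconfig ethX up`."""
    return [["ip", "link", "set", "dev", n, "up"] for n in names]
```

Each command list could then be passed to `subprocess.run(cmd, check=True)` as root before the configuration is loaded.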

Any suggestions?

Thanks

Another data point… I’m having the same issue between 1.4-rolling-202106260417 and 1.4-rolling-202109081258.

My wrinkle is that I have 2 NICs: eth0 is an onboard I219, eth1 is an X520 (not sure whether DA or SR, will have to open it up to check). Error report attached (sorry for the screenshot, I can’t copy-paste out of my hypervisor tools while the NICs are down), but the error is “Could not determine auto-negotiation settings for interface eth0!”

I would possibly expect that if the error were on eth1, because I’m using a 3rd-party SFP module and connecting the X520 to a 1Gb switch, a combination that can apparently have issues… but it is odd that it is having an issue with the I219 before it even gets to trying to load the config for the X520.

I’ve got a couple of other machines that I can test on and see what they come up with, all of them have x520-da or x520-sr cards in them. Just can’t do it without being on site, for fairly obvious reasons.

Try downgrading the image:

set system image default-boot your-old-system-image

and reboot afterwards.

Your interfaces will come back.

I’ve gone through and done a day-by-day upgrade of the rolling releases; the problem was introduced in vyos-1.4-rolling-202109010430.
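
Testing day by day works, but when the last good and first bad builds are far apart, a binary search over the nightly images needs far fewer reboots. A minimal sketch of that idea in Python, assuming the list is sorted oldest to newest, the first entry is known good, the last is known bad, and the breakage never goes away again in between; `first_bad` and the `is_bad` callback (install the image, reboot, check the interfaces) are hypothetical names:

```python
def first_bad(releases, is_bad):
    """Binary search for the first bad release in an oldest-to-newest list."""
    lo, hi = 0, len(releases) - 1  # invariant: releases[lo] good, releases[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(releases[mid]):
            hi = mid   # the breakage is at mid or earlier
        else:
            lo = mid   # the breakage is after mid
    return releases[hi]
```

Each call to `is_bad` costs one install and reboot, so a span of about 70 nightlies needs roughly 6 tests instead of up to 70.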

Looking at the changelog… there isn’t a changelog for that night. Either it’s an upstream issue, e.g. kernel drivers (possibly related: this thread), or there was an undocumented change that night that might have upset something?

I don’t understand any of the code, but there are 2 commits around then that sound like they may have something to do with this:

I’ve just tried doing a clean install with the latest nightly on my setup (VM inside xcp-ng) with the same result - the live OS version can’t bring up the interfaces even before anything is configured.

Looks to have been fixed, probably by this ticket (T3874).