Mellanox 100G QSFP+ Install Error

Hi,

When installing VyOS on hardware with a Mellanox 100G card and an Intel 10G PCI card, we get the error shown in the attachment. What could be the reason? The VyOS version is 1.4.

Our hardware features are as follows.

2x E5-2670v2
64GB RAM
120GB SSD

HPE DL380p Gen8

I’ve been running VyOS with a mix of ConnectX-4/5 25G/100G cards and have never encountered a panic at boot.
Does the machine boot without the card installed? If so, I’d try installing the latest firmware to rule out issues there.

We cannot remove the cards from our current device. I just installed version 1.5, but it does not see the 10G ports. A driver needs to be installed for this, but I do not know how to do that.

Then I’d try booting a Debian Live system to see what is going on.
If the OS boots, you can upgrade the firmware from it, test drivers, etc.

Can you help with this matter?

@huseyintr27 That was a good suggestion for narrowing down the problem :wink:
If it works correctly in Debian but not in VyOS, we should be able to find the solution. If it also fails on a clean Debian, the solution will be a little more difficult to find.

At first glance, the firmware on the NIC should be updated, but that is still not clear.

@huseyintr27 I did a quick check: ERST is part of ACPI, so this looks more like an issue with the server firmware than with the NICs. If you google the error message, you will find some references in the HPE forums.
You could try upgrading or resetting the server’s firmware and BIOS.
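
To check this on a system that does boot (for example a Debian Live image), a rough sketch like the one below could separate ACPI/ERST messages from Mellanox driver messages in the kernel log. The function name `classify_boot_log` and the grep patterns are illustrative assumptions; they are not taken from the error in the attachment.

```shell
# Split kernel log lines into firmware-related (ACPI/ERST) and NIC-driver (mlx5) lines.
# Reads log text on stdin. The patterns are assumptions; adjust them to the real message.
classify_boot_log() {
  grep -iE 'erst|acpi|mlx5' | while IFS= read -r line; do
    case "$line" in
      *[Mm][Ll][Xx]5*) printf 'nic: %s\n' "$line" ;;       # Mellanox driver message
      *)               printf 'firmware: %s\n' "$line" ;;  # ACPI/ERST message
    esac
  done
}

# Usage on the affected machine (may need root):
#   dmesg | classify_boot_log
```

If only `firmware:` lines show errors, that would point toward the server BIOS/firmware rather than the NIC.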

When I installed version 1.5.x, it booted, but this time it does not detect the cards.

Is it visible with lspci?
If so, I’d boot a Debian Live system and install the Mellanox MFT package to upgrade the firmware of the NIC.

vyos@vyos:~$ lspci | grep Mellanox
21:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
21:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]

It gave this result. So 1.5.x does see the Mellanox card, and the installed 10G PCI card is an Emulex.
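
Since lspci lists the cards, a possible next step is to check whether a kernel driver is actually bound to each Ethernet device. The sketch below assumes the `lspci -k` output format from pciutils; the helper name `unbound_nics` is made up for illustration.

```shell
# Print every "Ethernet controller" entry that has no "Kernel driver in use" line.
# Reads `lspci -k` style text on stdin.
unbound_nics() {
  awk '
    /^[0-9a-f]/ {                        # a new device header (starts with the PCI address)
      if (dev != "" && !bound) print dev # previous NIC had no driver bound
      dev = ""; bound = 0
      if ($0 ~ /Ethernet controller/) dev = $0
      next
    }
    /Kernel driver in use:/ { bound = 1 }
    END { if (dev != "" && !bound) print dev }
  '
}

# Usage on the affected machine:
#   lspci -k | unbound_nics
```

Any device this prints is visible on the PCI bus but has no driver attached, which would match ports not showing up as interfaces.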