I have a Mellanox SN2010 switch in my hands (4x QSFP28, 18x SFP28, with an Intel C2558 CPU and 8 GB of RAM) that I’d like to use with VyOS. If you’re not familiar with Mellanox’s SNxxxx switches, they’re mostly controlled by the mlxsw module in the stock kernel, which has been part of VyOS for a while. Out of the box, it’ll boot recent VyOS nightly builds, update the switch firmware if needed, discover all of the ports, and mostly work right.
Thanks to switchdev, there shouldn’t be any real support needed for L2/L3 configs in VyOS. The mlxsw module copies generic Linux L2 and L3 configs into the switch ASIC and then propagates hardware counters back into the kernel. At least in theory, it should be perfectly possible to set up multiple bridges, VLANs, routed interfaces, and even VXLAN via VyOS’s UI and get hardware offloading with no changes to VyOS at all.
The reason that I’m posting this under “development” is that there are a handful of little things that need to change to actually get a good out-of-the-box experience with these switches (or really any switchdev switch, which includes some OpenWRT-ish ARM systems as well now). I’m willing to do most of the work on these, but I figure it’s better to discuss them here before I start in and get patches rejected.
Issues:
- Naming the switch ports as `eth0`…`ethN` isn’t great when (a) the physical port has a label on it and (b) the kernel knows what the label says for each port. It’s trivial to rename the interfaces via udev rules, but that rapidly runs afoul of VyOS’s name validators (here and elsewhere). For now, I’m renaming my ports as `en$attr{phys_port_name}`; since `phys_port_name` is `p1`…`pN`, that produces `enp1`…`enpN` interfaces which VyOS allows, but it doesn’t really feel great, even if it is more or less what `enp` interface names are supposed to mean.
- Splitting ports (QSFP→4x SFP) is more difficult. First, it really breaks port name validators, because splitting a QSFP named `enp1` 4 ways will produce `enp1s0`…`enp1s3`. Next, there’s no real mechanism that I can see in VyOS for managing port splits like this. The kernel has generic support via `devlink port split` (manpage), but I’m not sure what the right config in VyOS would even look like for that. Logically, you’d want something like `interface ethernet enp1 / port split 4`, but then that would make the kernel remove the `enp1` interface and replace it with `enp1s0` and friends. So then there wouldn’t actually be an `enp1` interface, but its config would be critical to the system’s operation. That feels wrong.
- There are also a number of generic features that the switch supports but VyOS doesn’t today, like PTP. From what I can see, adding `linuxptp` and configuring it just like any other PTP-supporting interface should work.
- Finally, there are minor issues around fan control and environment support, but these are all generic Linux issues. Manually creating `/etc/fancontrol` and running `systemctl enable fancontrol` handles most of them, but it’d be nice to have a UI for it.
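
To make the naming and split problem a bit more concrete, here’s a rough Python sketch of how the set of interface names changes once a port is split. It only models the names: the `splits` mapping is a made-up stand-in for whatever config node VyOS might eventually grow, and the `p1s0`-style sub-port names are assumed from what I see with `mlxsw`, so other switchdev drivers may differ.

```python
def ifname(phys_port_name, prefix="en"):
    """Mimic a udev rule of the form NAME="en$attr{phys_port_name}"."""
    return f"{prefix}{phys_port_name}"


def expected_interfaces(ports, splits, prefix="en"):
    """Return the interface names that should exist once splits are applied.

    ports:  phys_port_name values of the unsplit ports, e.g. ["p1", "p2"]
    splits: hypothetical config mapping a port to its split count, e.g. {"p1": 4}
    """
    names = []
    for port in ports:
        count = splits.get(port, 1)
        if count == 1:
            # Unsplit port: one interface, named after the front-panel label.
            names.append(ifname(port, prefix))
        else:
            # Split port: the parent interface disappears and is replaced
            # by one interface per sub-port (s0, s1, ...).
            names.extend(ifname(f"{port}s{i}", prefix) for i in range(count))
    return names


print(expected_interfaces(["p1", "p2"], {"p1": 4}))
# ['enp1s0', 'enp1s1', 'enp1s2', 'enp1s3', 'enp2']
```

Note that `enp1` is not in the result, which is exactly why hanging the split setting off `interface ethernet enp1` feels wrong: the config key would name an interface that no longer exists.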
So, I have a few questions:
- Does VyOS (in principle, at least) have an interface naming scheme that works for switch ports other than `ethN`? Trying to map `eth21` into a specific physical port on the front is much more difficult than having an interface name that includes the port name that’s silk-screened onto the switch. For people using these switches with generic Debian, I’ve seen people use `swpN`, while I’ve been using `enpN`. The `mlxsw` driver provides a name that starts with `p` today, but that’s probably not guaranteed with other switchdev devices. If I added `sw[0-9a-z]+` to the validation list and a single udev rule to map `mlxsw_spectrum` devices into `sw*`, would that be acceptable in principle?
- Does anyone have a suggestion on how `devlink port split` should be configured? IIRC the same interface should work for `mlxsw`, Intel E8xx-family NICs, and probably Mellanox ConnectX-8+ NICs. The issue (as I see it) is that when an interface is split, its name vanishes and is replaced by 2 (or more) new names. It looks like Juniper mostly puts this sort of config into `chassis fpc <slot>`, so that might be a precedent for putting it somewhere other than `interface`.
- Does anyone have any objections in principle to me adding config support for `fancontrol`?
- Does anyone have any issues with PTP? I’d need to pull in the Debian `linuxptp` package and figure out how to best map it into `interface` and elsewhere. The biggest issue with PTP is that the underlying config varies depending on which ports are involved and if they share a PTP Hardware Clock or not. Some multi-port NICs share a single PHC across all ports, while others have a PHC per port, so getting this right may be a bit tricky.
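
For the PHC question, this is roughly the grouping step I have in mind, as a minimal Python sketch. It assumes the PHC index for each port has already been collected somehow (it’s the number that `ethtool -T <iface>` reports as “PTP Hardware Clock”); the function name and the sample data are made up for illustration, and nothing here talks to real hardware.

```python
from collections import defaultdict


def group_by_phc(phc_index_by_port):
    """Group interface names by the PHC index they report.

    phc_index_by_port maps an interface name to its PTP Hardware Clock
    index, or None when the interface has no PHC at all.
    """
    groups = defaultdict(list)
    for port, index in sorted(phc_index_by_port.items()):
        if index is not None:
            groups[index].append(port)
    return dict(groups)


# Made-up sample: two switch ports sharing one PHC, one NIC port with its
# own PHC, and one port without hardware timestamping.
sample = {"enp1": 0, "enp2": 0, "eth0": 1, "eth1": None}
print(group_by_phc(sample))
# {0: ['enp1', 'enp2'], 1: ['eth0']}
```

Ports that land in the same group share a clock, so any generated PTP config would presumably need to treat each group as a unit rather than configuring every port independently.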