What makes VyOS drop packets when it shouldn't?

The other day, this popped into my head:

I make drawings of my deployments. This one was from a setup I tried a while back to have an old Ubiquiti gateway do something other than just gather dust. Internal routing happens closest to the traffic, i.e. it never leaves the hypervisor, or it happens in the switches, and my uplink is only 1Gbit/s; performance-wise, I had nothing to lose as long as I figured out how to disable NAT, which I did.

But soon after, I started getting dropped-traffic alerts from the routers around it; they ping their gateways, and the Ubiquiti device was failing to respond every few iterations. Normally they have some tolerance for this, but when the RTT is sub-millisecond on a perfect physical point-to-point link, there’s no tolerance anymore.

What does that have to do with VyOS? Well, it’s because time after time I’ve tried to deploy it, and it behaves much the same way: it performs exceptionally with minimal resources, almost the same as OpenWRT or Mikrotik, but unlike the latter two, it occasionally drops traffic, and so does the Ubiquiti. From what I understand, both Ubiquiti’s firmware and VyOS have the same roots: this ancient 1930s router called Vyatta or something. Y’know? The ones that had a little chamber for the wood to make the steam… I don’t know the logistics of it, but it makes sense because Vyatta sounds like a boat :nerd_face: it’s basically science. :point_up:t3::frowning: Point is, science aside, maybe they’re related in more than just the fork between the devs that brought them into the world.

Ubiquiti’s software is so locked down and poorly documented that it’s hard to venture into it far enough to misconfigure it, so lost echo replies were the only way it manifested. Of course, when you average them on a perfect link, that’s quickly misinterpreted as an unreachable gateway and then the traffic stops, but VyOS doesn’t have those guards/restrictions. So what would happen on VyOS is that, in addition to losing echo replies every 10-12 or so, traffic would also be dropped without any rules telling it to drop.

Given the permissive ruleset, there was only one thing that could block it, the DNS firewalls, but the names were being resolved correctly. It was VyOS that dropped the traffic on its own. The same generic rule that covers the rest of the traffic should cover this site and a couple more that were also blocked. With an old cellular USB modem, a Thunderbolt NIC, static routes, and Little Snitch for good measure, I confirmed DNS was fine by forcing queries through the local network while the rest of the traffic used the cellular connection.
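In case anyone wants to replicate that kind of split, it boils down to pinning a host route for the resolver while the default route stays on the cellular modem; a rough sketch (macOS/BSD route syntax; the interface name and resolver address are placeholders, not my real ones):

# Default route stays on the cellular connection; only the internal resolver
# is reached through the Thunderbolt NIC, so DNS queries stay on the LAN.
sudo route -n add -host 10.190.0.53 -interface en7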

It should work, and it works on other platforms, but then again, if two devices show the same behavior, it’s two against one, so it’s probably more likely that I did something wrong. But I’ve tried VyOS bare metal, in Hyper-V, in vSphere, in KVM, even Unraid at one point; using subinterfaces on an ethernet interface directly, and subinterfaces on a bridge with an ethernet interface as a port plus bridge VLAN filtering (DSA on VyOS); with paravirtualized, virtualized, and emulated NICs; with PCI passthrough ethernet and SFP+ cages, copper, twinax and active fiber cables; between VyOS instances over tunnels across the world and in the same hypervisor; with Hyper-V’s SET networking; with the old teaming thing you access only through Server Manager (see below); using DSA on the host, and macvtap, and SR-IOV functions, and even vSphere’s useless DirectPath I/O. At one point I was like, just fork me already and get on with it.

It happens on 1.4 and 1.5. I used earlier versions, but back then I was still too ignorant to make anything of it. I still have one of the VyOS VMs, the latest, which I think could still be a drop-in replacement for the network edge.

At some point I finally figured out how I could do a remote capture with Wireshark, not that it mattered much because my knowledge of it is rudimentary at best. I only know how to look for packets; I don’t know if what I’m seeing is congestion or if there’s a movie playing where the frames got so slow that the lines flash by with a nostalgic CRT filter. It’s definitely a lot of red and sometimes black, which is not good because I think those are retransmissions, or TCP RSTs, but it might as well be a bloody scene of the film because I’d know just as well how to proceed. (Historically, that’d be to lick the wound. :joy:)

Is there some word-of-mouth, “in-the-know” thing that must be configured on VyOS, or something like that?? Y’know…like they did on Windows and Microsoft TechNet when that was a thing. They’d tell you, matter-of-factly, something like:

“Everybody” knows you must connect through another interface if you’re going to change an interface’s address on Windows, because on the first click of OK it loses the gateway and disconnects you.

As if it made all the sense in the world. Or as some rite of passage, I don’t know, I’m rambling. But if there is, I’ll do it. I’ll do the ritual, in a kimono if necessary. I really want to make it work. I don’t know where I’m going to get a kimono though, and I’ll definitely need a katana… :pensive:

Thanks! Teriyaki :man_bowing:t3:

This is a wonderful bit of prose, very flowing and full of all sorts of interesting little tidbits.

However, as a bug report, it’s one of the most useless things I’ve ever read (sorry to be so blunt). That said, I’m not even sure after reading it a few times whether it’s a bug report or just a story.

  1. What’s the actual problem? I assume from the subject you’re seeing packet loss.
  2. Where is the packet loss? It’s clear from your words you’ve got an understanding of whatever topologies you’ve tried, but that doesn’t translate into something clear/concise.

I would kindly suggest you start again with your bug report. Here are a few things to think about when you post:

  1. Could you make a simple diagram that shows where the packet loss is?
  2. How much packet loss is there?
  3. What are the step(s) you use to reproduce it?
  4. Is the hardware physical? Virtual?
  5. Have you ruled out any switching infrastructure in between?
  6. What does “show interfaces ethernet <interface> detail” show you? Are any dropped packets listed there? (See the command sketch after this list.)
  7. Is there anything in the logs? Does “sudo dmesg” give any indications?
  8. Have you tried turning on/off any of the Ethernet offloads (assuming a bare-metal box)?
  9. Do you have any QoS configured?
  10. Is the loss on your local LAN/WAN etc?
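For 6, 7 and 8, something along these lines from the VyOS CLI is enough (eth0 is just an example interface name; toggle the offloads one at a time):

show interfaces ethernet eth0 detail
sudo dmesg | tail -n 50
configure
delete interfaces ethernet eth0 offload tso
commit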

I hope this doesn’t discourage you - that’s not my intention. But please keep reports/questions clear and concise - It can be hard enough to keep someone else’s network topology in your head to try and think through, let alone having to read a massive wall of text to try and even figure out what that topology might be.

@tjh

20240705F@≈AFTERNOON

No problem. If I’m overly verbose it’s because I know I lack the knowledge to explain or identify a number of things, so I hope whatever I fail to mention (because I thought it wasn’t important) can be inferred from context. I’m not intentionally trying to waste anybody’s time. I’ll try to yap less, though it’s kinda complicated when details are key. For what it’s worth, these posts take a long time, lots of edits, and a good 80% of the time never get sent at all.

Back to the problem: since last time I had already moved on, and deleted the virtual machine too. {answer#4…✓. 9+}

Fortunately, I had made it a habit of saving the output of show configuration commands, especially when about to jump ship. It shouldn’t take long to recreate the VM and fit it right where it was, even faster if I’d turned on vCenter to use VM templates, but I’m taking another direction this time; it’s not nonsense, I promise. VyOS sits at the only point in the network where there is no redundancy, so it is easier to pinpoint when something in that spot is the issue. {a#5…✓.}
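(If anyone else picks up that habit: what gets saved is just the output of the line below; strip-private makes it safe to paste into posts.)

show configuration commands | strip-private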

I’m still not done but I think I’ve collected enough info to start and keep posting during the day.

{speaking of updates…}

TLDR

I found the problem. I’m still leaving everything in for context; this was meant to be quick updates, until I forgot to post several in a row and it turned into sort of a log format. Just scroll way down.

Network’s Own Burden on the Firewall

VyOS connects the public and private networks; it’s the edge. It is not needed for anything on the intranet unless the Internet is involved; the network is otherwise autonomous: DHCP, DNS, routing, NTP, reverse proxies, internal NAT, all that, all there.

The interfaces are 10Gbit/s adapters; at most a tenth of that bandwidth can ever be utilized, because that’s what my ISP offers, a single-gig uplink, and the box has no other bandwidth-intensive task. I know bandwidth isn’t the sole metric to worry about, but at a tenth of utilization, even if the card were a black-market counterfeit like a fake iPhone, I’m confident there must be resources to spare in the adapter, MBUFs or whatever they are.

These aren’t fake iPhones though; they are server-grade cards I got for these tests. They must be, or vSphere wouldn’t recognize them.

I’ve pinpointed how the problem manifests: duckduckgo.com won’t load, among other sites. That is what I tested on five different firewalls. Same everything, so none of them would have an advantage over the others.

Using the same VM, I’d only swap out the disk.

Access to the physical NIC was exclusive to the VM. In virtualized form, the NIC was put in its own virtual switch with a single trunk port group, connecting only that one VM. The other methods I used to present the adapter were SR-IOV and PCI passthrough.

Topology

Directly Physically Attached Network, Interfaces

              eth0
                │     ┌──────┐                      
               br0 ┬──┤ VyOS ├──┬ eth1              
           vif190 ┬┘  └──────┘  └┬ vif881           
 (of:/24) 1xIPv4 ─┘              └┬ pppoe0          
                                  └─ 1xIPv4 (of:/32) 

Directly Attached Networks, Logical

                     ┌──────┐  
                     │      ├ ─ 1x 6in4 Hurricane Electric IPv6/64(128) transit        
                     │ VyOS │      (L3tun for routed IPv6/48 block)
                     │      ├ ─ 1x OSPF-ECMP link: 2xIPv4/31 /L3tuns
1x IPv4/24 subnet  ──┤      ├── 1x PPP IPv4/32, IPv6 in PPP disabled
1x IPv6/120 subnet   └──────┘  

             ┌────────────────────────┐
             │No hosts (other than    │
             │routers) can be directly│
             │reached by the router.  │
             └────────────────────────┘

or

JSON… it is JSON, right? Whatever it is.
 bridge br0 {
     enable-vlan
     member {
         interface eth0 {
             allowed-vlan 1-4094
             native-vlan 1
         }
     }
     vif 9 {
         address 2001:470:8085:900::1/120
     }
     vif 190 {
         address 10.190.0.1/24
         description zbe00
     }
 }
 ethernet eth0 {
     hw-id 00:50:56:be:00:01
     offload {
         gro
         gso
         sg
         tso
     }
 }
 ethernet eth1 {
     hw-id 00:50:56:bf:ee:01
     offload {
         gro
         gso
         sg
         tso
     }
# C  vif 191 {
# U      address dhcp
# R      dhcp-options {
# R          client-id routelogic
# E          host-name routelogic
# N      }
# T  }
     vif 881 {
     }
 }
 pppoe pppoe0 {
     authentication {
         password "somepassword"
         username "tmx34577355743@prodigyorsomething"
     }
     no-peer-dns
     source-interface "eth1.881"
 }
 loopback lo {
 }
 tunnel tun0 {
     address 2001:db8:005:ab1::2/64
     description wzHE
     encapsulation sit
     remote 72.52.104.74
     source-address 192.168.191.2
 }
 wireguard wg1 {
     address 192.168.91.1/31
     peer cloudfront1 {
         address X.X.X.X
         allowed-ips 0.0.0.0/0
         persistent-keepalive 12
         port 4491
         public-key XXX
     }
     private-key XXX
 }
 wireguard wg2 {
     address 192.168.92.1/31
     peer cloudfront2 {
         address X.X.X.X
         allowed-ips 0.0.0.0/0
         persistent-keepalive 8
         port 4492
         public-key XXX
     }
     private-key XXX
 }

20240706S@≈NOON

I forgot to send the draft I had written, got sidetracked all night making diagrams. I’m starting testing. Check back soon.

20240707U@≈EVENING

Forgot again to post the thing. I see zero improvements, but I thought of one last thing to try, something I had tried elsewhere for other reasons, but I’m too sleep deprived.

I’ll just [loosely] timestamp this like a log of someone lost at sea — VyOS does sound like something maritime — and send it all at once when I’m done. I’m starting to grow a beard so it’s fitting. :slight_smile:

{No longer important, NEEDS CLEANUP B4 POST}

1 = 2 …and 3, and 4, and 5 (Results)

The only differences between platforms were:

  • Mikrotik’s CHR can only use IDE disk controllers and boots BIOS only.
  • OpenWRT is deployed using a temporary VM and can boot in either BIOS or EFI. Like VyOS, it fully supports VMware’s paravirtual SCSI controller, so that VM would become VyOS next.
  • VyOS’ installer can overwrite the disk and needs zero changes coming from OpenWRT before it.
  • The BSDs — at the forefront of ZFS development — have so. many. disk. related. issues. it’s not funny. No wonder they had to adopt such a resilient storage format. I knocked the SCSI controller down to an old one and they were back in business. Switching between them is like switching back and forth between OpenWRT and VyOS: no changes are needed on the VM at all.

All the changes related to storage, none to networking, so this seems largely inconsequential/irrelevant, but I’m just being thorough.

Overall, they all yielded the same results they would under standard conditions.

That said, none of the other platforms tested have this traffic-dropping problem, neither under standard nor under these controlled-ish conditions, but whatever problems of their own they may have remained. I took that as a sign of consistency and more or less as validation of the tests, as unscientific as this all may be.

Examples/Observations

The senses, OPNsense and pfSense, can route the full gigabit/s uplink if something else handles NAT before traffic passes through them, and as long as their ruleset is limited to a single catch-all rule or something just as complicated (or lack thereof).

This information was key to fix the problem.

The minute NAT or a standard interface-based (rather than host-based) ruleset is added, performance tanks. It’s reduced to about half of the total bandwidth. It can be tuned and tweaked obsessively, and the VM can be made as big as a SharePoint farm, but the improvements are barely noticeable.

Considering that:

  • CPU doesn’t spike as reported by the guest OS.
  • CPU doesn’t spike as reported by the hypervisor.
  • These machines were set up to run in RAM with no logging, to eliminate storage as the bottleneck.
  • I migrated the VM to a host with a much more powerful CPU, keeping the same ridiculously large amount of cores and memory.

I got the same results each time. I thought it was a hardware issue, but if it were, it should have been corrected under the most favorable conditions these OSes have ever seen: no longer >20 interfaces, only a couple plus some basic 1-to-1 tunnels.

On the other hand, the Linux-based routers, if this were a competition, would have completely obliterated the senses. They all exceeded 1G. I have no idea how, because the interface (on the ONT) is supposed to be 1G max. Maybe overhead calculations, or the bursty nature of it, since it was only by .1 and only briefly before stabilizing at 1G, IDK. I do know it suggests the processor and networking hardware can handle a single gigabit per second of data transfer, no surprise there.
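For anyone wanting to reproduce these numbers, nothing fancier than iperf3 between hosts on either side of the firewall is needed; a sketch (the server address is a placeholder):

# On a host beyond the router under test:
iperf3 -s
# On a host behind it, pushing traffic through the router:
iperf3 -c 192.0.2.10 -P 4 -t 30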

All Linux routers were zippy and frugal in resource-consumption, as usual.

In OpenWRT, I’ve noticed in the past that it occasionally leaks (allows) traffic through the firewall that is explicitly set to be rejected. This happened during testing too… eventually. Normally it takes a long time before it manifests, but I got “lucky” this time.

And then there’s VyOS. What it does, or rather what it doesn’t do, is load sites. But it’s specific to only some sites; others load fine.

duckduckgo.com in particular got my attention. I know it has some kind of relationship with Microsoft, which I block at the ASN level and on DNS, so I didn’t think much of it at first, until I snapped out of it and remembered that the ASN-based layer 3 filter is not set up on VyOS; only the DNS filter is available there. I checked my resolvers to see if there was a block for duckduckgo.com, but there wasn’t.
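For completeness, checking for a resolver block is just a direct query, something like this (the resolver address is a placeholder); a normal A/AAAA answer, rather than NXDOMAIN or 0.0.0.0, means no DNS block:

dig +short duckduckgo.com @10.190.0.53
dig +short AAAA duckduckgo.com @10.190.0.53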

As ignorant as I am of Wireshark, I took a capture, and at the same time started another from Little Snitch. I managed to trim the one from Wireshark down to the relevant packets, but Little Snitch takes its data from the Berkeley Packet Filter, which as I understand it hooks in very early, up in the upper layers, before the data gets a real network address, so the addresses are all fake, and it was a little too much to clean up. If you’re in the mood though: the destination, duckduckgo.com, is the one thing that’s correct. It’s the only one that starts with fifty-something. The other capture is only about 10 packets. Here are both:

Little Snitch: https://fetch.vitanetworks.link/static/embed/vyos/firefox-little-snitch.pcap
Wireshark: https://fetch.vitanetworks.link/static/embed/vyos/noise-trimmed.pcapng
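(If anyone wants to do the same kind of trimming: a display filter rewritten out to a new file does it, roughly like this, with the destination left as a placeholder.)

tshark -r full-capture.pcapng -Y "ip.addr == <destination-ip>" -w noise-trimmed.pcapng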

interfaces ethernet detail

{6.}

Taken while the PPPoE interface still existed; the problem was occurring at that moment.

vyos@vyos# run show interfaces ethernet detail
eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br0 state UP group default qlen 1000
    link/ether 00:50:56:be:00:01 brd ff:ff:ff:ff:ff:ff
    altname enp11s0
    altname ens192

    RX:       bytes  packets  errors  dropped  overrun       mcast
         1510980452  3153534       0      455        0      476201
    TX:       bytes  packets  errors  dropped  carrier  collisions
         1236465533   947764       0        0        0           0
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:bf:ee:01 brd ff:ff:ff:ff:ff:ff
    altname enp19s0
    altname ens224
    inet6 fe80::250:56ff:febf:ee01/64 scope link
       valid_lft forever preferred_lft forever

    RX:       bytes  packets  errors  dropped  overrun       mcast
         1260670287  1051369       0      302        0         985
    TX:       bytes  packets  errors  dropped  carrier  collisions
           85748296   737956       0        0        0           0
eth1.881@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:50:56:bf:ee:01 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::250:56ff:febf:ee01/64 scope link
       valid_lft forever preferred_lft forever

    RX:       bytes  packets  errors  dropped  overrun       mcast
         1242138952   999217       0        0        0          55
    TX:       bytes  packets  errors  dropped  carrier  collisions
           85747470   737949       0        0        0           0

MAC addresses do not correspond to anything physical…or virtual. vCenter, where the range is from, hasn’t been booted up in a while, but they’re already mapped in DHCP, so…yeah.

20240708M@≈DAWN

Found more non-working domains: help.apple.com, hub.libreoffice.org; there’s another, but I might get banned or at least reprimanded if I mention it, I’ve had a few words censored already. =P
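A quick way to keep checking them without a browser is a loop like this, run from a host behind VyOS (the domain list is just the ones found so far):

for d in duckduckgo.com help.apple.com hub.libreoffice.org; do
  curl -4 -sI --max-time 5 "https://$d" > /dev/null && echo "$d ok" || echo "$d FAIL"
done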

20240708M@≈AFTERNOON

OMFG, it worked! Pursuing the hunch paid off. I’ve been wanting to switch to VyOS for around three years now. At the time of writing this sentence, though, the document is nine pages long. I’m done testing, but this might take another day.

I’m also just realizing the VyOS image I have is the same one I had at the original post. Six months old.

# run show system image
Name                      Default boot    Running
------------------------  --------------  ---------
1.5-rolling-202401030023  Yes             Yes
$ show system image
Name                      Default boot    Running
------------------------  --------------  ---------
1.5-rolling-202407090020  Yes             Yes
1.5-rolling-202401030023

Ironically, what started all this, a UniFi gateway, is the only place where I got better results. It’s the one place where the hardware cannot be improved, but thanks to VyOS, I unexpectedly learned how to configure UniFi gateways from scratch without their Controller nonsense, and when I realized that, I went and did just that.

I took this UniFi gateway, the first, cheapest, now-discontinued model, rocking half a gig of RAM and two 500MHz processor cores, and with my newfound expertise I got it routing and NATting at full line speed, and doing dynamic routing, which to date isn’t supported by UI, formerly Ubiquiti Networks, on even their mightiest models. And it no longer drops traffic.

But somehow, using their curated Controller experience, I can’t get those results or half the functionality, which I was only able to unlock because of VyOS. If the software weren’t so outdated, I’d very much stick with it.

20240708M@≈EVENING

Major eff-up in the update. This is never going to end. :frowning:

20240710W@≈NOON

On page 13. Outlook on the Web, AKA OWA, AKA Exchange’s webmail, is now available from outside the network, complete with the back-and-forth needed to log in through ADFS and pass through all the proxies, several layers deep in the infrastructure. It’s basically a success. My brain, not so much; it’s a bit scattered, but I want to finish already, so here’s what I found:

Back on OpenWRT, one of its quirks is that it sees itself on the network and prints that to the console over and over. I could use macvlans to work around it, but I’m already doing a switch, then sub-interface this and that, because of PPPoE. What if I move it, though?

…NAT it elsewhere and pass it down to OpenWRT…

I knew it had to be a Linux-based router; they’re the lightest and fastest. OpenWRT is out of the question because it would need two interfaces and the problem would repeat itself. Mikrotik’s CHR is good, but I have only one license and it’s already busy; it would have to do more to justify the move, and its current task is way too important to abandon. VyOS mysteriously drops traffic.

What about UniFi? I thought, quickly followed by remembering that it doesn’t do full-cone NAT. That would be an issue. Just as fast, I also remembered that the one place where I need full-cone NAT is my static addresses, which are routed in over a tunnel that is already full-cone NATted and can bypass the UniFi gateway. And so I solved that issue. When I started these tests this week, the UniFi gateway still had that config.

I reconfigured VyOS with static addressing on the WAN, gateway and all that, went to duckduckgo.com, and it loaded. Finally!
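For the record, the change boiled down to something like this in config mode (the addresses are placeholders from the documentation ranges, and whether the static address lands on the same VLAN the PPPoE session used depends on the setup):

delete interfaces pppoe pppoe0
set interfaces ethernet eth1 vif 881 address '203.0.113.2/30'
set protocols static route 0.0.0.0/0 next-hop '203.0.113.1'
commit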

There’s dynamic routing on the network; what if I had missed something?

I double-checked; all good. I reverted the configuration to make sure I wasn’t imagining things and to confirm it would both fail again and succeed again. It did, and it did.

So there it is.

NAT was the culprit, though now that I think about it, it could also be PPPoE; those are the two things I abstracted away from VyOS.
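If PPPoE does turn out to be the culprit, the classic cause of “some sites load, others don’t” is MTU/MSS. I haven’t verified this on my setup, but the knob would be something along these lines (syntax from memory, so double-check the docs; 1452 assumes the usual 1492-byte PPPoE MTU minus 40 bytes of IPv4 and TCP headers):

set interfaces pppoe pppoe0 ip adjust-mss '1452'
commit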

I’ve updated the system since then, but I don’t feel like poking the bear right now; I just want to sleep.