Is this QoS policy a valid workaround?

Hello,

VyOS 1.3.1, dual WAN with failover and SNAT, zone based firewalls, IPSEC VPN.

In attempting to learn how to properly set up QoS, I read the support article and forum posts and opted for a shaper policy (with classes) for outbound and a shaper policy attached to an IFB for inbound. Feeling accomplished, I ran some tests, but everything seemed to be stuck in the default queue. After much self-doubt, bashing of head against wall, and bleary-eyed reading (this was the 3rd 14-hour day in a row), I found a forum post that said outbound classes can’t match the source IP when SNAT is used; they can only match the translated address on the outbound interface. A workaround was to mark the packets and use the mark as a class matcher. Relief flooded me, as it appeared logical and so must work. I gave it a go and alas the result was the same: the default queue was always used. Here I said a prayer, since I was at the end of my tether, and stepped away for 30 minutes. Upon my return I found a forum post indicating that marking packets wouldn’t work as a class match if dual WAN failover was in use (which it was), due to the order in which things are mangled (whatever that means). So I removed the failover config and the QoS classes started working as expected. Great, but I need dual WAN failover…

In the future my standard will be to have a LAN gateway where all internal zones are firewalled and routed, with one external interface and no NAT; QoS for Internet-bound traffic will be applied to that external interface. A WAN gateway beyond it will handle dual WAN failover and SNAT, with no QoS.

Now that the background is out of the way, to the original question. Since we have only 1 router and I need SNAT, WAN failover and QoS, I’ve opted to use just one queue for WAN outbound and to use a shaper for LAN outbound instead of WAN-IN via IFB. I’d like to know if this is a valid way to implement this, or if there is a more acceptable method.

2 WAN interfaces on ETH0 and ETH1 (600Mbps down and 35Mbps up on primary WAN)
bond0 with 8 vif on ETH2-5 (4x 1Gbps)

Since bond0 has a total of 4Gbps, I’m using this as the policy bandwidth and default ceiling so as not to limit traffic between vifs. The ceiling for each class is then the WAN download rate divided by the bond0 rate, expressed as a percentage (600Mbps / 4000Mbps = 15%).
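As a quick check of that arithmetic, here is a minimal shell sketch. The function name `ceiling_percent` is made up for illustration and is not a VyOS command; it uses integer division, so the result is truncated.

```shell
# Percentage of the shaper bandwidth that corresponds to the WAN download rate.
# $1 = WAN download rate, $2 = policy bandwidth, both in Mbit/s (integers).
ceiling_percent() {
    echo $(( $1 * 100 / $2 ))
}

# 600 Mbit/s WAN download on a 4000 Mbit/s bond0 policy
ceiling_percent 600 4000   # prints 15
```

This matches the 15% ceiling used in the class definitions.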

Some of our vendors require a guaranteed amount of bandwidth for their subnet which is why we need to implement a shaping policy for WAN-IN (which is LAN-OUT).

A network group was created with the RFC1918 IP addresses, and this was used to mark packets arriving on the WAN whose source is not RFC1918:

set policy route NOT-RFC1918-IP rule 10 description "Mark packets NOT from RFC1918-IP"
set policy route NOT-RFC1918-IP rule 10 source group network-group !RFC1918-IP
set policy route NOT-RFC1918-IP rule 10 set mark 19217210
set interfaces ethernet eth0 policy route NOT-RFC1918-IP

It’s interesting that using “!” doesn’t show when using tab to see the autocomplete possibilities when creating the policy route, but it accepted it and it seems to work.

Then the LAN-OUT shaper policy was created. Each class matches on two things together: the mark set on the WAN by policy route NOT-RFC1918-IP, and a particular destination subnet. The mark acts as a flag for packets that came in over the WAN interface and are heading to the internal subnets.

set traffic-policy shaper LAN-OUT bandwidth '4000mbit'
set traffic-policy shaper LAN-OUT class 10 bandwidth '1%'
set traffic-policy shaper LAN-OUT class 10 ceiling '15%'
set traffic-policy shaper LAN-OUT class 10 match CA_LAN2-NET ip destination address '10.254.167.0/24'
set traffic-policy shaper LAN-OUT class 10 match NOT-RFC1918-IP mark 19217210
set traffic-policy shaper LAN-OUT class 10 queue-type 'fq-codel'
set traffic-policy shaper LAN-OUT class 10 queue-limit 1000
set traffic-policy shaper LAN-OUT class 10 priority '1'

set traffic-policy shaper LAN-OUT class 20 bandwidth '4%'
set traffic-policy shaper LAN-OUT class 20 ceiling '15%'
set traffic-policy shaper LAN-OUT class 20 match TESLA_LAN-NET ip destination address '10.254.158.0/24'
set traffic-policy shaper LAN-OUT class 20 match NOT-RFC1918-IP mark 19217210
set traffic-policy shaper LAN-OUT class 20 queue-type 'fq-codel'
set traffic-policy shaper LAN-OUT class 20 queue-limit 1000
set traffic-policy shaper LAN-OUT class 20 priority '3'

set traffic-policy shaper LAN-OUT class 30 bandwidth '1%'
set traffic-policy shaper LAN-OUT class 30 ceiling '15%'
set traffic-policy shaper LAN-OUT class 30 match CA_LAN-NET ip destination address '10.254.157.0/24'
set traffic-policy shaper LAN-OUT class 30 match NOT-RFC1918-IP mark 19217210
set traffic-policy shaper LAN-OUT class 30 burst 128k
set traffic-policy shaper LAN-OUT class 30 queue-type 'fq-codel'
set traffic-policy shaper LAN-OUT class 30 queue-limit 1000
set traffic-policy shaper LAN-OUT class 30 priority '5'

set traffic-policy shaper LAN-OUT default bandwidth '25%'
set traffic-policy shaper LAN-OUT default ceiling '100%'
set traffic-policy shaper LAN-OUT default queue-type 'drop-tail'
set traffic-policy shaper LAN-OUT default burst 128k
set traffic-policy shaper LAN-OUT default priority '7'

set interfaces bonding bond0 vif 20 traffic-policy out LAN-OUT
set interfaces bonding bond0 vif 180 traffic-policy out LAN-OUT
set interfaces bonding bond0 vif 160 traffic-policy out LAN-OUT
...

Drop tail was used for the default policy since this should be mostly traffic between internal VLANs and I didn’t want traffic to be limited due to CPU resources.

For WAN-OUT I was unable to come up with a workaround that would allow me to move the policy to the internal interfaces (that is, have it match a packet that comes from a particular subnet and is destined for a non-RFC1918 address). So I went with a single queue with no classes.

set traffic-policy shaper WAN0-OUT bandwidth '35mbit'
set traffic-policy shaper WAN0-OUT default bandwidth '25%'
set traffic-policy shaper WAN0-OUT default ceiling '100%'
set traffic-policy shaper WAN0-OUT default queue-type 'fq-codel'
set traffic-policy shaper WAN0-OUT default priority '7'

set interfaces ethernet eth0 traffic-policy out WAN0-OUT

The main goal here being that the VOIP traffic will fare well when the outbound connection is saturated.

Is the above a valid solution considering the WAN failover and SNAT?

Thank you,

Jacob

Another Vyatta descendant, EdgeOS, handles packet marking more smartly.
Users can only use the 8 LSBs of the 32-bit mark, so user marks and LB marks can coexist.
(In iptables rules you can mask the mark bits: setting a mark to value x with mask 0x000000FF will only touch the 8 LSBs.)
For the WAN-Out policy, you might also be able to set a DSCP value and use that for the QoS policy. (Set it to values that make sense, as this alteration is sent out on the wire.)
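To illustrate the masking idea, here is a small sketch in shell arithmetic. It emulates the documented behaviour of the iptables `MARK` target's `--set-mark value/mask` option (zero the bits given by the mask, then OR in the value). The function name `set_mark` and the example mark values are hypothetical, and whether the VyOS load balancer actually keeps its marks in the upper bits is an assumption.

```shell
# Emulate "-j MARK --set-mark VALUE/MASK" on an existing packet mark:
# clear the bits selected by MASK, then OR in VALUE.
# $1 = existing mark, $2 = value to set, $3 = mask
set_mark() {
    printf '0x%X\n' $(( ($1 & ~$3) | $2 ))
}

# Hypothetical existing mark 0x100 (an upper bit standing in for an LB mark),
# user value 0x2A written through an 8-bit mask: the upper bit survives.
set_mark 0x100 0x2A 0xFF   # prints 0x12A
```

A class match would then compare only the low 8 bits, leaving the other bits free for whatever else sets marks.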

Applying LAN-Out 3 times makes no sense.
This will give you 3 separate trees,
with no sharing/borrowing of BW between classes.
Alternatively, apply the QoS policy to the parent interface (bond0), so a single policy applies to all VLANs.

I’m not sure VyOS can handle nested trees, but you might want something like:


Bond0  (parent)   BW 10G
   IntraLAN   BW 10G
   From WAN1   20Mb/s
      class voip
      class data
      class bulk
   From WAN2   40Mb/s
      class voip
      class data
      class bulk

Thank you 16again, I didn’t realize applying QoS to a parent interface would also apply to all attached VIFs, that’s much easier to manage. I was using the mentality of firewall zones where rules applied to zone BOND0 for instance wouldn’t apply to zone BOND0.VIF20. I’ll make the changes and will report back.

I’ve removed the policy from applying to the VIFs of bond0 and applied it to bond0 only. It does indeed apply to all child interfaces. Now I’m having problems with getting matching to work as expected.

With matches such as:

set traffic-policy shaper LAN-OUT class 10 match CA_LAN2-NET ip destination address '10.254.167.0/24'
set traffic-policy shaper LAN-OUT class 10 match NOT-RFC1918-IP mark 19217210

any WAN traffic destined for any subnet would get matched as class 10, no matter how many other classes there were (with subnets being the other matching factor). By entering the match as in the above example, am I saying to match either the mark or the subnet? I want it to match both in order to assign the class.

So I tried:

set traffic-policy shaper LAN-OUT class 10 match CA_LAN2-NET vif 80
set traffic-policy shaper LAN-OUT class 10 match CA_LAN2-NET mark 19217210

thinking that this would mean to match both, but class 10 is still applied to all traffic that came from the WAN (I had to change from an ip match to vif since protocol matches can’t be combined with marks).

If I run an internal iperf test between local subnets then the correct class is selected. I’m guessing this is because there is no packet with the “WAN” mark that gets matched by the other classes first, so it goes through them all and lands on the right one based on the other matching item (subnet or vif).

One other thing that I found odd was that limiting the bandwidth using % or mbps wouldn’t yield the expected result. For example, if I put in 2mbps as a ceiling a speed test would show 16Mbps, whereas if I put in 2mbit the speed test would show 1.6Mbps. Is this a quirk or am I missing something?
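A likely explanation, assuming VyOS hands the unit suffix through to tc unchanged: in tc's unit notation `mbit` means megabits per second, while `mbps` means megabytes per second, so `2mbps` is 8 times larger than `2mbit`. A tiny sketch of the conversion (the function name is made up for illustration):

```shell
# Convert a tc-style "<N>mbps" rate (megabytes per second)
# into megabits per second, the unit speed tests report.
# $1 = rate in megabytes per second (integer)
mbps_to_mbit() {
    echo $(( $1 * 8 ))
}

mbps_to_mbit 2   # prints 16
```

That lines up with the 16Mbps the speed test showed for a `2mbps` ceiling.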

Thank you!

After further testing it seems that:

match ABC option a
match XYZ option b

means that it will apply the class if it matches either.

Then:

match ABC option a
match ABC option b

means that both have to be matched for the class to apply.

This appears to be true so far for everything except when a mark is matched. If a mark is present as a match then it doesn’t matter if something else has to match in order for the class to apply, it will apply regardless. Is this expected?

In tc terms,
matches within a single filter are a logical AND.
If you want an OR function, use 2 separate filters, both pointing to the same class.
To see how the VyOS config has been translated into tc commands, use the tc show commands.
If you open a 2nd ssh session running
sudo tc monitor
and commit a VyOS QoS change in the other session, you’ll see how the config translates into tc commands.
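For reference, the show commands in question look roughly like this. The interface name `bond0` is taken from the thread; adjust it to wherever the policy is attached.

```shell
# Queueing disciplines on the interface, with packet/drop statistics
sudo tc -s qdisc show dev bond0

# HTB classes with per-class counters (shows which class traffic lands in)
sudo tc -s class show dev bond0

# Filters that steer packets into the classes
sudo tc filter show dev bond0
```

Watching the per-class byte counters while running a test is a quick way to confirm which class is actually being hit.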

Thank you again for your help. After running your commands it appears that if there is a “mark” in the match then the other match variables are ignored.

Committing:

set traffic-policy shaper LAN-OUT class 10 bandwidth 4mbit
set traffic-policy shaper LAN-OUT class 10 ceiling 5mbit
set traffic-policy shaper LAN-OUT class 10 match CA_LAN2-NET vif 80
set traffic-policy shaper LAN-OUT class 10 match CA_LAN2-NET mark 19217210
set traffic-policy shaper LAN-OUT class 10 queue-type fq-codel
set traffic-policy shaper LAN-OUT class 10 priority 1

yields:

class htb 1:a dev bond0 parent 1:1 prio 1 rate 4Mbit ceil 5Mbit burst 15Kb cburst 1600b 
qdisc fq_codel 800b: dev bond0 parent 1:a limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb 
added chain dev bond0 parent 1: chain 0 
added filter dev bond0 parent 1: protocol all pref 1 basic chain 0 handle 0x1 flowid 1:a 
  meta(nf_mark eq 19217210)

Notice there is no match for vif 80.

Committing:

set traffic-policy shaper LAN-OUT class 40 bandwidth 20mbit
set traffic-policy shaper LAN-OUT class 40 ceiling 21mbit
set traffic-policy shaper LAN-OUT class 40 match CA_LAN-NET ip destination address 10.254.157.0/24
set traffic-policy shaper LAN-OUT class 40 match CA_LAN-NET ip source address 172.16.18.0/24
set traffic-policy shaper LAN-OUT class 40 queue-type fq-codel
set traffic-policy shaper LAN-OUT class 40 priority 5

yields:

class htb 1:28 dev bond0 parent 1:1 prio 5 rate 20Mbit ceil 21Mbit burst 15Kb cburst 1596b 
qdisc fq_codel 8017: dev bond0 parent 1:28 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb 
added filter dev bond0 parent 1: protocol all pref 3 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:28 not_in_hw 
  match ac101200/ffffff00 at 12
  match 0afe9d00/ffffff00 at 16

Notice the 2 matches; here they are a source and a destination IP, not a mark.
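The hex keys in that u32 output can be decoded by hand. In the IPv4 header, offset 12 is the source address and offset 16 is the destination address. A small helper (the function name is hypothetical) to turn a key into dotted-quad form:

```shell
# Turn an 8-digit hex key from "tc filter show" into a dotted-quad address.
# $1 = hex key without the 0x prefix, e.g. ac101200
hex_to_ip() {
    n=$(( 0x$1 ))
    printf '%d.%d.%d.%d\n' \
        $(( (n >> 24) & 255 )) $(( (n >> 16) & 255 )) \
        $(( (n >> 8) & 255 )) $(( n & 255 ))
}

hex_to_ip ac101200   # prints 172.16.18.0  (key at offset 12, source)
hex_to_ip 0afe9d00   # prints 10.254.157.0 (key at offset 16, destination)
```

With the ffffff00 mask these are the 172.16.18.0/24 source and 10.254.157.0/24 destination from the class 40 config.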

Now if I delete the “mark” match from class 10 watch what happens:

delete traffic-policy shaper LAN-OUT class 10 match CA_LAN2-NET mark

yields:

class htb 1:a dev bond0 parent 1:1 prio 1 rate 4Mbit ceil 5Mbit burst 15Kb cburst 1600b 
qdisc fq_codel 8019: dev bond0 parent 1:a limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb 
added chain dev bond0 parent 1: chain 0 
added filter dev bond0 parent 1: protocol all pref 1 basic chain 0 handle 0x1 flowid 1:a 
  meta(vlan mask 0x00000fff eq 80)

Now the missing vif 80 shows up once the mark is deleted…

I also tried doing it with an interface match and a mark and it did the same thing:

set traffic-policy shaper LAN-OUT class 10 bandwidth 4mbit
set traffic-policy shaper LAN-OUT class 10 ceiling 5mbit
set traffic-policy shaper LAN-OUT class 10 match CA_LAN2-NET interface bond0.80
set traffic-policy shaper LAN-OUT class 10 match CA_LAN2-NET mark 19217210
set traffic-policy shaper LAN-OUT class 10 queue-type fq-codel
set traffic-policy shaper LAN-OUT class 10 priority 1

results:

class htb 1:a dev bond0 parent 1:1 prio 1 rate 4Mbit ceil 5Mbit burst 15Kb cburst 1600b 
qdisc fq_codel 8020: dev bond0 parent 1:a limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb 
added chain dev bond0 parent 1: chain 0 
added filter dev bond0 parent 1: protocol all pref 1 basic chain 0 handle 0x1 flowid 1:a 
  meta(nf_mark eq 19217210)

It shows the mark only… And here is the show config:

show traffic-policy shaper LAN-OUT class 10
 bandwidth 4mbit
 ceiling 5mbit
 match CA_LAN2-NET {
     interface bond0.80
     mark 19217210
 }
 priority 1
 queue-type fq-codel

So do you know if this is a known issue, the wrong way to do this, a bug or a trick to teach perpetual newbies like me something :slight_smile:

This makes me wonder whether a u32 filter can be ANDed with other filter types. In your examples, ANDing only works with multiple u32 matches.
As you don’t see any error, maybe the mark match is applied last, overwriting the previous match.
That can be tested by building the tc tree entirely by hand, step by step.
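One hand-built test along those lines, as an untested sketch: tc's `basic` classifier accepts ematch expressions joined with `and`, so the two `meta(...)` terms that VyOS generated separately in the output above could be combined in a single filter. The device, handles and values are copied from the thread; that this behaves as intended on VyOS 1.3.1 is an assumption, and the VyOS-generated filter for class 10 would need to be removed first or it will still match on the mark alone.

```shell
# Classify into 1:a only when BOTH the firewall mark and the VLAN ID match.
sudo tc filter add dev bond0 parent 1: protocol all pref 1 basic \
    match 'meta(nf_mark eq 19217210) and meta(vlan mask 0x0fff eq 80)' \
    flowid 1:a

# Confirm what the kernel actually installed
sudo tc filter show dev bond0
```

If the filter shows both terms and traffic lands in the right class, the limitation is in how the config is translated rather than in tc itself.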

Instead of marking I tried instead assigning a dscp value to non-RFC1918 IPv4 traffic from WAN0. I was able to couple this with a destination address and it worked. Unfortunately this means I won’t be able to use dscp values for VOIP QoS. However this setup is temporary so hopefully it doesn’t cause too much headache; though the current “temporary solution” has lasted 7 years at this location…

This is what I did instead of marking:

set policy route WAN0-KITS rule 20 set dscp 6
set policy route WAN0-KITS rule 20 source group network-group !RFC1918-IP
set interfaces ethernet eth0 policy route WAN0-KITS

Then I used the dscp value and destination subnet to allow inbound WAN traffic to be shaped per subnet:

set traffic-policy shaper OUT-LAN class 10 bandwidth 50mbit
set traffic-policy shaper OUT-LAN class 10 ceiling 600mbit
set traffic-policy shaper OUT-LAN class 10 match WAN_to_CA_LAN2 ip destination address 10.254.167.0/24
set traffic-policy shaper OUT-LAN class 10 match WAN_to_CA_LAN2 ip dscp 6
set traffic-policy shaper OUT-LAN class 10 burst 15k
set traffic-policy shaper OUT-LAN class 10 queue-type fq-codel
set traffic-policy shaper OUT-LAN class 10 queue-limit 20
set traffic-policy shaper OUT-LAN class 10 priority 2

set traffic-policy shaper OUT-LAN class 20 bandwidth 140mbit
set traffic-policy shaper OUT-LAN class 20 ceiling 600mbit
set traffic-policy shaper OUT-LAN class 20 match WAN_to_TESLA_LAN ip destination address 10.254.158.0/24
set traffic-policy shaper OUT-LAN class 20 match WAN_to_TESLA_LAN ip dscp 6
set traffic-policy shaper OUT-LAN class 20 burst 15k
set traffic-policy shaper OUT-LAN class 20 queue-type fq-codel
set traffic-policy shaper OUT-LAN class 20 queue-limit 20
set traffic-policy shaper OUT-LAN class 20 priority 1

set traffic-policy shaper OUT-LAN class 30 bandwidth 50mbit
set traffic-policy shaper OUT-LAN class 30 ceiling 600mbit
set traffic-policy shaper OUT-LAN class 30 match WAN_to_CA_LAN ip destination address 10.254.157.0/24
set traffic-policy shaper OUT-LAN class 30 match WAN_to_CA_LAN ip dscp 6
set traffic-policy shaper OUT-LAN class 30 burst 15k
set traffic-policy shaper OUT-LAN class 30 queue-type fq-codel
set traffic-policy shaper OUT-LAN class 30 queue-limit 20
set traffic-policy shaper OUT-LAN class 30 priority 5

So far the matching that I’ve tested:
mark and anything (if allowed) means mark only
mark and vif or interface is allowed but still only matches the mark
mark and IP address isn’t allowed
IP source and IP dest works
IP dscp and IP source works

Now to do the reverse for what would be WAN-OUT, which I’m thinking will need to be LAN to IFB-OUT. My mind is a bit worn and the next call is with a legal dept to explain why tiff files with an index aren’t the metadata we’re talking about, so I’ll pause here. Thank you again for helping me through this; your direction has given me insight and motivation to keep trying.

QoS, can’t it do what I want it to do, just for system administrator appreciation day???

I’m trying to set per subnet ingress traffic priorities (from WAN). It seems I can’t do this on the WAN via IFB interface because DNAT hasn’t yet happened, so specifying a match as the destination NATed subnet doesn’t work. Or am I missing something?

So I’m using a LAN-OUT policy and applying it to bond0 (which contains all of my VLANs).

Bandwidth is set to 3600mbit because it’s 4 bonded 1gbit interfaces

set traffic-policy shaper OUT-LAN bandwidth 3600mbit
set traffic-policy shaper OUT-LAN default bandwidth 3000mbit
set traffic-policy shaper OUT-LAN default ceiling 3600mbit
set traffic-policy shaper OUT-LAN default queue-type fq-codel
...

Actual WAN download is 600mbit, so my class policies reflect this in the ceiling values. The dscp 6 match is there because WAN packets get tagged as such; this way the match only applies when it’s an ingress packet from the WAN destined for an internal subnet.

set traffic-policy shaper OUT-LAN class 30 bandwidth 50mbit
set traffic-policy shaper OUT-LAN class 30 ceiling 540mbit
set traffic-policy shaper OUT-LAN class 30 match WAN_to_CA_LAN ip destination address 10.254.157.0/24
set traffic-policy shaper OUT-LAN class 30 match WAN_to_CA_LAN ip dscp 6
set traffic-policy shaper OUT-LAN class 30 queue-type fq-codel
...

Repeat about 10 more classes and we’re all done, now comes testing.

When testing I’ve found that the ceiling is honored but the guaranteed bandwidth isn’t, rather it just evens out distribution among the classes being tested in parallel (different priorities are in use).
Then this sentence that I’ve read over and over again made sense:

FQ-Codel is a non-shaping (work-conserving) policy, so it will only be useful if your outgoing interface is really full. If it is not, VyOS will not own the queue and FQ-Codel will have no effect.

So this means to me that unless I’m pushing the 3600mbit of the configured bandwidth then my classes won’t shape bandwidth.

To test this I dropped the 3600mbit bandwidth down to 540mbit to reflect the actual WAN download speed, and then when testing in parallel things worked as expected. But this also means that my inter-LAN traffic will be limited by the same 540mbit throttle.

Here I tried to be clever and made another class to match packets destined for private subnets only and put the ceiling to 3600mbit, it allowed the command but tests were still limited to 540mbit (and the expected class was being matched).

The basic goal is to take 600mbit of WAN download and make it fully available to any internal subnet but to also guarantee a minimum bandwidth to specific subnets when available bandwidth becomes limited, is this possible via the native VyOS commands?

As it stands (per my limited knowledge), to get this along with dual wan failover and NAT I’d need to have 2 VyOS devices. [WANFAILOVER/NAT/QoS-ingress via LAN-OUT] <-> [QoS-egress EXT-OUT/LAN-Router]. Is this the way to do it?

Thank you.

Work-conserving (non-shaping) means fq-codel itself doesn’t have a BW knob. That’s why it’s used inside an htb tree, which does the shaping to the max/ceiling, while fq-codel gives fairness within each class. So that isn’t your problem.
Not having sub-branches is.
With tc you can create nested branches:

Interface  (set BW to 10G)
   branch no-qos  (set BW to 10G)
   branch downWAN1  (set to 100Mb/s)
      sub-branch voip1   (set bw/ceil)
      sub-branch interactive1  (set bw/ceil)
      sub-branch default1  (set bw/ceil)
   branch downWAN2  (set to 100Mb/s)
      sub-branch voip2   (set bw/ceil)
      sub-branch interactive2   (set bw/ceil)
      sub-branch default2  (set bw/ceil)

But I don’t think you can do that in the VyOS shaper.
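For what it’s worth, a nested tree like that could be built by hand with tc roughly as follows. This is an untested sketch, not something VyOS generates: the class IDs and the split of rates are assumptions chosen to fit the numbers in this thread (4000mbit bond0, 600mbit WAN download, dscp 6 marking WAN-sourced packets), and VyOS will most likely replace a hand-built tree on the next QoS commit or reboot.

```shell
DEV=bond0   # interface carrying all VLANs, as in the thread

# Root HTB; unclassified traffic falls into 1:10 (intra-LAN).
sudo tc qdisc replace dev "$DEV" root handle 1: htb default 10
sudo tc class add dev "$DEV" parent 1: classid 1:1 htb rate 4000mbit ceil 4000mbit

# Branch for LAN-to-LAN traffic, allowed to use the whole bond.
sudo tc class add dev "$DEV" parent 1:1 classid 1:10 htb rate 3400mbit ceil 4000mbit

# Branch for traffic that arrived from the WAN, capped at the WAN download rate.
sudo tc class add dev "$DEV" parent 1:1 classid 1:20 htb rate 600mbit ceil 600mbit

# Sub-branches: guaranteed rate per subnet, each may borrow up to the branch ceiling.
sudo tc class add dev "$DEV" parent 1:20 classid 1:21 htb rate 50mbit  ceil 600mbit prio 2
sudo tc class add dev "$DEV" parent 1:20 classid 1:22 htb rate 140mbit ceil 600mbit prio 1
sudo tc class add dev "$DEV" parent 1:20 classid 1:29 htb rate 410mbit ceil 600mbit prio 7

# fq_codel on every leaf for fairness within the class.
for leaf in 10 21 22 29; do
    sudo tc qdisc add dev "$DEV" parent 1:$leaf fq_codel
done

# DSCP 6 is 0x18 in the TOS byte (6 << 2); mask 0xfc ignores the ECN bits.
# "protocol all" mirrors the u32 filters VyOS generated earlier in the thread.
sudo tc filter add dev "$DEV" parent 1: protocol all prio 1 u32 \
    match ip dst 10.254.167.0/24 match ip dsfield 0x18 0xfc flowid 1:21
sudo tc filter add dev "$DEV" parent 1: protocol all prio 1 u32 \
    match ip dst 10.254.158.0/24 match ip dsfield 0x18 0xfc flowid 1:22
# Any other WAN-sourced traffic goes to the catch-all WAN sub-branch.
sudo tc filter add dev "$DEV" parent 1: protocol all prio 2 u32 \
    match ip dsfield 0x18 0xfc flowid 1:29
```

The point of the nesting is that the 600mbit cap sits on the WAN branch only, so the per-subnet guarantees take effect when the WAN branch is full while inter-VLAN traffic in 1:10 is not throttled to the WAN rate.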