Firewall assignment and DNAT

reno138 · May 16, 2020, 11:21pm

Hello all!

I have been using vyos for several years at home for simple nat/dhcp/openvpn, and general internet access for my family.
I have an ESXi host at a colo that I’ve been running vyos on for several months, and just realized a couple days ago that I never assigned a firewall to the WAN interface. Cool, eh?
There’s not anything on there that’s any kind of secret. Just a few game servers, mail server, and a DNS server.

When I went to assign the firewall named WAN to the wan interface (eth0), either nothing happens or everything is borked. I’m sure that you folks who fall in the “way smarter than me” category, which would be almost all of you when it comes to this side of IT, can point to it and say “here’s where you did the bad thing.” Thanks for taking the time to look at this.
Side note: there are several more firewall rules and DNAT rules, but they are all the same, just with different port groups and address groups. There are 3 public IPs on eth0 and 3 vlans/subnets inside on eth1.18, eth1.19, and eth1.20. No firewalls between the subnets, as they are only for a logical split, not security related.

Here’s where I run into problems -

If I set the firewall to the wan interface (eth0) as “in,” the below rules do not allow the traffic, and nothing works.
If I set the firewall to eth0 as “local,” the traffic below is allowed whether the firewall rule is in place or not. It’s just cruising through on the DNAT rule. The firewall and DNAT rules are below.

And, all this time I thought I was getting this stuff down pretty well, but, here I am back at square one.

vyos@rtfw-01# show firewall name WAN rule 1801
 action accept
 destination {
     group {
         address-group WAN-139
         port-group ZIMBRAPORTS
     }
 }
 log enable
 protocol tcp
 state {
     new enable
 }


vyos@rtfw-01# show nat destination rule 1801
 description "Zimbra DNAT"
 destination {
     address <wan interface IP redacted>
     port 25,80,110,143,465,587,993,995,443
 }
 inbound-interface eth0
 log enable
 protocol tcp
 translation {
     address <lan interface IP redacted>
 }

The WAN-139 address group has the external destination IP address in it
The ZIMBRAPORTS port group has all the needed ports in it, as it matches the NAT destination rule

The default action on the firewall is drop.


 default-action drop
 rule 1801 {
     action accept
     destination {
         group {
             address-group WAN-139
             port-group ZIMBRAPORTS
         }
     }
     protocol tcp
     state {
         new enable
     }
 }

TIA

–reno

tjh · May 17, 2020, 10:39pm

Are you testing with a NEW connection, or an existing connection, expecting it to break?

I’m pretty sure that applying a ruleset to the Internet-facing interface will not flush already existing states.

reno138 · May 18, 2020, 4:17am

Testing with new connections. As in, applying the firewall to my wan interface and trying to access the servers that have dnat rules. Web server, mail server, etc. For the sake of simplicity, I just used the firewall and dnat rules for my mail server.

I can access it if the firewall isn’t assigned. I cannot access it if the firewall is assigned to wan “in”, and it doesn’t matter if the rule is there or not if the firewall is assigned to wan “local.” That last bit I expect, because it’s not destined for the vyos box. But, if it’s assigned to my wan interface as a “in” firewall, that rule should allow the traffic, shouldn’t it? I know 100% that the dnat rule works, because without the firewall I can access the email web interface, and send/receive email.

I know I’m missing something completely obvious here, but sometimes my head doesn’t get around this network stuff very well.

tjh · May 18, 2020, 4:39am

To make this a bit easier: you want to be applying your rules to “WAN-IN”, not “WAN-LOCAL”. WAN-LOCAL is only for traffic that, even once NAT has been applied, is destined for the firewall itself. For example, your OpenVPN termination etc.

WAN-IN is traffic that is “coming in the WAN interface, going somewhere else” i.e. your NAT rules.
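
As a rough illustration of the in/local split described above, here is a small Python model. It is only a sketch: the addresses, the `DNAT` table and the function name `firewall_direction` are made up for the example, and it assumes the destination is rewritten before the router decides whether a packet is for itself.

```python
# Hypothetical addresses for illustration only (not taken from the thread).
ROUTER_ADDRS = {"203.0.113.139"}                 # public IP configured on eth0
DNAT = {("203.0.113.139", 443): "10.0.18.10"}    # (public IP, port) -> internal server


def firewall_direction(dst_ip, dst_port):
    """Return which ruleset direction ("in" or "local") would see the packet."""
    # Assumption: DNAT rewrites the destination before the routing decision.
    dst_ip = DNAT.get((dst_ip, dst_port), dst_ip)
    # Still addressed to the router itself -> "local"; forwarded elsewhere -> "in".
    return "local" if dst_ip in ROUTER_ADDRS else "in"


print(firewall_direction("203.0.113.139", 443))   # port-forwarded web traffic
print(firewall_direction("203.0.113.139", 1194))  # e.g. OpenVPN terminating on the router
```

In this model the port-forwarded packet is classified as “in” and the packet with no DNAT entry stays “local”, which matches the behaviour described for the “local” assignment earlier in the thread.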

I think your problem is that you’re missing “established enable” on your WAN-IN rule. You allow the initial “new” SYN traffic, but that’s all you’re letting in; the ACKs that follow are dropped. I would add “established enable” and “related enable” (I’m not 100% sure what the latter does myself, need to read the doco) to your “state” section and see if that fixes it.

Or just drop the “state” bit altogether to start with.
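
To make the state suggestion concrete, here is a minimal Python sketch of how a rule that only lists “new” treats the inbound packets of one TCP connection under a default-drop policy. The packet list and the `simulate` helper are hypothetical and deliberately simplified; real connection tracking has more states and more detail.

```python
def simulate(allowed_states):
    """Return the verdict for each inbound packet of a single TCP flow.

    allowed_states is the set of connection states the accept rule lists;
    anything that does not match falls through to the default drop.
    """
    # Simplified view of what connection tracking reports for inbound packets:
    # the first SYN is "new", later packets of the same flow are "established".
    packets = [("SYN", "new"), ("ACK", "established"), ("data", "established")]
    return [
        (name, "accept" if state in allowed_states else "drop")
        for name, state in packets
    ]


print(simulate({"new"}))                            # only the SYN gets through
print(simulate({"new", "established", "related"}))  # the whole flow gets through
```

With only “new” listed, the SYN is accepted and everything after it is dropped, so the connection can never complete. Removing the state clause entirely has the same effect in this model as listing every state.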

reno138 · May 20, 2020, 3:04am

Unfortunately neither one of those options worked. I’m not sure what I’m doing wrong here. I even enabled logging on these rules, but the logs aren’t being populated either. It’s almost like the traffic isn’t making it there at all. But, I know it is since I can access the site behind the firewall without the firewall pinned to the WAN interface.

I’m still plunking around with this. I’ve been using vyos for simple single WAN IP, port forwards, openvpn server/site-to-site, and NAT. Now that I actually want to DO something with it, I’m lost… The thing is, I KNOW it’s just the way it works isn’t how I have it in my head, and once I see a working example, It’ll all click. However, since I’m starting with little to no base, piecing the docs together to get where I need to be is kind of a leap.

I really appreciate you guys trying to point me in the right direction.

tjh · May 20, 2020, 3:35am

Yeah, I can’t figure it out either, sorry!

Are you 100% sure you don’t have a firewall rule above 1801 that’s catching the traffic and dropping it before it gets to 1801?

And WAN-139 100% has the IP address of eth0 contained within it?

It otherwise appears to me that it should work. I would turn all logging on and check “dmesg” to see if it’s being logged there; I’ve had problems with v1.2.5 not logging to the firewall logs.

reno138 · May 20, 2020, 4:18am

do you mean like, 1803 dropping it? I do have a rule 1803, but it’s for a different server and a different service.
So -
let’s say I have 2 servers, ServerA, and ServerB, both internal on private IPs

1801 Allows web and mailflow to ServerA
1802 allows mumble and jabber to ServerB
all the 18xx range rules land on WAN-IP1
19xx rules land on WAN-IP2
20xx rules land on WAN-IP3
I just did this to keep everything straight and lined up with the vlans, and i use the same number for the DNAT rules so it’s easy to keep track of which rules go together.

The rules in the 18xx range are dest

1801 is defined as Traffic destined for WAN-IP1 on web/mail ports, should go to serverA
1802 is defined as traffic destined for WAN-IP1 on web/jabber should go to serverB
The WAN IPs and ports for each server are defined in firewall address groups and port groups.

All 3 WAN IPs are on a single physical interface.
Does that clear up any confusion? Maybe now you can see something and say “hey, that’s really a bad, not working, probably worst config I’ve ever seen” and we’ll know where we stand.

tjh · May 20, 2020, 4:55am

That all makes sense. I just wondered if you had a 1600 rule that might be dropping it.

Ok so I wonder if the problem might be your address-group WAN-139
I’m not familiar enough to know the exact processing order, but seeing as you’re applying this to the WAN interface, I suspect that when it’s making the firewall lookup decision NAT hasn’t been applied yet, so the destination IP is still the public IP. So having a private IP in WAN-139 probably isn’t matching, and thus the rule isn’t being hit.

As a test, just remove that line, the “address-group WAN-139” leaving the port-group and see if it works then?

reno138 · May 20, 2020, 5:22am

The IP in the WAN-139 group is a public IP. It’s just using 139 as that’s the last octet of the public IP.
The three WAN-* groups each contain one public IP. The DNAT rule is the only one that references the private IPs, in its “translation address” section. Otherwise, the firewall rule only references the public IP that was the destination, and the port(s).

The only explicit drop is in the default action for this firewall. Otherwise, all rules exist only to allow.

So, it still makes no sense. Especially the logs being empty. Because I’m definitely getting to that address. I’m wondering if there’s something in there that is separate that’s stopping it before that rule is triggered. Which is pretty much what you just said, but the only rule before 1801 is the default deny, that isn’t even in a numbered rule, just the default action.

But, I’m still very lost. I get how it’s supposed to work, but I can’t seem to get my head around the config to get there, or why the rules aren’t even being hit. Or, if the rules are getting triggered, but there’s another reason they aren’t allowing the traffic and the firewall logging is broken… I’m grasping at straws…

tjh · May 20, 2020, 6:47am

Type “dmesg”; that will show you firewall logs, even if the logging is broken (like it sadly is for me).

So if the public IP’s the one in there, maybe replace it with the private IP. Perhaps NAT takes effect before the firewall.

reno138 · May 20, 2020, 3:18pm

OK, you nailed it. NAT is happening before firewall. Which seems weird to me, but, whatever. It works!

Thanks for working through it with me! Now I see the DNAT and firewall logs in the log messages.
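
For anyone landing here later, this is roughly how I picture it now, as a small Python sketch. It is only a model, not VyOS code: the two addresses and the function names are made up, and the port list is the one from the DNAT rule above. It assumes, as we found, that the destination is translated before the “in” ruleset is evaluated.

```python
# Hypothetical addresses for illustration; the ports are those of DNAT rule 1801.
PUBLIC_IP = "203.0.113.139"
SERVER_IP = "10.0.18.10"
MAIL_PORTS = {25, 80, 110, 143, 465, 587, 993, 995, 443}


def dnat(dst_ip, dst_port):
    """Rewrite the destination the way the DNAT rule would."""
    if dst_ip == PUBLIC_IP and dst_port in MAIL_PORTS:
        return SERVER_IP
    return dst_ip


def wan_in_verdict(dst_ip, dst_port, address_group):
    """Evaluate the accept rule against the packet as it looks after DNAT."""
    translated = dnat(dst_ip, dst_port)
    if translated in address_group and dst_port in MAIL_PORTS:
        return "accept"
    return "drop"  # default-action drop


print(wan_in_verdict(PUBLIC_IP, 443, {PUBLIC_IP}))  # address group holds the public IP
print(wan_in_verdict(PUBLIC_IP, 443, {SERVER_IP}))  # address group holds the private IP
```

With the public IP in the address group the rule never matches, which would also explain why its “log enable” never produced anything. With the private IP in the group the same packet is accepted.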

tjh · May 20, 2020, 6:41pm

Hooray! Glad we got there in the end - and hey, we both learnt something