Outbound IPsec filtering by firewall - would like some dev opinions

In order to address the tickets ⚓ T4694 Allow VyOS Firewall to Match Outbound IPSec Traffic and ⚓ T4667 DMVPN IPSec allows cleartext GRE over the internet when reconnecting, I’m attempting to extend firewall syntax. The associated draft PRs are 3616 and 3637.

TL;DR so far:

  • If IPsec SAs go down, transport mode payloads have no way of knowing it, and will happily traverse in the clear
  • Builtin workarounds in strongswan trap a lot of potentially legit traffic, not just the stuff that it should be encrypting
  • meta ipsec” is not available at all from output chains and only matches decrypted payloads (inbound)
  • To match outbound payloads, we need to use “rt ipsec”, matching packets to be encrypted
  • At some point the firewall was reworked to use “ipsec match-(ipsec|none)” syntax for inbound only (meaning meta ipsec exists|missing).

The idea behind this PR pair are:

  1. Extend the firewall to be able to directionally match IPsec encap in the appropriate hooks - most importantly, input, output, forward filters
  2. Extend the firewall to be able to specifically target GRE sessions by their primary attributes - we can already match source, destination and protocol, tunnel-key was missing

At this point, it does work to manually fix the issue at hand and I’ve done some basic testing of normal IPsec operations - all OK. Before I go all in and spend a lot of time double checking everything in lab lash-ups, I wanted to float the design past the devs to make sure it’s all in order.

The main things on my mind are:

  • I’d rather not change syntax if at all possible, but I can’t think of any other way to do this so that it is obvious to users how it works. I considered tags underneath the current match-* nodes, but this is still a mild syntax change. Plus, “ipsec match-ipsec” for inbound vs “ipsec match-ipsec out” for outbound is inconsistent with other usage of the “match-*” style elsewhere and not as clear in intent.
  • GRE key matching is very clumsily implemented, I’m not at all happy with it. Because the packet structure changes depending on flags, primarily just “checksum”, there is no native match in nftables and raw-matching is painful. I see a couple of ways to handle it:
    • Generate 2 nft rules matching against the flag bits and the appropriate raw offset. This would be transparent to the user, but the firewall → nft generator doesn’t work that way - it is 1 firewall rule to 1 nft rule
    • Force the user to configure a flag match on “checksum”, so the user has to put in 2 rules if they want to match tunnel keys with and without a checksum. This is what I ended up doing, with hints in the validator message that the checksum flag is usually 0 anyway so they can just go with that unless there’s a problem.
    • If someone can come up with a single-rule expression that can match both packet offsets (depending on flag bits), I’ll be much, much happier, tho a tad annoyed that I spent hours picking through nft opcodes, kernel code and mailing lists trying to figure it out and failing.
  • To maintain feature compatibility with custom named firewalls, both directional matches are possible. However, nft will reject loading any usage of “meta ipsec” inside an output hook. Thus, I’ve implemented a “loop check”:
    • nft will also fail to load cyclic table referrals. For eg, input filter → jump-target A → jump-target B → jump-target A is a current uncaught validation problem
    • Having an ipsec match in a custom named firewall chain that is an output jump target is another current uncaught validation problem
    • I’ve implemented a simple cycle checker with some limitations, that also double-checks output chains contain no “ipsec match-*-in” matches.
    • This could be implemented differently (less expensively) with tweaks to how custom firewalls work, but that would be a much larger structural change outside the scope of this PR set

I don’t want to convert them to full PRs until I’ve tested them, but I was just after some feedback before going ahead and spending a few hours double-checking that everything works appropriately.