Same L3 segment routing without ICMP redirects?

Hello, I'm new to VyOS. I'm using it as my home router/firewall and am looking for help with what is either a firewall or a system configuration issue. Here is the context:

  • VyOS 1.5-Rolling router
  • hosted in ESXi
  • two nics (LAN and WAN)
  • LAN side is 10.5.5.2
  • All nodes on network use 10.5.5.2 as GW
  • 10.5.5.14 hosts my tailscale client
  • I have a static route for 100.x.x.x/12 via 10.5.5.14 within VyOS
  • 100.x.x.x is the IP to one instance on my tailscale network
  • I’m essentially using the VyOS 1.5.x (circinus) Quick Start documentation as my starting point.

I’ve previously had this working on my EdgeOS system, but not quite sure what is different between that and the VyOS system. Comparing the configs isn’t trivial since they are quite different.

I’d like to have VyOS route traffic to a router on the same L3 segment without using ICMP redirects. The router is not resource constrained. I’d rather avoid the extra chatter of the ICMP redirects if I can.

Here is the problem I’m noticing. I’m trying to access an HTTP service on 100.x.x.x:9000 from a machine (10.5.5.15) that doesn’t accept ICMP redirects. This is apparent because I see host 10.5.5.15/if2 ignores redirects for 100.x.x.x to 10.5.5.14 in dmesg on my VyOS box. Other machines that accept ICMP redirects can complete the request just fine. Enabling redirects on this machine allows curl to complete the request. It also changes the output of ip route get 100.x.x.x to return 10.5.5.14 instead of the default gateway, which is the whole purpose of ICMP redirects.

What’s interesting is that some packets do get routed, as seen via tcpdump on 10.5.5.14 or 100.x.x.x, but the TCP stream cannot complete the connection to transfer the rest of the packets. This makes me think some state gets built up during the first few packets, and later packets are then filtered out.

My understanding is that even though the kernel notices there is a better route, it will still forward the packet to my destination, but also notify my host of that better route. This is why I think I see some of the packets land at the destination. Even if I tell VyOS to disable send-redirects, curl still does not complete the request.

The problem is thus

Your host 10.5.5.15 doesn’t know how to get to 100.x.x.x/12, so it sends the traffic to your router (default route via 10.5.5.2), which routes it to 10.5.5.14. All good. Traffic goes off, and the return traffic ends up back at 10.5.5.14. 10.5.5.14 knows it has to send it to 10.5.5.15, and because that host is directly connected, it doesn’t send it back via the router.

The simple fix is to add a /32 route on your host 10.5.5.14 for 10.5.5.15 via 10.5.5.2:

[These drawings are terrible but I’m so used to lucid chart and I can’t figure out draw.io]
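
To see why the /32 route changes things, here is a minimal Python sketch of the longest-prefix-match lookup that 10.5.5.14 would do for the reply packet. The routing table entries are assumptions based on the addresses in this thread (a default route via 10.5.5.2 and an on-link 10.5.5.0/24), and `next_hop` is a hypothetical helper for illustration, not a real tool.

```python
import ipaddress

def next_hop(routes, dst):
    """Return the next hop of the most specific route that contains dst.

    routes is a list of (prefix, hop) pairs; hop is "direct" for on-link
    networks. Assumes at least one route (e.g. a default) matches.
    """
    addr = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(prefix), hop)
               for prefix, hop in routes
               if addr in ipaddress.ip_network(prefix)]
    _, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop

# Assumed routing table on 10.5.5.14 before the fix.
routes_14 = [
    ("0.0.0.0/0", "10.5.5.2"),   # default route via the VyOS router
    ("10.5.5.0/24", "direct"),   # on-link LAN
]
before = next_hop(routes_14, "10.5.5.15")

# Same table with the suggested host route added.
routes_14_fixed = routes_14 + [("10.5.5.15/32", "10.5.5.2")]
after = next_hop(routes_14_fixed, "10.5.5.15")

print(before)  # direct
print(after)   # 10.5.5.2
```

Without the /32, the reply goes straight to .15 and the router never sees it. With it, the more specific route wins and the reply goes back through .2.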

Adding the route on .14 to .15 via .2 does work, but I’m confused about why that worked.

Without the route set up on .14, this is what I see on .14.

Additional context:

  • :51:01 is mac of .2
  • :ae:39 is mac of .14
  • :c3:1b is mac of .15

For example:

10:52:58.803007 XX:XX:XX:XX:51:01 > XX:XX:XX:XX:ae:39, ethertype IPv4 (0x0800), length 74: 10.5.5.15.45518 > 100.X.X.X.9000: Flags [S], seq 241158184, win 64240, options [mss 1460,sackOK,TS val 3658754106 ecr 0,nop,wscale 7], length 0
10:52:58.849865 XX:XX:XX:XX:ae:39 > XX:XX:XX:XX:c3:1b, ethertype IPv4 (0x0800), length 74: 100.X.X.X.9000 > 10.5.5.15.45518: Flags [S.], seq 1035198793, ack 241158185, win 65160, options [mss 1460,sackOK,TS val 3817472965 ecr 3658754106,nop,wscale 6], length 0

We see the ethernet frame is rewritten to use the .2 mac address as the source. The tcp segment references the correct value: 10.5.5.15.45518 > 100.X.X.X.9000.

The return path ethernet frame has the mac address of .14 as the source. Correct. The dest mac is .15 with the tcp segment being 100.X.X.X.9000 > 10.5.5.15.45518, but I’m not sure why this packet isn’t successful.

Additionally, I see .15 receive this packet from .14. Is the reason why this received packet isn’t accepted because of the state of the TCP connection? Is it an issue with the ephemeral ports? Is this a security feature as we should only accept packets back from the mac address we originally sent them to?

This is with the route enabled on .14

10:55:10.377441 XX:XX:XX:XX:51:01 > XX:XX:XX:XX:ae:39, ethertype IPv4 (0x0800), length 147: 10.5.5.15.45518 > 100.X.X.X.9000: Flags [P.], seq 241158185:241158266, ack 1035198794, win 502, options [nop,nop,TS val 3658885680 ecr 3817472965], length 81
10:55:10.417804 XX:XX:XX:XX:ae:39 > XX:XX:XX:XX:51:01, ethertype IPv4 (0x0800), length 54: 100.X.X.X.9000 > 10.5.5.15.45518: Flags [R], seq 1035198794, win 0, length 0

It appears the only difference is the ethernet frame is rewritten to use the mac address of .2 as the destination.

My other question is how I can make this better. I like having my tailscale instance (100.x.x.x) handled by a separate computer in my network, but I don’t like adding static routes all over the place. In theory, could I set up a static route on .14 for my entire subnet (10.5.5.0/24) to route through .2?

Yea I’m not 100% sure why it’s not working. I’ve seen similar issues in my network before and the reason was because of the following:

Your router (10.5.5.2) sees the first packet 10.5.5.15->100.x.x.x and goes “That was a TCP SYN, great, but I’m not going to set up connection tracking until I see a SYN/ACK in the opposite direction”

When the return traffic bypasses the router by going .14->.15, the router never gets to see the returning SYN/ACK and doesn’t set up a connection tracking session. It then drops any further traffic it gets from 10.5.5.15 towards 100.x.x.x, because that traffic has no valid conntrack state and isn’t a TCP SYN.
Adding that static route via the router allows it to see the returning SYN/ACK, so it sets up the connection in the connection tracking table and everything works.
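
The reasoning above can be sketched as a toy state machine in Python. This is only a simplified model of a strict stateful filter, under the assumption that the router drops anything without valid state. It is not how the Linux kernel's conntrack actually tracks TCP, and `router_verdicts` and the state names are made up for illustration.

```python
def router_verdicts(packets_seen_by_router):
    """Toy stateful filter for a single TCP flow.

    Each packet is (direction, flags), where direction is "out" (client to
    server) or "back" (server to client). Only packets that actually pass
    through the router are in the list. Returns one verdict per packet.
    """
    state = "none"
    verdicts = []
    for direction, flags in packets_seen_by_router:
        if state == "none" and direction == "out" and flags == "SYN":
            state = "syn_sent"
            verdicts.append("accept")
        elif state == "syn_sent" and direction == "back" and flags == "SYN/ACK":
            state = "syn_recv"
            verdicts.append("accept")
        elif state in ("syn_recv", "established") and flags in ("ACK", "PSH/ACK"):
            state = "established"
            verdicts.append("accept")
        else:
            # No valid state for this packet in the toy model.
            verdicts.append("drop")
    return verdicts

# Asymmetric path: the SYN/ACK goes .14 -> .15 directly, so the router
# only ever sees the client's side of the handshake.
asymmetric = router_verdicts([("out", "SYN"), ("out", "ACK"), ("out", "PSH/ACK")])

# Symmetric path: with the /32 route on .14, the SYN/ACK passes the router.
symmetric = router_verdicts(
    [("out", "SYN"), ("back", "SYN/ACK"), ("out", "ACK"), ("out", "PSH/ACK")]
)

print(asymmetric)  # ['accept', 'drop', 'drop']
print(symmetric)   # ['accept', 'accept', 'accept', 'accept']
```

In this model the first packet gets through and the rest of the flow is dropped, which would match seeing only some packets arrive at the destination.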

It’s my understanding though that VyOS 1.4/1.5 has “tcp loose” connection tracking by default (which is really just the Linux kernel default), which should allow this just fine, i.e. it doesn’t require the strictness of having seen a SYN/ACK before allowing session creation. But maybe the router still needs to see two-way traffic (just not the full SYN/ACK) and thus starts to drop traffic for this particular TCP session from 10.5.5.15 because it never sees two-way traffic.

So what’s actually the problem here, I don’t know (unless you’ve turned on set system conntrack tcp loose 'disable').

Debug the issue more by looking at the connection tracking table on your router
conntrack -L -s 10.5.5.15

You could try bypassing conntrack for anything from 10.5.5.15 to 100.x.x.x/16 and see if that resolves it. If so, it tends to point to my suggestion above as being the issue.

set system conntrack ignore ipv4 rule 10 description 'Ignore Conntrack as a test'
set system conntrack ignore ipv4 rule 10 destination address '100.x.x.x/y'
set system conntrack ignore ipv4 rule 10 source address '10.5.5.15/32'

The proper way to do this, of course, is to have a linknet interface off your router with a /30 or /31 and route the traffic over it. That way return traffic has to come back via your router.

Thank you for the detailed explanation. I might need to grab my old networking textbooks and re-read through the TCP/SYN side of things.

I did check to see if ‘tcp loose’ connection tracking was disabled and it wasn’t.

I also ignored the conntracking for the entire subnet and that appears to work fine. I’ll need to read through conntracking a little more to understand what impacts it has and why someone would want to ignore it. (Obviously for cases like this but curious on other cases)

I’ll be curious if EdgeOS does conntracking automatically or uses “tcp loose” conntracking.

As for the suggestion on the linknet interface, I tried looking that up and didn’t get very far. Could you elaborate on what you mean by that?

Connection tracking is used for many things. One of them is a stateful firewall, so you can say “Let traffic out this interface, but only let traffic back IN this interface if it’s a reply to a session created by a host on the other side”. Connection tracking is also used by NAT to keep a record of the original IP, the rewritten IP, etc. So yes, you have to be careful about what turning off connection tracking will do. If your router is purely that, just a router with no stateful firewall features, then you can fully disable connection tracking. I wouldn’t suggest this though.

As for a linknet, I mean creating another interface on your router towards the router that has the 100.x network behind it. Something like 192.168.1.0/30, where your VyOS router is 192.168.1.1 and your other device is 192.168.1.2.

On your VyOS router you say “To get to 100.x next-hop 192.168.1.2” and on your other router you say “To get to 10.5.5.0/24 next-hop 192.168.1.1”

That way traffic must follow that route, because now your 100.x device doesn’t HAVE a directly connected 10.5.5.0/24 network.
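
As a rough illustration of the linknet idea, here is a small Python sketch of what the other device's route lookup would look like once it only has the /30 and a static route back to the LAN. The table is an assumption built from the example addresses above, and `next_hop` is a hypothetical helper, not a real command.

```python
import ipaddress

def next_hop(routes, dst):
    """Return the next hop of the most specific route that contains dst.

    routes is a list of (prefix, hop) pairs; hop is "direct" for on-link
    networks. Assumes at least one route matches dst.
    """
    addr = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(prefix), hop)
               for prefix, hop in routes
               if addr in ipaddress.ip_network(prefix)]
    _, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop

# A /30 has exactly two usable host addresses, one per end of the link.
linknet_hosts = [str(h) for h in ipaddress.ip_network("192.168.1.0/30").hosts()]

# Assumed routing table on the other device: it sits only on the linknet
# and reaches the LAN through the VyOS end of the link.
routes_other = [
    ("192.168.1.0/30", "direct"),
    ("10.5.5.0/24", "192.168.1.1"),
]
reply_hop = next_hop(routes_other, "10.5.5.15")

print(linknet_hosts)  # ['192.168.1.1', '192.168.1.2']
print(reply_hop)      # 192.168.1.1
```

Because 10.5.5.0/24 is no longer on-link for that device, the reply to 10.5.5.15 has nowhere to go except back through the router.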

You would either create another interface (eth2) on your VyOS router, or you could just assign a second IP address to the 10.5.5.0 interface.

I expect that EdgeOS does connection tracking by default.

Good luck!
