Why do my outgoing TCP connections fail when ICMP and incoming connections are OK?

Just to get this out of the way from the beginning: No firewall/packet filter (yet).

My simple policy routing (partially discussed in my previous questions [1-3]) drives me nuts. With your help (pointing me to binding the DSL DHCP instance to a VRF and using vrf bind-to-all), I am close to where I want to be, but for some strange reason outgoing TCP connections don’t work.

My setup (VyOS 1.3):

  • A DSL uplink on eth0.2 with dynamically assigned IP & default route
  • A public routed /24 (192.0.2.0/24) via a WireGuard tunnel from a VPS endpoint
  • A test network 192.0.2.241/29 assigned to a dummy interface (to test routing and OSPF)
  • Any traffic should consult local routing information (tables local, main, incl OSPF) if there is a more specific entry than 0.0.0.0/0
  • For any traffic that only matches 0.0.0.0/0 (default route):
    • If it has a source address in 192.0.2.0/24, it should be routed over the WireGuard tunnel
    • If not, it should go over the DSL connection

In order to achieve this, I did the following [1]:

  • Create a VRF “vrf_dsl” with table 170 and bind it to eth0.2. As a result, the default route (assigned via DHCP) lands in table 170 and NOT the main table
  • I am actually not 100% sure what vrf bind-to-all does, but it was suggested in [1], and afterwards SSH and WireGuard started working
  • The “local-route” policy consults first local and main (both of which do NOT include a default route). Rule 103 sends all packets from 192.0.2/24 over the wireguard link and rule 104 jumps to the default route (catch-all).
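For reference, a hedged note on vrf bind-to-all: as far as I understand (an assumption, not something confirmed in this thread), it corresponds to the kernel sysctls net.ipv4.tcp_l3mdev_accept and net.ipv4.udp_l3mdev_accept, which let sockets that are not bound to a VRF work across VRFs. A minimal Python sketch to read the current values back; the function name l3mdev_accept_status is made up for illustration:

```python
from pathlib import Path

# Assumption: these are the sysctls behind "vrf bind-to-all".
SYSCTL_DIR = Path("/proc/sys/net/ipv4")
KEYS = ("tcp_l3mdev_accept", "udp_l3mdev_accept")


def l3mdev_accept_status(base=SYSCTL_DIR):
    """Return {key: value string}; None means the sysctl is not readable here."""
    status = {}
    for key in KEYS:
        try:
            status[key] = (Path(base) / key).read_text().strip()
        except OSError:
            # Kernel built without VRF support, or not running on Linux.
            status[key] = None
    return status


if __name__ == "__main__":
    for key, value in l3mdev_accept_status().items():
        print(f"net.ipv4.{key} = {value}")
```

A value of 1 for both keys would be consistent with bind-to-all being active; None only means the file could not be read on this machine.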

At first glance, everything works: Wireguard tunnel is established, resolver (DNS) works and all ICMP ping works. Example:

vyos@SunGate1:~$ ping www.google.com
PING www.google.com (172.217.12.100) 56(84) bytes of data.
64 bytes from sfo03s33-in-f4.1e100.net (172.217.12.100): icmp_seq=1 ttl=112 time=4.22 ms
64 bytes from sfo03s33-in-f4.1e100.net (172.217.12.100): icmp_seq=2 ttl=112 time=4.26 ms

However, I cannot establish an HTTP connection (or any other, like SSH to any server):

vyos@SunGate1:~$ /bin/telnet www.google.com 80
Trying 172.217.12.100...

^C

But now comes the kicker: If I start a bash session within the VRF context, I can connect!

vyos@SunGate1:~$ sudo ip vrf exec vrf_dsl bash
root@SunGate1:/home/vyos# /bin/telnet www.google.com 80
Trying 172.217.12.100...
Connected to www.google.com.
Escape character is '^]'.
GET / HTTP/1.0

[...]
</script></body></html>Connection closed by foreign host.
vyos@SunGate1:~$

W-T-F-??
I really don’t get it. Ping works, TCP does not. And the policy very clearly has the proper rules set for the default route. I can also confirm:

vyos@SunGate1:~$ sudo ip rule
101:    from all lookup local
102:    from all lookup main
103:    from 192.0.2.0/24 lookup 171
104:    from all lookup vrf_dsl
1000:   from all lookup [l3mdev-table]
2000:   from all lookup [l3mdev-table] unreachable
32765:  from all lookup local
32766:  from all lookup main
32767:  from all lookup default
vyos@SunGate1:~$ sudo ip route show table main | grep '0.0.0.0'
vyos@SunGate1:~$ sudo ip route show table vrf_dsl
default nhid 22 via DSL-PUBLIC-ROUTER dev eth0.2 proto static metric 20 
broadcast 127.0.0.0 dev vrf_dsl proto kernel scope link src 127.0.0.1 
127.0.0.0/8 dev vrf_dsl proto kernel scope link src 127.0.0.1 
local 127.0.0.1 dev vrf_dsl proto kernel scope host src 127.0.0.1 
broadcast 127.255.255.255 dev vrf_dsl proto kernel scope link src 127.0.0.1 
broadcast DSL-PUBLIC-NETWORK dev eth0.2 proto kernel scope link src DSL-PUBLIC-IP 
DSL-PUBLIC-NETWORK/21 dev eth0.2 proto kernel scope link src DSL-PUBLIC-IP 
local DSL-PUBLIC-IP dev eth0.2 proto kernel scope host src DSL-PUBLIC-IP
broadcast DSL-PUBLIC-BCAST dev eth0.2 proto kernel scope link src DSL-PUBLIC-IP 

Since any TCP connection (like telnet in my example above) runs in the default VRF context, it should normally run through the rules. It would first try 101 and not find a match. Similarly no match for 102 in main. Rule 103 does not match in the first place. Then rule 104 matches, it looks up table vrf_dsl and finds the default route there.
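The rule walk described above can be sketched in Python as a simplified model. It only mimics the "first rule whose table has a matching route wins" behaviour and knows nothing about VRFs or socket lookup. The table contents are illustrative placeholders rather than the real routes, and the name lookup is made up for this example:

```python
import ipaddress

# (priority, "from" prefix, table), taken from the ip rule output.
RULES = [
    (101, "0.0.0.0/0", "local"),
    (102, "0.0.0.0/0", "main"),
    (103, "192.0.2.0/24", "171"),
    (104, "0.0.0.0/0", "vrf_dsl"),
]

# Placeholder routes: local and main deliberately have no default route.
TABLES = {
    "local": ["127.0.0.0/8", "192.0.2.241/32"],
    "main": ["10.227.1.0/24", "192.0.2.240/29"],
    "171": ["0.0.0.0/0"],
    "vrf_dsl": ["0.0.0.0/0"],
}


def lookup(src, dst):
    """Return (rule priority, table, matched prefix) or None."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for priority, from_prefix, table in RULES:
        if src_ip not in ipaddress.ip_network(from_prefix):
            continue  # rule selector does not match, try the next rule
        candidates = [ipaddress.ip_network(p) for p in TABLES[table]]
        matches = [n for n in candidates if dst_ip in n]
        if matches:
            # Longest prefix match within the selected table.
            best = max(matches, key=lambda n: n.prefixlen)
            return priority, table, str(best)
    return None


if __name__ == "__main__":
    print(lookup("198.51.100.10", "172.217.12.100"))
    print(lookup("192.0.2.241", "172.217.12.100"))
```

In this model a packet from a non-192.0.2.0/24 source to 172.217.12.100 falls through rules 101 to 103 and is resolved by rule 104 in table vrf_dsl. That matches the expectation above, so the model suggests the failure is not in the rule ordering itself.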

It MUST work. But it doesn’t.

Is anyone able to tell me what the heck is going on here?

Finally, full config for reference.

vyos@SunGate1:~$ show conf com
set interfaces dummy dum0 address '192.0.2.241/29'
set interfaces ethernet eth0 vif 2 address 'dhcp'
set interfaces ethernet eth0 vif 2 description 'dsl'
set interfaces ethernet eth0 vif 2 vrf 'vrf_dsl'
set interfaces ethernet eth0 vif 3 address '10.227.79.2/24'
set interfaces ethernet eth0 vif 4 address '10.227.4.2/24'
set interfaces ethernet eth0 vif 5 address '10.227.1.2/24'
set interfaces ethernet eth0 vif 10 address '10.227.80.2/24'
set interfaces ethernet eth0 vif 11 address '192.168.222.2/24'
set interfaces ethernet eth0 vif 12 address '192.168.223.2/24'
set interfaces loopback lo
set interfaces wireguard wg0 address '192.0.2.227/31'
set interfaces wireguard wg0 description 'Vultr uplink'
set interfaces wireguard wg0 ip ospf authentication md5 key-id 1 md5-key '***********'
set interfaces wireguard wg0 ip ospf cost '100'
set interfaces wireguard wg0 ip ospf dead-interval '40'
set interfaces wireguard wg0 ip ospf hello-interval '10'
set interfaces wireguard wg0 ip ospf network 'point-to-point'
set interfaces wireguard wg0 ip ospf priority '1'
set interfaces wireguard wg0 ip ospf retransmit-interval '5'
set interfaces wireguard wg0 ip ospf transmit-delay '1'
set interfaces wireguard wg0 peer vultr0 address '************'
set interfaces wireguard wg0 peer vultr0 allowed-ips '0.0.0.0/0'
set interfaces wireguard wg0 peer vultr0 port '51821'
set nat source rule 100 outbound-interface 'eth0.2'
set nat source rule 100 translation address 'masquerade'
set policy local-route rule 101 destination '0.0.0.0/0'
set policy local-route rule 101 set table 'local'
set policy local-route rule 102 destination '0.0.0.0/0'
set policy local-route rule 102 set table 'main'
set policy local-route rule 103 destination '0.0.0.0/0'
set policy local-route rule 103 set table '171'
set policy local-route rule 103 source '192.0.2.0/24'
set policy local-route rule 104 destination '0.0.0.0/0'
set policy local-route rule 104 set table '170'
set protocols ospf area 0.0.0.0 network '192.0.2.0/24'
set protocols ospf parameters abr-type 'cisco'
set protocols ospf parameters router-id '10.227.1.2'
set protocols ospf passive-interface 'eth0.2'
set protocols static table 171 route 0.0.0.0/0 next-hop 192.0.2.226
set service ssh port '22'
set system conntrack modules ftp
set system conntrack modules h323
set system conntrack modules nfs
set system conntrack modules pptp
set system conntrack modules sip
set system conntrack modules sqlnet
set system conntrack modules tftp
set system console device ttyS0 speed '115200'
set system host-name 'SunGate1'
set system name-server 'eth0.2'
set system ntp server time1.vyos.net
set system ntp server time2.vyos.net
set system ntp server time3.vyos.net
set system syslog global facility all level 'info'
set system syslog global facility protocols level 'debug'
set system time-zone 'America/Los_Angeles'
set vrf bind-to-all
set vrf name vrf_dsl table '170'

[1] Putting DHCP default gateway in different routing table - #7 by exp
[2] How can I make my wireguard tunnel accessible via policy routing?
[3] Simple source routing not so simple? (How to migrate this simple RouterOS example to VyOS)

Try dumping the traffic to see which source address goes out via which interface.
I suspect you need to leak the default route into the main table.

Yes I did this already:

Terminal 1:

vyos@SunGate1# sudo /bin/telnet 172.217.12.100 80
Trying 172.217.12.100...

Terminal 2:

vyos@SunGate1:~$ sudo tcpdump -n -i eth0.2 'host 172.217.12.100'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0.2, link-type EN10MB (Ethernet), capture size 262144 bytes
00:26:57.176329 IP 192.184.144.21.45044 > 172.217.12.100.80: Flags [S], seq 3175474548, win 64240, options [mss 1460,sackOK,TS val 317393081 ecr 0,nop,wscale 7], length 0
00:26:57.180374 IP 172.217.12.100.80 > 192.184.144.21.45044: Flags [S.], seq 500075723, ack 3175474549, win 65535, options [mss 1412,sackOK,TS val 2087272337 ecr 317393081,nop,wscale 8], length 0
00:26:57.180503 IP 192.184.144.21.45044 > 172.217.12.100.80: Flags [R], seq 3175474549, win 0, length 0
00:26:58.208846 IP 192.184.144.21.45044 > 172.217.12.100.80: Flags [S], seq 3175

For some freakin’ reason, the VyOS box itself sends a Reset to the google server. Makes. Absolutely. No. Sense. What. So. Ever.
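To make the pattern in the capture explicit, here is a small Python sketch that scans tcpdump text output for a SYN-ACK that the local address immediately answers with a RST. The sample lines are abbreviated from the capture above (TCP options removed), and the function names are hypothetical helpers, not part of any tool:

```python
import re

# Matches the "IP src > dst: Flags [..]" part of a tcpdump line.
LINE = re.compile(r"IP (\S+) > (\S+): Flags \[([^\]]+)\]")


def flag_sequence(capture):
    """Return (src, dst, flags) for every line that looks like a TCP packet."""
    found = (LINE.search(line) for line in capture.splitlines())
    return [m.groups() for m in found if m]


def local_reset_after_synack(capture, local_ip):
    """True if local_ip receives a SYN-ACK and the next packet is its own RST."""
    events = flag_sequence(capture)
    for (src1, dst1, flags1), (src2, dst2, flags2) in zip(events, events[1:]):
        synack_to_local = flags1 == "S." and dst1.startswith(local_ip + ".")
        reset_from_same_socket = flags2.startswith("R") and src2 == dst1
        if synack_to_local and reset_from_same_socket:
            return True
    return False


SAMPLE = """\
00:26:57.176329 IP 192.184.144.21.45044 > 172.217.12.100.80: Flags [S], seq 3175474548, win 64240, length 0
00:26:57.180374 IP 172.217.12.100.80 > 192.184.144.21.45044: Flags [S.], seq 500075723, ack 3175474549, win 65535, length 0
00:26:57.180503 IP 192.184.144.21.45044 > 172.217.12.100.80: Flags [R], seq 3175474549, win 0, length 0
"""

if __name__ == "__main__":
    print(local_reset_after_synack(SAMPLE, "192.184.144.21"))
```

A RST sent by the host right after an incoming SYN-ACK usually means the kernel found no socket to deliver the segment to. That is only an interpretation of the trace, not a confirmed diagnosis.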

But why? According to my config (and ip rule and ip route outputs) it has to work!
There is definitely a catch-all rule for table 170 which definitely contains the correct default route to the correct gateway

The TCP reset suggests the SYN-ACK can’t be handled… Does each VRF have its own conntrack table?

No, it does not have its own conntrack table. The conntrack table is global and shared across all VRFs

There are conntrack zones when a VRF is used:

vyos@r14:~$ show conntrack table ipv4 
Id          Original src          Original dst       Reply src          Reply dst             Protocol    State        Timeout    Mark    Zone
----------  --------------------  -----------------  -----------------  --------------------  ----------  -----------  ---------  ------  ------
3280264903  192.168.122.14:38775  95.85.21.89:123    95.85.21.89:123    192.168.122.14:38775  udp                      10         0
464645856   192.168.100.254       1.1.1.1            1.1.1.1            192.168.100.254       icmp                     2          0       1010

Maybe it is this issue: T3655 NAT Problem with VRF

Oh no, please tell me I did not hit a bug before even getting started.
Indeed this somewhat looks like the issue linked.

Can you help me understand how this can be related to conntrack though? First of all, why does ICMP work? And second, conntrack is just a “convenience” feature for stateful connection tracking. It is not required, neither for a basic firewall nor for routing.

With respect to Linux policy routing, shouldn’t conntrack only get into the picture when I am using it to mark packets and then have rules that match those tagged packets via fwmark?

EDIT: Oh I see, you mean conntrack for NAT, i.e. the set nat source rule 100 outbound-interface 'eth0.2' and set nat source rule 100 translation address 'masquerade' config, I guess? Still, my question stands: why do ICMP and UDP (DNS) work, when their responses should rely on connection tracking in exactly the same way?

The conntrack table is consulted on the return packet to figure out where to send it (the original packet was masqueraded, and the table refers to the original source).
Conntrack for DNS and ICMP has different logic; it boils down to just poking a hole.
Conntrack for TCP is more advanced, with more states. It seems the bug only affects TCP.

Now that I think about it, where does even SNAT come into play here? It would only apply if the origin of the packet is from the LAN and then forwarded by VyOS.
But in this case the packet is generated by VyOS itself. “-t nat -A POSTROUTING -j MASQUERADE” never matches. The packet is just sent out via eth0.2. Original packet was not masqueraded.

Indeed, when I call sudo conntrack -L -n (= source NAT entries only) while establishing a connection, it does not show up.

Also the return packet (SYN-ACK) is received successfully. It shows up in the tcpdump. But then it is VyOS which generates a RST (which is sent back to and received by the server).

It doesn’t make any sense. I can’t even see where a bug could occur here.

To make sure, I have completely removed NAT:

interfaces {
    ethernet eth0 {
        vif 2 {
            address dhcp
            description sonic
            vrf vrf_sonic
        }
        vif 3 {
            address 10.227.79.2/24
        }
    }
    loopback lo {
    }
}
policy {
    local-route {
        rule 101 {
            destination 0.0.0.0/0
            set {
                table local
            }
        }
        rule 102 {
            destination 0.0.0.0/0
            set {
                table main
            }
        }
        rule 104 {
            destination 0.0.0.0/0
            set {
                table 170
            }
        }
    }
}
vrf {
    name vrf_sonic {
        table 170
    }
}

Exactly as before, ICMP/ping works but TCP/UDP does not.

Note, UDP only worked because I had set vrf bind-to-all. But that’s actually not what I want. I still want all local sockets to be in the main VRF.

The set vrf bind-to-all command, as you said, is necessary for DNS resolution from the router. Otherwise the DNS query leaves the router, but the answer is rejected by the router itself.

Here is a simplified config from my lab where ICMP, DNS resolution and TCP connections all work:

set interfaces ethernet eth0 vif 2 address 'dhcp'
set interfaces ethernet eth0 vif 2 vrf 'vrf_sonic'
set policy local-route rule 101 destination '0.0.0.0/0'
set policy local-route rule 101 set table '170'
set vrf bind-to-all
set vrf name vrf_sonic table '170'

vyos@vyos:~$ show int
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address                        S/L  Description
---------        ----------                        ---  -----------
eth0             -                                 u/u  
eth0.2           198.51.100.60/24                  u/u  
eth1             -                                 u/u  
eth2             -                                 u/u  
eth3             -                                 u/u  
lo               127.0.0.1/8                       u/u  
                 ::1/128                                
vyos@vyos:~$ show ip route table 170
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

VRF default table 170:
S>* 0.0.0.0/0 [210/0] via 198.51.100.1, eth0.2, weight 1, 00:12:49
C>* 198.51.100.0/24 is directly connected, eth0.2, 00:12:49
vyos@vyos:~$ 

Then, connectivity test:

vyos@vyos:~$ ping www.google.com count 2
PING www.google.com (142.251.133.4) 56(84) bytes of data.
64 bytes from eze10s01-in-f4.1e100.net (142.251.133.4): icmp_seq=1 ttl=116 time=26.5 ms
64 bytes from eze10s01-in-f4.1e100.net (142.251.133.4): icmp_seq=2 ttl=116 time=26.2 ms

--- www.google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 26.243/26.365/26.487/0.122 ms
vyos@vyos:~$
vyos@vyos:~$ curl www.google.com
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="es-419"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title><script nonce="bLPVSI1piuftl5COa0qlLw">(function(){var _g={kEI:'9v-CZLSIB9uJ5OUPmrmnqAs',kEXPI:'0,1359409,6059,206,4804,2316,383,246,5,1129120,1704,1196009,688,380090,16114,28684,22430,1362,284,12033,17582,4998,1
....
....

Whoa, crazy, this is exactly what I did! I have now downloaded 1.4-rolling and lo and behold, it does seem to work with vrf bind-to-all on 1.4.
Are you using 1.3 or 1.4-rolling? Can you confirm that the exact same config fails for you in 1.3?

It’s 1.4
Right now I can’t provide feedback on this setup in 1.3… I’ll try to check it later

@exp wrote:

But in this case the packet is generated by VyOS itself. “-t nat -A POSTROUTING -j MASQUERADE” never matches. The packet is just sent out via eth0.2. Original packet was not masqueraded.

Indeed, when I call sudo conntrack -L -n (= source NAT entries only) while establishing a connection, it does not show up.

That’s not how I think it works:
All traffic is masqueraded, even traffic from VyOS itself that already has the WAN IP address as its source. Your masquerade rule has no source IP filter, so it also matches traffic sourced from VyOS.
In conntrack you should see a translation entry in which the addresses aren’t actually changed.
Also, to make sure the entry isn’t gone before you can even run conntrack -L, connect to some unused foreign port; conntrack should then show an entry in state SYN_SENT [UNREPLIED].

Moreover, conntrack will still kick in after you remove the NAT rules. The conntrack table is used for stateful firewalling as well.

Did you get the expected behavior in 1.4?
But it doesn’t work in 1.3?

Yes, the issue seems to appear only in 1.3. 1.4-rolling seems to work.