VPNs beyond the router are not working

I tried both OpenConnect and L2TP/IPsec. The VPNs connect, and I can ping the VyOS VM, but I can’t get to other devices on the LAN. It looks like the router is not answering ARP on behalf of the VPN client, so devices on the LAN subnet are not finding the client…

I sniffed the LAN side on the router with tcpdump. In this case, the VPN client is 172.16.250.240, the router is 172.16.250.1, and another VM is running at 172.16.250.20. The ping from the VPN client is getting out on the LAN, and the VM at .20 is hearing it. The VM at .20 sends out a who-has request that never gets answered…

13:46:21.744875 IP 172.16.250.240 > 172.16.250.20: ICMP echo request, id 45628, seq 0, length 64
13:46:21.745145 ARP, Request who-has 172.16.250.240 tell 172.16.250.20, length 28
13:46:22.751916 IP 172.16.250.240 > 172.16.250.20: ICMP echo request, id 45628, seq 1, length 64
13:46:22.764430 ARP, Request who-has 172.16.250.240 tell 172.16.250.20, length 28
13:46:23.753250 IP 172.16.250.240 > 172.16.250.20: ICMP echo request, id 45628, seq 2, length 64
13:46:23.788554 ARP, Request who-has 172.16.250.240 tell 172.16.250.20, length 28
13:46:24.754572 IP 172.16.250.240 > 172.16.250.20: ICMP echo request, id 45628, seq 3, length 64
13:46:25.761567 IP 172.16.250.240 > 172.16.250.20: ICMP echo request, id 45628, seq 4, length 64
13:46:25.761789 ARP, Request who-has 172.16.250.240 tell 172.16.250.20, length 28
^C
21 packets captured
21 packets received by filter

I should follow up that this is a rolling release.
$ show version
Version: VyOS 1.4-rolling-202209090217
Release train: sagitta

Built by: [email protected]
Built on: Fri 09 Sep 2022 02:17 UTC
Build UUID: 98de53cd-c1f9-464e-8d1f-96174bc6fc12
Build commit ID: 9f0ab18e71ee1d

Architecture: x86_64
Boot via: installed image
System type: KVM guest

Hardware vendor: QEMU
Hardware model: Standard PC (i440FX + PIIX, 1996)
Hardware S/N:
Hardware UUID: 7e723611-902f-4d87-b4f1-939cdddf30d6

Without the config, my guess is that you tried to create one big subnet. The VM thinks it can reach the VPN client directly on the local LAN since it is in the same subnet, so it starts ARPing instead of sending traffic to the router.

The VPN client probably has a PtP connection so it knows it has to send the packets to the router.

You either have to put the VPN client in a different subnet, or do some proxyarp trickery on the LAN side.
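To make the same-subnet reasoning concrete, here is a minimal Python sketch of the decision a LAN host makes before it sends a reply. The function name is made up for illustration, and the 172.16.251.0/24 pool in the second call is a hypothetical separate VPN subnet, not something from the poster's config.

```python
import ipaddress

def next_hop_decision(src_if: str, dst: str, gateway: str) -> str:
    """Return how a host with interface address src_if (CIDR) reaches dst."""
    iface = ipaddress.ip_interface(src_if)
    if ipaddress.ip_address(dst) in iface.network:
        # Destination looks on-link, so the host ARPs for it itself.
        return f"ARP for {dst} directly"
    # Destination is off-link, so the packet is handed to the gateway.
    return f"send to gateway {gateway}"

# LAN VM at .20 replying to the VPN client at .240 (addresses from the thread)
print(next_hop_decision("172.16.250.20/24", "172.16.250.240", "172.16.250.1"))

# Same VM replying to a client numbered from a separate, hypothetical pool
print(next_hop_decision("172.16.250.20/24", "172.16.251.240", "172.16.250.1"))
```

In the first case the VM ARPs on the LAN and gets no answer unless something replies on the client's behalf. In the second case it just forwards to the router, which is why a separate subnet sidesteps the problem.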

Do both ends (past the VPN endpoints) know how to get to the other side? That is, do the routing tables on the devices tell them to send traffic to the routers in order to go across the VPN?

If your network is something like the below, does the Computer know to go through Router A to reach the network the Server is on?

Computer → Router A → Internet → Router B → Server

The client is getting 172.16.250.240, an address from the LAN that the VyOS router is on. It also has a route entry for 172.16.250.0/24, so it knows how to reach that LAN prefix through the VyOS router. You can see in the tcpdump output that it is trying to reach another device on the same VLAN at 172.16.250.20. The packet from 172.16.250.240 reaches 172.16.250.20, and the 172.16.250.20 device sends a “who-has” so it can add the client to its ARP table, but the VyOS box does not respond. If it did, the 172.16.250.20 box would be able to reach the VPN client at 172.16.250.240.
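If it helps anyone checking their own capture, the symptom described here (who-has requests with no matching is-at) can be spotted with a few lines of Python. This is only a rough sketch: it assumes the default tcpdump text format shown in this thread, and the function name is made up.

```python
import re

# Patterns for the default tcpdump ARP output format seen in this thread
WHO_HAS = re.compile(r"ARP, Request who-has (\S+) tell (\S+),")
IS_AT = re.compile(r"ARP, Reply (\S+) is-at (\S+)")

def unanswered_arp(lines):
    """Return the set of IPs that were asked for but never answered."""
    asked, answered = set(), set()
    for line in lines:
        if m := WHO_HAS.search(line):
            asked.add(m.group(1))
        elif m := IS_AT.search(line):
            answered.add(m.group(1))
    return asked - answered

# A few lines copied from the capture earlier in the thread
capture = [
    "13:46:21.744875 IP 172.16.250.240 > 172.16.250.20: ICMP echo request, id 45628, seq 0, length 64",
    "13:46:21.745145 ARP, Request who-has 172.16.250.240 tell 172.16.250.20, length 28",
    "13:46:22.764430 ARP, Request who-has 172.16.250.240 tell 172.16.250.20, length 28",
]
print(unanswered_arp(capture))
```

An empty result would mean every address that was asked for got at least one is-at reply somewhere in the capture.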

Here is an example of a successful connection on a UI Edgerouter…
16:22:21.240048 IP 100.64.4.241.53974 > 100.64.4.15.ssh: Flags [SEW], seq 3993998026, win 65535, options [mss 1240,nop,wscale 6,nop,nop,TS val 2836621081 ecr 0,sackOK,eol], length 0
16:22:21.240245 ARP, Request who-has 100.64.4.241 tell 100.64.4.2, length 46
16:22:21.732566 ARP, Reply 100.64.4.241 is-at 74:ac:b9:d4:00:13 (oui Unknown), length 28
16:22:21.732690 IP 100.64.4.15.ssh > 100.64.4.241.53974: Flags [S.E], seq 127563325, ack 3993998027, win 28960, options [mss 1460,sackOK,TS val 3847088517 ecr 2836621081,nop,wscale 7], length 0

In this case the LAN subnet is 100.64.4.0/24. The EdgeOS router is at 100.64.4.1. The VPN client got 100.64.4.241. The device the VPN client is trying to SSH into is 100.64.4.15. You can see the VPN client start the SSH TCP session with 100.64.4.15 in the first packet captured. A who-has for 100.64.4.241 goes out, and there is an is-at reply saying that 100.64.4.241 is at 74:ac:b9:d4:00:13. This is the MAC address of the LAN interface on the Edgerouter, so both 100.64.4.1 and 100.64.4.241 resolve to the same MAC address. After the “is-at” packet goes out, the SSH session starts.

VyOS is not sending out the is-at reply telling devices where to find the VPN client.

Tim

It’s not up to the router to answer ARP requests for other hosts within the same subnet. Hence, you need proxy ARP.

Yes… My workaround for this bug was to create a bridge interface with a separate subnet and assign VPN clients addresses out of it. This works because it avoids the whole ARP process. As long as the devices the VPN client is trying to reach have a default route to the VyOS router, the client can reach the devices on the LAN side.

So I wouldn’t say this is “fixed” as the bug still exists.

So why does this work on the Edgerouter that is configured exactly the same way as the VyOS router? Why does the ER respond to the ARP request for the VPN client when the VyOS box does not?

I was on my phone before, but what you describe and what the capture shows is exactly the situation proxy ARP is for. If you do something like:
set interfaces ethernet eth1 ip enable-proxy-arp
your problem will probably go away. Just replace eth1 with the LAN interface.

The edgerouter probably enables proxyarp by default in this setup.
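For anyone wondering why that one setting changes things, here is a simplified Python model of the proxy ARP decision as I understand it: with proxy ARP on, the router answers a who-has on an interface when its best route to the target leaves through a different interface. This is a sketch, not the kernel's actual logic, and the interface name vpn0 and the /32 host route are assumptions for illustration.

```python
import ipaddress

def should_proxy_reply(target_ip, in_iface, routes, proxy_arp_enabled):
    """Simplified model: answer an ARP request seen on in_iface on behalf of
    target_ip only if proxy ARP is enabled and the most specific route to
    target_ip leaves through a different interface."""
    if not proxy_arp_enabled:
        return False
    target = ipaddress.ip_address(target_ip)
    matches = [(ipaddress.ip_network(prefix), dev)
               for prefix, dev in routes
               if target in ipaddress.ip_network(prefix)]
    if not matches:
        return False
    # Longest-prefix match picks the outgoing interface
    _, out_iface = max(matches, key=lambda m: m[0].prefixlen)
    return out_iface != in_iface

routes = [
    ("172.16.250.0/24", "eth1"),    # LAN, directly connected
    ("172.16.250.240/32", "vpn0"),  # host route to the VPN client (hypothetical name)
]

# Who-has for the VPN client arriving on the LAN interface
print(should_proxy_reply("172.16.250.240", "eth1", routes, proxy_arp_enabled=False))
print(should_proxy_reply("172.16.250.240", "eth1", routes, proxy_arp_enabled=True))
```

A who-has for an ordinary LAN host such as 172.16.250.20 would still go unanswered by the router in this model, since the route to it points back out the same interface.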

@roedie That did it! Thanks much.

Yes. My EdgeOS config does not have the enable-proxy-arp in it. So I guess it defaults to this.

With this, I would consider this closed.

Thanks again
