I’m currently facing an issue where a couple of devices on my LAN are unable to traverse the CG-NAT provided by Starlink. To resolve this, I need to set up a secondary gateway on my LAN with the IP address 10.0.0.2. This gateway should route all traffic through a Linode instance running VyOS (I can do Docker or bare-VM, but was leaning towards Docker), which will act as the public-facing endpoint.
My goal is to ensure that inbound and outbound traffic bypasses the CG-NAT with a public IP, allowing seamless communication for these devices using WireGuard.
Are there any comprehensive guides or examples available that would walk me through setting up VyOS on both ends for this site-to-site VPN configuration?
Have you looked into any overlay solutions like NetBird, Tailscale, Nebula, etc.? They pretty much all implement some kind of STUN-like capability to get around tricky NAT while maintaining direct site-to-site connections.
I was under the impression that these would not offer a gateway option. For example, I have to give a couple of trouble devices static DHCP assignments with the same network configuration (subnet, DNS, etc.) except for a different gateway (10.0.0.2 instead of the normal 10.0.0.1). This is not about privacy or security, just about bypassing the CG-NAT headache.
The goal and what I am going to do is to utilize both the normal and an additional gateway on my network.
Currently 10.0.0.1 goes over Starlink with a CG-NAT. This is not going to change - I am not going to move to their static IP solution since the cost is impossible to afford - and Starlink is literally the only ISP option available here.
Therefore I need a second gateway on the network that will tunnel traffic for whichever devices on my LAN are set to use it. I do not want all devices to use this gateway, just the ones I determine should use it.
I am dealing with some very, very “dumb” old devices that are unable to accept static IP assignment via their firmware. These are very expensive and very inflexible machines and the manufacturers will not do anything other than accept my cooperation to make this work. They have to receive a basic TCP/IP v4 IP setup of the IP, subnet, DNS and of course router/gateway.
Therefore 10.0.0.2 will be VyOS (or something else, if that is possible?) that tunnels to a VPN endpoint server on Linode with a public IP. All traffic that goes through 10.0.0.2 will leave from and arrive at the Linode endpoint, tunneling back to the LAN gateway.
I’m sorry if this did not seem clear initially, does this help to clarify it for you?
Yeah I think I understand what you’re trying to do. Are you saying you’re currently only using Starlink, and not VyOS? And you’re looking to connect VyOS off of Starlink?
If that’s the case, all you need is another default route towards that other node (in addition to Starlink), and then you can use policy based routing to forward traffic over whichever path you want. You can use those overlay services for that if you want. I mention how to do that with NetBird towards the end of this thread: https://forum.vyos.io/t/article-using-netbird-for-site-to-site-routing-on-vyos/14747/5
You’d add a second default route in a different table via WireGuard or something, and just selectively forward traffic via Policy Based Routing. Just make sure you do static mappings for the DHCP leases of those devices. You’d need to create a new LAN subnet if Starlink is issuing IPs in the 10.0.0.0/24 subnet. You’d have something like this:
Starlink WAN: 10.0.0.0/24
WireGuard to remote site: <some subnet, can be a /30>
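As a rough illustration of the policy based routing idea above, here is a minimal sketch in VyOS 1.4+ style syntax. The interface names (eth1 for the LAN, wg0 for the tunnel), the table number 100, the policy name TUNNEL-DEVICES, and the device address 192.168.10.50 are all placeholders chosen for the example, not values from this thread. Older releases attach the policy to the interface with a different command, so check the docs for your version.

```shell
# Second default route, kept in its own table, pointing at the WireGuard tunnel
set protocols static table 100 route 0.0.0.0/0 interface wg0

# Match only the devices that should use the tunnel (placeholder address)
set policy route TUNNEL-DEVICES rule 10 source address 192.168.10.50
set policy route TUNNEL-DEVICES rule 10 set table 100

# Apply the policy to traffic arriving on the LAN interface (placeholder name)
set policy route TUNNEL-DEVICES interface eth1
```

Everything that does not match a rule keeps using the main table, so the rest of the LAN still goes out over Starlink.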
So currently I’m using - and will continue to use - a Ubiquiti ER-X for 10.0.0.1. I’m going to set up a secondary VyOS gateway at 10.0.0.2 that is capable of then using WireGuard to connect to a sibling static IP endpoint VyOS on Linode to be the gateway and to “circumvent” the CG-NAT on Starlink. So, technically, the VyOS on the LAN at 10.0.0.2 will use the ER-X at 10.0.0.1 as its gateway since the Starlink Modem is using an ethernet adapter and has its own router functionality turned off or in passthrough mode to the ER-X.
Does your switch (assuming you’re using one since you mentioned multiple devices) support VLANs? It’d be far easier to just have those devices on a different subnet. Also both the ER-X and VyOS can do what you’re trying to do without the other, so there’s not really any benefit to using both. The same device that connects to Starlink (whether it’s the ER-X or VyOS) can handle it all.
No, no VLANs sadly. It is an older setup and must be maintained as-is.
I actually operate another LAN with this same style setup on the same subnet, but it’s an “easier” privacy VPN via pfSense at another site. The two gateway arrangement works rather well and seems to have low/no impact. I just cannot use pfSense in this situation due to some limitations I have, so here I am with what appears to be the best gateway/router option on a Linux kernel…
Gotcha! You can still do it with a single device and PBR if you wanted. Just match the specific IPs of those devices you wanted to use the WireGuard tunnel. That’d be the best way to do it since you wouldn’t be creating a larger failure domain without benefit.
If you still wanted to use both and have a different gateway for some devices, you’d need to update the routers option for a static mapping in a DHCP pool, which I know is supported on VyOS, but not sure if that’d be supported on ER-X. On VyOS it’d look like this:
set service dhcp-server shared-network-name test subnet 10.0.0.0/24 static-mapping test static-mapping-parameters "option routers 10.0.0.2;"
Then you could just create a site-to-site tunnel using WG between that node and something off of linode.
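For the Linode end of that tunnel, a minimal sketch might look like the following, assuming VyOS 1.4+ syntax. The /30 tunnel subnet (10.255.255.0/30), the peer name LAN, the port 51820, the eth0 interface name, and the key strings are placeholders for illustration. No peer address is set for the LAN side here because it sits behind CG-NAT; the Linode node learns the endpoint when the LAN side initiates the handshake.

```shell
# WireGuard interface on the Linode node (tunnel subnet is a placeholder /30)
set interfaces wireguard wg0 address 10.255.255.1/30
set interfaces wireguard wg0 port 51820
set interfaces wireguard wg0 private-key 'LINODE_PRIVATE_KEY_PLACEHOLDER'

# Peer entry for the LAN-side VyOS; accept its tunnel IP and the LAN subnet
set interfaces wireguard wg0 peer LAN public-key 'LAN_PUBLIC_KEY_PLACEHOLDER'
set interfaces wireguard wg0 peer LAN allowed-ips 10.255.255.2/32
set interfaces wireguard wg0 peer LAN allowed-ips 10.0.0.0/24

# Return route to the LAN through the tunnel
set protocols static route 10.0.0.0/24 interface wg0

# Masquerade LAN traffic leaving via the public interface (assumed to be eth0)
set nat source rule 100 outbound-interface name eth0
set nat source rule 100 source address 10.0.0.0/24
set nat source rule 100 translation address masquerade
```

Inbound connections to specific devices would additionally need destination NAT rules on the Linode node, which this sketch does not cover.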
So I’m going to use the ER-X to manage DHCP and assign the gateway so VyOS doesn’t need to worry about serving DHCP at all.
Here’s how I’ll do that on the ER-X. In the EdgeRouter CLI:
configure
set service dhcp-server shared-network-name LAN subnet 10.0.0.0/24 static-mapping system-1 static-mapping-parameters "option routers 10.0.0.2;"
commit
save
exit
You could certainly use something like NetBird for this. You’d just make the host in Linode be the exit node. It is kind of the easy button for setting it up, but if you’re only going to have a single site-to-site tunnel, it may be overkill for you considering you can just configure a single WireGuard tunnel. If you are going to have additional nodes later that need to leverage WireGuard (other site-to-site nodes or Remote Access nodes), then NetBird could give you a scalable solution.
The biggest benefit to something like NetBird is the STUN function when you have tricky NAT solutions. For instance I have a site-to-site VPN between 2 hosts that are each behind CG-NAT. This would be impossible without some kind of STUN service.
Thank you. I do like the idea of having flexibility down the road for other scenarios. One reason I wanted to go with a pfSense like solution, but too many barriers with other constraints to make it happen.
Are you aware of a more robust/comprehensive guide anywhere to kinda walk through the entire process since VyOS is new to me? It does seem like it should be simple, I just don’t want to hit any gotchas to hang me up once I jump onto the path and have a kind of deadline to implement this.
If you just want to do a manual WireGuard tunnel, then you may have to piece together different sections of VyOS’ documentation. A generic summary of the steps on VyOS would be:
Connect an interface to Starlink’s LAN
Add 10.0.0.2 IP to the interface
Create a static default to 10.0.0.1
Generate WireGuard keys
Build WireGuard config for site-to-site full tunnel (allowed-ip 0.0.0.0/0)
Create 2 static routes to the Linode WireGuard IP (0.0.0.0/1 and 128.0.0.0/1). These 2 routes will be more specific than the default to 10.0.0.1.
You may need to create a static host route to the public IP of Linode via 10.0.0.1, so the tunnel’s own packets are not routed into the tunnel.
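The steps above could be sketched roughly as follows on the LAN-side VyOS, assuming 1.4+ syntax. The interface name eth0, the tunnel subnet 10.255.255.0/30, the peer name LINODE, the port 51820, the key strings, and 203.0.113.10 (a documentation address standing in for the Linode public IP) are all placeholders rather than values from this thread.

```shell
# LAN-facing interface and the normal default route via the ER-X
set interfaces ethernet eth0 address 10.0.0.2/24
set protocols static route 0.0.0.0/0 next-hop 10.0.0.1

# Keys are generated in operational mode with: generate pki wireguard key-pair
set interfaces wireguard wg0 address 10.255.255.2/30
set interfaces wireguard wg0 private-key 'LAN_PRIVATE_KEY_PLACEHOLDER'

# Full-tunnel peer pointing at the Linode node (placeholder public IP)
set interfaces wireguard wg0 peer LINODE public-key 'LINODE_PUBLIC_KEY_PLACEHOLDER'
set interfaces wireguard wg0 peer LINODE address 203.0.113.10
set interfaces wireguard wg0 peer LINODE port 51820
set interfaces wireguard wg0 peer LINODE allowed-ips 0.0.0.0/0
# Keepalive helps hold the CG-NAT mapping open from the inside
set interfaces wireguard wg0 peer LINODE persistent-keepalive 25

# Two /1 routes together cover all of IPv4 and beat the /0 default
set protocols static route 0.0.0.0/1 next-hop 10.255.255.1
set protocols static route 128.0.0.0/1 next-hop 10.255.255.1

# Host route so the encrypted tunnel packets still leave via the ER-X
set protocols static route 203.0.113.10/32 next-hop 10.0.0.1
```

Using two /1 routes instead of replacing the default keeps the original route to 10.0.0.1 in place, which the /32 host route relies on to reach the Linode endpoint.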
If you do a lot of CLI config on the ER, then most of that won’t be too foreign to you. There are only minor syntactical differences between the two. One thing to keep in mind is that VyOS doesn’t support WireGuard peers based on FQDN, so if the Linode IP may change frequently, that could be another reason for NetBird.
EDIT:
One more thing to note, VyOS is very lightweight so building all of this in a lab would be pretty easy if you wanted to iron out everything before actually trying to implement it.
Thank you - I will work on ingesting this. Good news is that with this being a second gateway, there’s more-or-less no downtime potential here. It might only happen if I fat-finger something on ER-X, which is super unlikely. Lol… I’m definitely going for simplicity where possible to make this easy to maintain and update, but simultaneously having the flexibility in place to do more later.
In your initial article you are using the VyOS Docker facility to host the NetBird container. If I have VyOS running on Ubuntu 24.04 via Docker, how would you suggest I run NetBird as a sibling thereof in terms of modification of your instructions?
Assuming you’re doing host networking for the VyOS docker, you’d just run the NetBird docker in Ubuntu, also with host networking. That’d give the VyOS docker access to NetBird.
I haven’t tested it with how you’re doing it, so there could be oddities there you may need to work through.
Been having a rough time following your guide in conjunction with getting this to work with the vyos-1.5-rolling-202407*-amd64.iso releases of VyOS on Linode. I thought I’d simplify the situation a bit by just putting it on Linode directly and following your guide as closely as possible. Linode’s Custom Linux Distribution Guide is pretty straightforward and VyOS seems to work pretty much out of the box following this (excluding putting it onto its own ext4 volume to maximize Linode features).
However, I’m running into some walls that seem to perhaps be specific to 1.5 versus earlier versions. I had to modify the static IP assignment a bit from other instructions. Then when I got to the NetBird Docker steps, I got the image, but the steps just don’t work out and I ultimately bump into the error ‘Container image for "nb1" is mandatory!’ while various commit/save attempts don’t work.
Any chance you could review and modify your guides to work on more recent releases of VyOS and in conjunction with being used properly with Linode?
The main difference between 1.4 and 1.5 for containers will be the cap-add section, which is called capability in 1.5. Everything else will work the same between versions. From what you’re saying, it sounds like you are either not putting the image command in, or it is not formatted correctly. Can you show what you’re trying to put in for the config and any errors you’re seeing?
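For comparison, a minimal sketch of a container definition using the 1.5 naming might look like this. The image name netbirdio/netbird:latest and the choice of net-admin are assumptions for illustration and may not match the guide exactly; nb1 is the container name from the error message above. The point is that the image line has to be present under the same container name before commit.

```shell
# Operational mode: pull the image first (prefix with "run" from config mode)
add container image netbirdio/netbird:latest

# Configuration mode: the image line is what the "mandatory" error refers to
set container name nb1 image netbirdio/netbird:latest
set container name nb1 allow-host-networks
# On 1.4 this node is called cap-add instead of capability
set container name nb1 capability net-admin
```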