Routing & NAT on Azure peered VNETs

I’m having a lot of trouble with routing on Azure. I’m trying to follow the “best practice” architecture for an internal corporate network using distinct VNETs for separate workloads, with a central “hub” VNET through which all traffic is routed. Each spoke network is peered to the central hub.

Azure has some services for this: a BGP route server, a firewall service, etc. These services are fairly expensive, however, and I would like to see if this can be achieved with VyOS. The trouble is that Azure requires use of DHCP for interfaces and inserts its own services between each peered network.

So in my setup, I’m using the following VNETs:

vnet-hub: 10.0.0.0/24

  • subnet-wan: 10.0.0.64/26
    • nic-vyos-wan (eth0): public IP and private dhcp IP: static 10.0.0.70
  • subnet-lan: 10.0.0.0/26
    • nic-vyos-lan (eth1): private dhcp IP: static 10.0.0.10

vnet-test: 10.0.1.0/24

  • subnet-test: 10.0.1.0/24
    • nic-pc: private dhcp IP: static 10.0.1.10

Both nic-vyos-* are connected to vm-vyos and nic-pc is connected to vm-pc.
I plan to assign future VNETs unused blocks from 10.0.0.0/16.

I have 2 subnets in the hub so I can simulate a “WAN” and “LAN” NIC on the VyOS router and prevent direct connectivity between the “WAN” NIC and other IPs in the corporate network. VyOS is configured to use DHCP for both NICs (e.g. VyOS sees the LAN network as 10.0.0.0/26). In Azure I have set User Defined Routes for vnet-test:

0.0.0.0/0 -> 10.0.0.10
10.0.0.0/24 -> 10.0.0.10
10.0.1.0/24 -> 10.0.0.10

VyOS is configured with static routes:

protocols {
    static {
        route 0.0.0.0/0 {
            dhcp-interface eth0
        }
        route 10.0.1.0/24 {
            next-hop 10.0.0.1 {
                interface eth1
            }
        }
    }
}

As you can see, to reach the peered VNET from vnet-hub I have to use the subnet gateway as the next hop: 10.0.0.1 from subnet-lan and 10.0.0.65 from subnet-wan (both are provided by Azure DHCP). I am not using DHCP for route 10.0.1.0/24 because Azure DHCP doesn’t advertise that route without the Azure Route Server BGP service, which is nearly $300/mo.
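For anyone wondering where 10.0.0.1 and 10.0.0.65 come from: Azure reserves the first usable address of every subnet as the default gateway. A minimal Python sketch that derives those next hops from the subnets listed above (the function name `azure_default_gateway` is made up for illustration):

```python
import ipaddress


def azure_default_gateway(subnet_cidr: str) -> ipaddress.IPv4Address:
    """Return the first usable address of a subnet.

    Azure reserves this address (x.x.x.1 for a /24, or network + 1 in
    general) as the subnet's default gateway.
    """
    network = ipaddress.ip_network(subnet_cidr)
    return network[1]


# Subnets taken from the topology in this post.
subnets = {
    "subnet-lan": "10.0.0.0/26",
    "subnet-wan": "10.0.0.64/26",
    "subnet-test": "10.0.1.0/24",
}

for name, cidr in subnets.items():
    print(f"{name}: next hop {azure_default_gateway(cidr)}")
```

This only computes the address; it does not check that the gateway actually forwards anything, which still depends on UDRs, NSGs and the NIC settings.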

So far I can reach nic-pc from vm-vyos but I am having issues with SNAT from source block 10.0.1.0/24. I want to be able to masquerade all VNET internet traffic to nic-vyos-wan and intranet traffic to nic-vyos-lan. vm-pc does not currently have internet connectivity:

 nat {
     source {
         rule 100 {
             log
             outbound-interface any
             source {
                 address 10.0.1.0/24
             }
             translation {
                 address masquerade
             }
         }
     }
 }
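To make the intent of this rule concrete, here is a rough Python model of what masquerade with `outbound-interface any` should do: pick the egress interface by longest-prefix match, then replace the source with that interface's address. This is a toy model under my own assumptions, not how VyOS or netfilter is implemented; the destination addresses 203.0.113.5 and 10.0.0.20 are hypothetical examples.

```python
import ipaddress

# Interface addresses from the topology in this post.
INTERFACES = {"eth0": "10.0.0.70", "eth1": "10.0.0.10"}

# Simplified routing table: connected subnets plus the static routes above.
ROUTES = [
    ("0.0.0.0/0", "eth0"),
    ("10.0.0.64/26", "eth0"),
    ("10.0.0.0/26", "eth1"),
    ("10.0.1.0/24", "eth1"),
]


def egress_interface(dst: str) -> str:
    """Choose the outgoing interface by longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(prefix), iface)
        for prefix, iface in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def masquerade(src: str, dst: str, snat_source: str = "10.0.1.0/24"):
    """Return (egress interface, source address after SNAT)."""
    iface = egress_interface(dst)
    if ipaddress.ip_address(src) in ipaddress.ip_network(snat_source):
        # Masquerade uses the address of the interface the packet leaves on.
        return iface, INTERFACES[iface]
    return iface, src


print(masquerade("10.0.1.10", "203.0.113.5"))  # internet-bound traffic
print(masquerade("10.0.1.10", "10.0.0.20"))    # intranet traffic to subnet-lan
```

In this model internet-bound traffic leaves with the WAN address and hub-bound traffic with the LAN address, which is the behaviour I am after.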

I have seen some kernel logs saying that 10.0.1.10 is a “martian” address. Is this happening because VyOS doesn’t know it should be able to route traffic for 10.0.1.0/24? My initial thought is that this setup maybe requires use of Dummy interfaces, so I tried one for 10.0.1.0/24 but that installs a “C” (connected) route that overrides my static route.

I understand this is a very detailed question and Azure networking is ridiculous, but I am posting after struggling for many days with this.

By the way, the reason I have separate Azure UDR routes in vnet-test for the 10.0.0.0/24 and 10.0.1.0/24 blocks, instead of simply the default route, is that peering creates implicit routes for peered VNETs unless those blocks are overridden.
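A small Python sketch of that reasoning, assuming Azure picks the longest matching prefix and, for equal prefixes, prefers user-defined routes over BGP routes over system routes. The route labels ("VNet peering" and so on) are only illustrative, and this is a simplified model rather than Azure's actual implementation.

```python
import ipaddress

# Tie-break for equal prefixes: lower number wins.
SOURCE_PRIORITY = {"user": 0, "bgp": 1, "system": 2}

# System routes Azure creates for subnet-test (simplified).
SYSTEM_ROUTES = [
    ("10.0.1.0/24", "system", "VNet local"),
    ("10.0.0.0/24", "system", "VNet peering"),
    ("0.0.0.0/0", "system", "Internet"),
]

# The three UDRs from this post, all pointing at nic-vyos-lan.
USER_ROUTES = [
    ("0.0.0.0/0", "user", "10.0.0.10"),
    ("10.0.0.0/24", "user", "10.0.0.10"),
    ("10.0.1.0/24", "user", "10.0.0.10"),
]


def select_route(dst: str, routes) -> str:
    """Return the next hop chosen for dst under the assumed rules."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(prefix), source, next_hop)
        for prefix, source, next_hop in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    # Longest prefix first, then route source priority.
    best = min(candidates, key=lambda r: (-r[0].prefixlen, SOURCE_PRIORITY[r[1]]))
    return best[2]


default_only = SYSTEM_ROUTES + USER_ROUTES[:1]
all_udrs = SYSTEM_ROUTES + USER_ROUTES

print(select_route("10.0.0.20", default_only))  # peering route still wins
print(select_route("10.0.0.20", all_udrs))      # UDR overrides the peering route
```

With only the default UDR, the more specific peering route still carries hub traffic directly, so it bypasses the router; adding UDRs with the same prefixes is what forces it through 10.0.0.10.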

OK. My primary issue was with the Azure NSGs. It looks like the rules filter every packet as it leaves or enters any NIC in the VNET. With allow-all inbound and outbound rules on the Azure NSG, I have an SNAT configuration working in OPNsense.

I’m gonna try it again with VyOS, but has anybody tried a VyOS NAT that rewrites the source IP on response traffic to the router’s internal NIC? I would prefer not to have an Allow-all inbound NSG rule, in case a VM is accidentally assigned a public IP address (the default).

I see some talk online about this type of custom translation with Cisco or iptables, but it looks like the approach requires policy based routing. Here’s what I’ve found so far:

I am not a network engineer, so I would really appreciate some tips if you have better resources on this type of source IP translation (is there a standard name for it?) or thoughts about the policy based routing approach.

Great solution! I agree about the cost of solutions like Azure gateway or Azure Route Server, although when it comes to networking, Azure is the most powerful cloud. Based on your last comment, this seems to be related to a limitation of NVAs on Azure: IP forwarding needs to be enabled on the interfaces.

Enable it on all interfaces attached to the VyOS VM.

The fixed costs are too large. I haven’t compared the incremental costs to scaling a bare-metal network but when the time comes I’ll reconsider it. To get firewall on Azure is more $$$ of fixed costs and incremental data processing cost. That’s using vanilla azure. To integrate it with a nice commercial UI product the price for even a tiny cloud network will easily exceed $1k/mo.

Anyway I’m not here for people to tell me what commercial products are worth, I’m here to see what OSS can do.

You are correct about the Azure NICs requiring the IP forwarding setting, but I already had that. As I said in my last post, the problem was the Network Security Group on the Azure VNET. I have SNAT working now and I’m asking if there’s a way to rewrite source addresses for packets as they come back into the network as responses to an established SNATted session.

If I understand it properly, it should work by changing the SNAT masquerade interface:

 nat {
     source {
         rule 100 {
             log
             outbound-interface eth0
             source {
                 address 10.0.1.0/24
             }
             translation {
                 address masquerade
             }
         }
     }
 }

So any packet received from the LAN has its source address rewritten to the WAN address (10.0.0.70). Or, if the idea is to use an IP address that is not present on any interface, you can create a dummy interface and use its address for the translation:

set interfaces dummy dum0 address '172.29.41.89/32'

set nat source rule 110 description 'Internal TEST'
set nat source rule 110 destination address '172.27.1.0/24'
set nat source rule 110 outbound-interface 'any'
set nat source rule 110 source address '192.168.43.0/24'
set nat source rule 110 translation address '172.29.41.89'

However, if the idea is only to use the WAN interface to reach the internet, masquerade would be enough.