Routing question: how to use multiple ISP firewalls through a transit network

Hello there VyOS gurus,

I’ve been struggling with this question for a while now, so I thought I’d see whether there is an answer to be found in this forum. I can’t imagine I’m the only one trying to do this.

So here goes,
We have multiple ISPs. Each ISP has its own VyOS firewall, connected to both the public network and the transit network.

The transit network has a number of VyOS router clusters connected to it, each representing a separate group. Each cluster is connected to transit and to its own vnetX. In the vNet there is a mail server (or some other service), shown as MX in the attached drawing. Routes in the transit network are distributed through a simple OSPF setup (single area 0.0.0.0).

In DNS we point MX1, MX2 and MX3 each at a different internet connection / firewall, and each firewall NATs that through to the MX in the vNet. Unfortunately, traffic that comes in through FW2 or FW3 never gets a reply: the outbound route goes through FW1, so the return traffic does not go back out through the firewall it came in on.

So my question is: how do I get this to work? What am I missing here?

I’ve sketched up a diagram to illustrate the above. Any suggestions would be appreciated. Thanks!

An excellent example of why NAT is evil. Don’t do that. NAT is not compatible with such a distributed architecture. Consider two simultaneous connections:

S → MX1
S → MX2

Let R1 be the router at 172.16.1.11
Let R2 be the router at 172.16.1.21
Let T1 be the router at 172.16.1.101 / 10.1.1.1
Let M be the mail server at 10.1.1.10, which is the NAT target of both MX1 and MX2.

You will have two independent sets of packets flowing thru T1 with source ip = S and destination ip = 10.1.1.10.

You will also have two independent reply streams flowing thru T1 with source ip = 10.1.1.10 and destination ip = S. How is T1 supposed to decide which of those reply packets should be sent to R1 and which should be sent to R2? There is nothing in those reply packets that can help make that decision.

It is even worse. Source S could have used the same source port number for those two connections. Suppose we represent TCP connections as (source ip, source port, destination ip, destination port). Then from the viewpoint of S, we have two outbound mail connections, (S, p1, MX1, 25) and (S, p1, MX2, 25). Those are NATed by R1 and R2, so at T1 and M they are seen as (S, p1, 10.1.1.10, 25) and (S, p1, 10.1.1.10, 25). The two connections are now indistinguishable, so they conflict, and M will (should) send a reset packet and kill the connection.
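
To make the collision concrete, here is a minimal Python sketch of the 4-tuple argument above. It is only an illustration: the addresses for S, MX1 and MX2 and the port number p1 are hypothetical placeholders (documentation ranges), and only the mail server address 10.1.1.10 comes from the thread.

```python
from typing import NamedTuple


class Conn(NamedTuple):
    """A TCP connection identified by its 4-tuple."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Hypothetical placeholder addresses, not taken from the thread.
S = "192.0.2.50"       # external sender
MX1 = "198.51.100.1"   # public address NATed by R1
MX2 = "203.0.113.1"    # public address NATed by R2
M = "10.1.1.10"        # mail server (from the thread)

# Both public addresses are destination-NATed to the same internal host.
DNAT = {MX1: M, MX2: M}


def dnat(conn: Conn) -> Conn:
    """Rewrite the destination address the way R1/R2 would."""
    return conn._replace(dst_ip=DNAT.get(conn.dst_ip, conn.dst_ip))


p1 = 40000  # assumed: S reuses the same source port for both connections
at_source = [Conn(S, p1, MX1, 25), Conn(S, p1, MX2, 25)]
at_mail_server = [dnat(c) for c in at_source]

distinct_at_source = len(set(at_source))
distinct_at_mail_server = len(set(at_mail_server))
print(distinct_at_source, distinct_at_mail_server)  # prints: 2 1
```

From the viewpoint of S there are two distinct connections, but after the destination rewrite M and T1 see a single 4-tuple, so neither of them can tell the two flows apart.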

Your problem is also asymmetric routing across your three firewalls. If you want stateful filtering or NAT in those firewalls, the outbound and inbound traffic must flow through the same device.

If you know which firewall / ISP an inbound connection to a given vnetX will come through, you should route that vnetX’s outbound traffic through the same device.
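
As a rough sketch of that idea, the Python below picks the outbound firewall from the source address of the packet instead of from the destination. This is not VyOS configuration (on VyOS this kind of decision would normally be expressed with policy-based routing), and the mapping is an assumption made for illustration: the /24 prefix lengths, the second vnet prefix and the choice of default are hypothetical. Only 172.16.1.11, 172.16.1.21 and 10.1.1.10 appear in the thread.

```python
import ipaddress

# Assumed mapping: each vnet is published through exactly one firewall,
# so replies from that vnet are sent back to that same firewall.
POLICY = [
    (ipaddress.ip_network("10.1.1.0/24"), "172.16.1.11"),  # vnet1 -> R1
    (ipaddress.ip_network("10.1.2.0/24"), "172.16.1.21"),  # hypothetical vnet2 -> R2
]
DEFAULT_NEXT_HOP = "172.16.1.11"  # assumed fallback for unlisted sources


def next_hop_for(src_ip: str) -> str:
    """Choose the firewall next hop based on the packet's source address."""
    addr = ipaddress.ip_address(src_ip)
    for network, next_hop in POLICY:
        if addr in network:
            return next_hop
    return DEFAULT_NEXT_HOP


print(next_hop_for("10.1.1.10"))  # mail server in vnet1 -> 172.16.1.11
print(next_hop_for("10.1.2.10"))  # host in the hypothetical vnet2 -> 172.16.1.21
```

Note that this only works when each vnet is reachable through a single firewall. It does not help when the same server is published through several firewalls at once, which is the case described in the original question.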