AWS-EdgeRouter VPN tunnel: uni-directional traffic

We are having an issue routing traffic over an IPsec VPN tunnel between an EdgeRouter Pro v2.0.6 and AWS. Traffic from AWS reaches the on-premise instance, but return traffic never makes it back to AWS. A new connection attempt from on-premise never makes it to AWS at all.

Local site: 192.168.150.0/23, 10.0.0.0/24, 192.168.1.0/24

Amazon AWS: 172.31.0.0/16

Here is what I observe:

  • I have Wireshark running on both the on-premise instance 192.168.150.6 and the AWS instance 172.31.43.22. When I ping from the AWS instance to the on-premise instance, I see the ICMP echo requests reaching on-premise and the ICMP echo replies being sent back. However, the echo replies never show up on the AWS instance.
  • When I try to ping from the on-premise instance to AWS instance, I do not see any ICMP traffic make it to AWS at all.

Here is a capture from the UBNT EdgeRouter when I attempt to ping from on-premise to AWS:

ubnt@ubnt:~$ tail -f /var/log/messages | grep 172.31
Mar 18 22:05:48 ubnt kernel: [eth1.inbound-1-A]IN=eth1 OUT=eth0 MAC=f0:9f:c2:1b:e1:90:00:0c:29:a8:c8:8f:08:00 SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=48128 DF PROTO=ICMP TYPE=8 CODE=0 ID=19603 SEQ=6
Mar 18 22:05:49 ubnt kernel: [eth1.inbound-1-A]IN=eth1 OUT=eth0 MAC=f0:9f:c2:1b:e1:90:00:0c:29:a8:c8:8f:08:00 SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=48358 DF PROTO=ICMP TYPE=8 CODE=0 ID=19603 SEQ=7
Mar 18 22:05:50 ubnt kernel: [eth1.inbound-1-A]IN=eth1 OUT=eth0 MAC=f0:9f:c2:1b:e1:90:00:0c:29:a8:c8:8f:08:00 SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=48550 DF PROTO=ICMP TYPE=8 CODE=0 ID=19603 SEQ=8
Mar 18 22:05:52 ubnt kernel: [eth1.inbound-1-A]IN=eth1 OUT=eth0 MAC=f0:9f:c2:1b:e1:90:00:0c:29:6f:88:e8:08:00 SRC=192.168.150.6 DST=172.31.43.22 LEN=60 TOS=0x00 PREC=0x00 TTL=127 ID=26647 PROTO=ICMP TYPE=8 CODE=0 ID=1 SEQ=7627
Mar 18 22:05:52 ubnt kernel: [eth1.inbound-1-A]IN=eth1 OUT=eth0 MAC=f0:9f:c2:1b:e1:90:00:0c:29:a8:c8:8f:08:00 SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=48981 DF PROTO=ICMP TYPE=8 CODE=0 ID=19605 SEQ=1
Mar 18 22:05:52 ubnt kernel: [NAT-5002-EXCLUDE] IN= OUT=eth0 SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=48981 DF PROTO=ICMP TYPE=8 CODE=0 ID=19605 SEQ=1
Mar 18 22:05:53 ubnt kernel: [eth1.inbound-1-A]IN=eth1 OUT=eth0 MAC=f0:9f:c2:1b:e1:90:00:0c:29:a8:c8:8f:08:00 SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=48983 DF PROTO=ICMP TYPE=8 CODE=0 ID=19605 SEQ=2

As you can see, ‘NAT-5002-EXCLUDE’ indicates that the VPN traffic is being excluded from NAT.

However, the following log lines indicate that the traffic is then being routed out of eth0 (rather than vti0, which is where I would expect it to go in order to cross the IPsec tunnel):

Mar 18 22:05:52 ubnt kernel: [NAT-5002-EXCLUDE] IN= OUT=eth0 SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=48981 DF PROTO=ICMP TYPE=8 CODE=0 ID=19605 SEQ=1

Mar 18 22:05:53 ubnt kernel: [eth1.inbound-1-A]IN=eth1 OUT=eth0 MAC=f0:9f:c2:1b:e1:90:00:0c:29:a8:c8:8f:08:00 SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=48983 DF PROTO=ICMP TYPE=8 CODE=0 ID=19605 SEQ=2

Mar 18 22:05:54 ubnt kernel: [eth1.inbound-1-A]IN=eth1 OUT=eth0 MAC=f0:9f:c2:1b:e1:90:00:0c:29:a8:c8:8f:08:00 SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=49217 DF PROTO=ICMP TYPE=8 CODE=0 ID=19605 SEQ=3
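To compare many of these log lines at once, it can help to split each one into its rule tag and KEY=value fields. The following is a minimal Python sketch; the function name parse_fw_log is made up for illustration, and it assumes the log format shown above (a bracketed tag followed by space-separated fields).

```python
import re

# Matches the bracketed rule tag, then everything after it (the KEY=value fields).
LOG_RE = re.compile(r"\[(?P<tag>[^\]]+)\]\s*(?P<fields>.*)$")

def parse_fw_log(line):
    """Return (tag, fields) for one kernel firewall/NAT log line, or None."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    fields = {}
    for token in m.group("fields").split():
        key, sep, value = token.partition("=")
        if sep:  # skip bare flags such as "DF"
            # ID= appears twice (IP ID, then ICMP ID); keep the first one.
            fields.setdefault(key, value)
    return m.group("tag"), fields

# One of the log lines quoted above.
sample = ("Mar 18 22:05:52 ubnt kernel: [NAT-5002-EXCLUDE] IN= OUT=eth0 "
          "SRC=192.168.150.14 DST=172.31.43.22 LEN=84 TOS=0x00 PREC=0x00 "
          "TTL=63 ID=48981 DF PROTO=ICMP TYPE=8 CODE=0 ID=19605 SEQ=1")

tag, fields = parse_fw_log(sample)
print(tag, fields["OUT"], fields["SRC"], fields["DST"])
```

For the sample line this reports the tag NAT-5002-EXCLUDE with OUT=eth0, which is the field in question.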

I have configured the NAT exclude on the inbound eth1 interface of the EdgeRouter and expect that the router will find an installed route to AWS over the IPsec VTI interface:

ubnt@ubnt:~$ show ip route
Codes: K - kernel, C - connected, S - static, R - RIP, B - BGP
O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
> - selected route, * - FIB route, p - stale info

IP Route Table for VRF “default”
S *> 0.0.0.0/0 [210/0] via 192.168.1.1, eth0
C *> 10.0.0.0/24 is directly connected, eth2
C *> 10.111.128.0/21 is directly connected, eth1.111
C *> 71.163.106.2/32 is directly connected, eth1.200
C *> 127.0.0.0/8 is directly connected, lo
C *> 169.254.87.100/30 is directly connected, vti0
B *> 172.31.0.0/16 [20/100] via 169.254.87.101, vti0, 05:26:06
C *> 192.168.1.0/24 is directly connected, eth0
C *> 192.168.110.0/24 is directly connected, eth1.110
C *> 192.168.112.0/24 is directly connected, eth1.112
C *> 192.168.113.0/24 is directly connected, eth1.113
C *> 192.168.120.0/24 is directly connected, eth1.120
C *> 192.168.130.0/24 is directly connected, eth1.130
C *> 192.168.140.0/24 is directly connected, eth1.140
C *> 192.168.150.0/24 is directly connected, eth1

What is going on here? Why is the AWS-bound traffic not leaving through the tunnel?

Configuration:

  1. I have followed the instructions here to configure a VTI-based tunnel with BGP for dynamic exchange of routes between the two sites:
    https://help.ui.com/hc/en-us/articles/115016128008

  2. Since this is a new configuration, I have only configured a single VPN tunnel for now (AWS provides 2 VPN endpoints but I wanted to get one tunnel functional before building out the second tunnel).

  3. The EdgeRouter is behind a NAT configured as a DMZ host.

  4. The IPsec tunnel is established successfully.
    ubnt@ubnt:~$ show vpn ipsec sa
    peer-34.234.25.246-tunnel-vti: #1, ESTABLISHED, IKEv1, 71eb1ee6aad567a3_i* 52e8162039ad61a2_r
    local ‘192.168.1.179’ @ 192.168.1.179[4500]
    remote ‘34.234.25.246’ @ 34.234.25.246[4500]
    AES_CBC-128/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024
    established 11065s ago, reauth in 17179s
    peer-34.234.25.246-tunnel-vti: #5, reqid 1, REKEYED, TUNNEL-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96/MODP_1024
    installed 3014s ago, rekeying in -182s, expires in 586s
    in cb8eea25 (0x00900002), 34913 bytes, 568 packets
    out c9b88aad (0x00900002), 62053 bytes, 752 packets
    local 192.168.150.0/24
    remote 172.31.0.0/16
    peer-34.234.25.246-tunnel-vti: #6, reqid 1, INSTALLED, TUNNEL-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96/MODP_1024
    installed 182s ago, rekeying in 2518s, expires in 3418s
    in c02e9e9f (0x00900002), 2214 bytes, 36 packets
    out cad9f8c0 (0x00900002), 3689 bytes, 46 packets
    local 192.168.150.0/24
    remote 172.31.0.0/16

  5. The BGP peering is established, and routes are being exchanged over the VPN tunnel between AWS and the on-site EdgeRouter.

ubnt@ubnt:~$ show ip bgp summary
BGP router identifier 192.168.150.1, local AS number 65000
BGP table version is 8
2 BGP AS-PATH entries
0 BGP community entries

Neighbor V AS MsgRcv MsgSen TblVer InQ OutQ Up/Down State/PfxRcd
169.254.87.101 4 64512 2829 2828 8 0 0 05:28:02 1

Total number of neighbors 1

Total number of Established sessions 1
ubnt@ubnt:~$ show ip bgp neighbors
BGP neighbor is 169.254.87.101, remote AS 64512, local AS 65000, external link
BGP version 4, remote router ID 169.254.87.101
BGP state = Established, up for 05:28:11
Last read 05:28:11, hold time is 30, keepalive interval is 10 seconds
Configured hold time is 30, keepalive interval is 10 seconds
Neighbor capabilities:
Route refresh: advertised and received (old and new)
4-Octet ASN Capability: advertised and received
Address family IPv4 Unicast: advertised and received
Received 2829 messages, 1 notifications, 0 in queue
Sent 2828 messages, 1 notifications, 0 in queue
Route refresh request: received 0, sent 0
Minimum time between advertisement runs is 30 seconds
For address family: IPv4 Unicast
BGP table version 8, neighbor version 8
Index 1, Offset 0, Mask 0x2
Inbound soft reconfiguration allowed
Community attribute sent to this neighbor (both)
1 accepted prefixes
3 announced prefixes

Connections established 2; dropped 1
Local host: 169.254.87.102, Local port: 179
Foreign host: 169.254.87.101, Foreign port: 40115
Nexthop: 169.254.87.102
Nexthop global: ::
Nexthop local: ::
BGP connection: non shared network
Last Reset: 05:28:14, due to BGP Notification received
Notification Error Message: (Cease/Connection Rejected.)
ubnt@ubnt:~$

Advertised and received routes over the tunnel:

ubnt@ubnt:~$ show ip bgp neighbors
169.254.87.101
ubnt@ubnt:~$ show ip bgp neighbors 169.254.87.101 advertised-routes
BGP table version is 8, local router ID is 192.168.150.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric    LocPrf       Weight Path

*> 10.0.0.0/24 169.254.87.102 100 32768 i
*> 192.168.1.0 169.254.87.102 100 32768 i
*> 192.168.150.0 169.254.87.102 100 32768 i

Total number of prefixes 3
ubnt@ubnt:~$ show ip bgp neighbors 169.254.87.101 received-routes
BGP table version is 8, local router ID is 192.168.150.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric    LocPrf       Weight Path

*> 172.31.0.0 169.254.87.101 100 0 64512 i

Total number of prefixes 1
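As a cross-check on the return path, here is a small Python sketch that tests whether a given on-premise host falls inside one of the three prefixes advertised to AWS. It assumes the two entries printed without a mask (192.168.1.0 and 192.168.150.0) are /24, which is how a classful display would normally be read. The host 192.168.151.10 is hypothetical; it stands for an address in the upper half of the 192.168.150.0/23 range listed for the local site.

```python
import ipaddress

# Prefixes from the advertised-routes output above. The entries shown
# without a mask are assumed to be /24 (classful display).
advertised = [ipaddress.ip_network(p) for p in
              ("10.0.0.0/24", "192.168.1.0/24", "192.168.150.0/24")]

def is_advertised(host):
    """True if AWS has learned a prefix that covers this on-premise host."""
    addr = ipaddress.ip_address(host)
    return any(addr in net for net in advertised)

# The first two hosts appear in the logs above; the third is hypothetical.
for host in ("192.168.150.6", "192.168.150.14", "192.168.151.10"):
    print(host, is_advertised(host))
```

Under these assumptions both hosts seen in the logs are covered, while an address in 192.168.151.0/24 would not be.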

The output of show ip route for the remote AWS server (172.31.43.22) shows that the next hop is reached over the VTI interface:
ubnt@ubnt:~$ show ip route 172.31.43.22
Routing entry for 172.31.0.0/16
Known via “bgp”, distance 20, metric 100, External Route Tag: 64512, best
Last update 05:29:23 ago

  • 169.254.87.101, via vti0
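For reference, this is what the routing table alone implies. The Python sketch below runs a longest-prefix match over a subset of the routes listed earlier. It only models the main routing table; it does not account for policy routing, firewall rules, or IPsec policy, so it is a sanity check of the expectation rather than a description of what the router actually does.

```python
import ipaddress

# A subset of the "show ip route" table above: (prefix, outgoing interface).
routes = [
    ("0.0.0.0/0", "eth0"),
    ("169.254.87.100/30", "vti0"),
    ("172.31.0.0/16", "vti0"),
    ("192.168.1.0/24", "eth0"),
    ("192.168.150.0/24", "eth1"),
]

def lookup(dst):
    """Return (prefix, interface) of the most specific route covering dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(p), ifc) for p, ifc in routes
               if addr in ipaddress.ip_network(p)]
    # The default route always matches, so matches is never empty here.
    net, ifc = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), ifc

print(lookup("172.31.43.22"))
```

By this table, 172.31.43.22 selects 172.31.0.0/16 via vti0 ahead of the default route via eth0, which is why the OUT=eth0 in the logs is unexpected.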

Here are the commands that I ran to get to this point:
configure
set vpn ipsec auto-firewall-nat-exclude enable
set vpn ipsec ike-group AWS lifetime '28800'
set vpn ipsec ike-group AWS proposal 1 dh-group '2'
set vpn ipsec ike-group AWS proposal 1 encryption 'aes128'
set vpn ipsec ike-group AWS proposal 1 hash 'sha1'
set vpn ipsec site-to-site peer 34.234.25.246 authentication mode 'pre-shared-secret'
set vpn ipsec site-to-site peer 34.234.25.246 authentication pre-shared-secret 'xxxxxx'
set vpn ipsec site-to-site peer 34.234.25.246 description 'VPC tunnel 1'
set vpn ipsec site-to-site peer 34.234.25.246 ike-group 'AWS'
set vpn ipsec site-to-site peer 34.234.25.246 local-address '192.168.1.179'
set vpn ipsec site-to-site peer 34.234.25.246 vti bind 'vti0'
set vpn ipsec site-to-site peer 34.234.25.246 vti esp-group 'AWS'

set vpn ipsec ipsec-interfaces interface 'eth0'
set vpn ipsec esp-group AWS compression 'disable'
set vpn ipsec esp-group AWS lifetime '3600'
set vpn ipsec esp-group AWS mode 'tunnel'
set vpn ipsec esp-group AWS pfs 'enable'
set vpn ipsec esp-group AWS proposal 1 encryption 'aes128'
set vpn ipsec esp-group AWS proposal 1 hash 'sha1'

set vpn ipsec ike-group AWS dead-peer-detection action 'restart'
set vpn ipsec ike-group AWS dead-peer-detection interval '15'
set vpn ipsec ike-group AWS dead-peer-detection timeout '30'

set interfaces vti vti0 address '169.254.87.102/30'
set interfaces vti vti0 description 'VPC tunnel 1'
set interfaces vti vti0 mtu '1436'

set protocols bgp 65000 neighbor 169.254.87.101 remote-as '64512'
set protocols bgp 65000 neighbor 169.254.87.101 soft-reconfiguration 'inbound'
set protocols bgp 65000 neighbor 169.254.87.101 timers holdtime '30'
set protocols bgp 65000 neighbor 169.254.87.101 timers keepalive '10'

set protocols bgp 65000 network 192.168.150.0/24
commit

Configuration extract from the router:
nat {
    rule 5002 {
        description "AWS VPN exclude from NAT"
        destination {
            address 172.31.0.0/16
        }
        exclude
        log enable
        outbound-interface eth0
        protocol all
        source {
            address 192.168.150.0/24
        }
        type masquerade
    }
    rule 5004 {
        description "masquerade for WAN"
        destination {
        }
        log enable
        outbound-interface eth0
        protocol all
        source {
            group {
            }
        }
        type masquerade
    }
}
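To make explicit which packets each of these two rules can match, here is a simplified Python model that evaluates them in ascending rule-number order and stops at the first match. It is a sketch, not the router's actual logic: rule 5004's source group is empty in the extract, so it is treated as "match anything" here, and the host 10.0.0.5 is a hypothetical address on the 10.0.0.0/24 local network.

```python
import ipaddress

# Simplified model of the two source-NAT rules in the extract above.
# Assumption: rule 5004's empty source group means "match anything".
RULES = [
    {"id": 5002, "src": "192.168.150.0/24", "dst": "172.31.0.0/16",
     "out": "eth0", "action": "exclude"},
    {"id": 5004, "src": "0.0.0.0/0", "dst": "0.0.0.0/0",
     "out": "eth0", "action": "masquerade"},
]

def first_match(src, dst, out_if):
    """Return (rule id, action) of the lowest-numbered matching rule, or None."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for rule in sorted(RULES, key=lambda r: r["id"]):
        if (rule["out"] == out_if
                and s in ipaddress.ip_network(rule["src"])
                and d in ipaddress.ip_network(rule["dst"])):
            return rule["id"], rule["action"]
    return None

print(first_match("192.168.150.14", "172.31.43.22", "eth0"))  # logged case
print(first_match("192.168.150.14", "172.31.43.22", "vti0"))  # via the tunnel
print(first_match("10.0.0.5", "172.31.43.22", "eth0"))        # hypothetical host
```

In this model both rules are tied to outbound-interface eth0, so neither matches a packet that leaves through vti0, and rule 5002 only excludes sources in 192.168.150.0/24.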

Here is the configuration on the EdgeRouter NAT section:
[image: EdgeRouter NAT configuration]

172.31.0.0/16 is on the NAT exclude list.

Do you have any dnat rules or policy routes?
Try to add a static route via vtiX.

I do not have any DNAT rules or policy routes. I have tried adding a static route via vti0, and also via the next hop, but neither works.

So the traffic goes out over eth0?
Maybe it is a firmware bug.
Did you try rebooting the EdgeRouter? I am not sure whether it is possible to check the actual NAT rules on it.
For example:

sudo iptables -L -n -v -t nat
sudo iptables-save
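The iptables-save output can be long, so a small filter helps narrow it to rules that mention the AWS network. The Python sketch below takes the dump as text and returns every rule whose -s or -d match overlaps 172.31.0.0/16. The function name rules_touching is made up, and the sample dump is invented purely to exercise the function; it is not output from this router.

```python
import ipaddress

AWS_NET = ipaddress.ip_network("172.31.0.0/16")

def rules_touching(dump, net=AWS_NET):
    """Return [(table, rule)] for rules whose -s/-d match overlaps net."""
    found = []
    table = None
    for line in dump.splitlines():
        line = line.strip()
        if line.startswith("*"):          # table header, e.g. "*nat"
            table = line[1:]
        elif line.startswith("-A"):       # an appended rule
            tokens = line.split()
            for flag, value in zip(tokens, tokens[1:]):
                if flag not in ("-s", "-d"):
                    continue
                try:
                    candidate = ipaddress.ip_network(value, strict=False)
                except ValueError:
                    continue              # not an IPv4/IPv6 prefix
                if candidate.version == net.version and candidate.overlaps(net):
                    found.append((table, line))
                    break
    return found

# Invented sample in iptables-save syntax, for illustration only.
sample_dump = """\
*nat
:POSTROUTING ACCEPT [0:0]
-A POSTROUTING -s 192.168.150.0/24 -d 172.31.0.0/16 -o eth0 -j RETURN
-A POSTROUTING -o eth0 -j MASQUERADE
COMMIT
*mangle
-A PREROUTING -d 10.0.0.0/24 -j ACCEPT
COMMIT
"""

for table, rule in rules_touching(sample_dump):
    print(table, rule)
```

To use it on real data, the text would come from the sudo iptables-save command above, saved to a file and passed in as a string.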