Hi all,
I’m seeing some strange behavior with my IPSec site-to-site connectivity.
The setup is “not so complex”:
A VyOS instance on AWS, acting as a VPN Hub, and another VyOS instance (behind NAT and with dynamic public IP address) connecting to it.
I’m using IKEv2 with ESP-in-UDP encapsulation and ID-based authentication with a PSK.
The IPSec CHILD SA covers two /32 IP addresses (assigned to dummy interfaces), which I use as the endpoints of a GRE tunnel over IPSec.
As far as pure reachability goes, everything works fine.
The IPSec SAs are up, the GRE tunnel is up, and traffic is flowing correctly.
However, after a few hours of activity, I notice a large number of IPSec CHILD SAs in a “stale” state (installed but not being used), and the number keeps growing.
The IDs of the CHILD SAs keep changing, so it seems that multiple “cloned” CHILD SAs are being created during a rekey/reauth.
For example:
root@Cloud-RT-VyOS:~# sudo ipsec statusall
Status of IKE charon daemon (strongSwan 5.9.1, Linux 5.10.124-amd64-vyos, x86_64):
uptime: 27 hours, since Aug 01 08:35:58 2022
malloc: sbrk 2965504, mmap 0, used 1565376, free 1400128
worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 4
loaded plugins: charon test-vectors ldap pkcs11 tpm aesni aes rc2 sha2 sha1 md5 mgf1 rdrand random nonce x509 revocation constraints pubkey pkcs1 pkcs7 pkcs8 pkcs12 pgp dnskey sshkey pem openssl gcrypt af-alg fips-prf gmp curve25519 agent chapoly xcbc cmac hmac ctr ccm gcm drbg curl attr kernel-netlink resolve socket-default connmark stroke vici updown eap-identity eap-aka eap-md5 eap-gtc eap-mschapv2 eap-radius eap-tls eap-ttls eap-tnc xauth-generic xauth-eap xauth-pam tnc-tnccs dhcp lookip error-notify certexpire led addrblock counters
Listening IP addresses:
172.24.2.5
Connections:
peer_edge: 172.24.2.5...0.0.0.0/0 IKEv2, dpddelay=10s
peer_edge: local: [cloud] uses pre-shared key authentication
peer_edge: remote: [edge] uses pre-shared key authentication
peer_edge_tunnel_0: child: 10.255.255.1/32 === 10.255.255.2/32 TUNNEL, dpdaction=clear
Security Associations (1 up, 0 connecting):
peer_edge[10]: ESTABLISHED 4 hours ago, 172.24.2.5[cloud]...1XX.1XX.3X.1X[edge]
peer_edge[10]: IKEv2 SPIs: 01c7441f4d37aaa9_i b418db5bccc2b551_r*, rekeying in 2 hours
peer_edge[10]: IKE proposal: AES_CBC_128/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024
peer_edge_tunnel_0{136}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c7e05991_i ca7a0616_o
peer_edge_tunnel_0{136}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{136}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{138}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c650f953_i c7ff89c4_o
peer_edge_tunnel_0{138}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{138}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{137}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c0ae5a72_i c2692b9e_o
peer_edge_tunnel_0{137}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{137}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{140}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c6b911d6_i c426486a_o
peer_edge_tunnel_0{140}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{140}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{139}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: ce1fdf37_i c9d6fd99_o
peer_edge_tunnel_0{139}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{139}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{142}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: cd3701ad_i c62fc19e_o
peer_edge_tunnel_0{142}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{142}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{141}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: cda18266_i c1b7e8ec_o
peer_edge_tunnel_0{141}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{141}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{144}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c4fb4768_i c38ff61d_o
peer_edge_tunnel_0{144}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{144}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{143}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c10c0e20_i c3439dea_o
peer_edge_tunnel_0{143}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{143}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{145}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c7047fa7_i c0863c11_o
peer_edge_tunnel_0{145}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{145}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{146}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c3e4822d_i cf512256_o
peer_edge_tunnel_0{146}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{146}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{147}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c399de54_i c9471fbd_o
peer_edge_tunnel_0{147}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{147}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{148}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: cf67a002_i c9c7053f_o
peer_edge_tunnel_0{148}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{148}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{149}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c1ac97e1_i c416e812_o
peer_edge_tunnel_0{149}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{149}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{150}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: ccc87d27_i c426ff55_o
peer_edge_tunnel_0{150}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{150}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{151}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: ca29fc90_i c4564789_o
peer_edge_tunnel_0{151}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{151}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{152}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c8c42c10_i c87dcbaa_o
peer_edge_tunnel_0{152}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{152}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{153}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c1f50b98_i c8be0ef3_o
peer_edge_tunnel_0{153}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 0 bytes_i (0 pkts, 89s ago), 0 bytes_o (0 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{153}: 10.255.255.1/32 === 10.255.255.2/32
peer_edge_tunnel_0{154}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c6415642_i c49c647d_o
peer_edge_tunnel_0{154}: AES_CBC_128/HMAC_SHA1_96/MODP_1024, 38763 bytes_i (342 pkts, 88s ago), 92862 bytes_o (342 pkts, 88s ago), rekeying in 44 minutes
peer_edge_tunnel_0{154}: 10.255.255.1/32 === 10.255.255.2/32
This results in memory usage continuously increasing, as you can see from the attached graph.
I enabled verbose strongSwan logging and watched the events directly using sudo swanctl --log, but I still can’t work out why the number of “stale” CHILD SAs keeps growing.
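In case it helps anyone reproduce or track this, here is a rough sketch of how the unused CHILD SAs could be counted from the `ipsec statusall` output. It is only an illustration: the function name `split_child_sas` and the regular expression are my own assumptions, based purely on the output format pasted above, and “idle” here simply means that both byte counters are zero.

```python
import re
import sys

# Matches the per-CHILD-SA statistics line, e.g.
#   peer_edge_tunnel_0{153}: AES_CBC_128/..., 0 bytes_i (...), 0 bytes_o (...), ...
# The pattern is an assumption derived from the output shown in this post.
SA_LINE = re.compile(
    r"^\s*(?P<name>\S+?)\{(?P<sa_id>\d+)\}:.*?"
    r"\b(?P<bytes_in>\d+) bytes_i.*?\b(?P<bytes_out>\d+) bytes_o"
)


def split_child_sas(status_text):
    """Return (active_ids, idle_ids) for the CHILD SAs found in status_text.

    A CHILD SA counts as idle when both its inbound and outbound
    byte counters are zero.
    """
    active, idle = [], []
    for line in status_text.splitlines():
        match = SA_LINE.match(line)
        if match is None:
            continue  # not a CHILD SA statistics line
        sa_id = int(match.group("sa_id"))
        if int(match.group("bytes_in")) == 0 and int(match.group("bytes_out")) == 0:
            idle.append(sa_id)
        else:
            active.append(sa_id)
    return active, idle


# Only read stdin when something is actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    active_ids, idle_ids = split_child_sas(sys.stdin.read())
    print(f"active CHILD SAs: {len(active_ids)} {active_ids}")
    print(f"idle CHILD SAs:   {len(idle_ids)} {idle_ids}")
```

Saved under a hypothetical name such as count_idle_sas.py, it could be run periodically as `sudo ipsec statusall | python3 count_idle_sas.py` to see whether the idle count grows at each rekey.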
I’ve also tried several configuration combinations, such as changing “uniqreqids” and “ikev2-reauth”, but the problem persists.
If you have any ideas, or a configuration to try, they are more than welcome.
This is my configuration (relevant parts only):
HUB (cloud):
set interfaces dummy dum0 address '10.255.255.1/32'
set interfaces tunnel tun11 address '10.255.254.1/30'
set interfaces tunnel tun11 encapsulation 'gre'
set interfaces tunnel tun11 remote '10.255.255.2'
set interfaces tunnel tun11 source-address '10.255.255.1'
set interfaces tunnel tun11 vrf 'INNER'
set vpn ipsec disable-uniqreqids
set vpn ipsec esp-group esp-vpn proposal 1 encryption 'aes128'
set vpn ipsec esp-group esp-vpn proposal 1 hash 'sha1'
set vpn ipsec ike-group ike-vpn close-action 'none'
set vpn ipsec ike-group ike-vpn dead-peer-detection action 'clear'
set vpn ipsec ike-group ike-vpn dead-peer-detection interval '10'
set vpn ipsec ike-group ike-vpn ikev2-reauth 'no'
set vpn ipsec ike-group ike-vpn key-exchange 'ikev2'
set vpn ipsec ike-group ike-vpn proposal 1 encryption 'aes128'
set vpn ipsec ike-group ike-vpn proposal 1 hash 'sha1'
set vpn ipsec interface 'eth0'
set vpn ipsec site-to-site peer @edge authentication id '@cloud'
set vpn ipsec site-to-site peer @edge authentication mode 'pre-shared-secret'
set vpn ipsec site-to-site peer @edge authentication pre-shared-secret 'supersecret'
set vpn ipsec site-to-site peer @edge authentication remote-id '@edge'
set vpn ipsec site-to-site peer @edge connection-type 'respond'
set vpn ipsec site-to-site peer @edge default-esp-group 'esp-vpn'
set vpn ipsec site-to-site peer @edge dhcp-interface 'eth0'
set vpn ipsec site-to-site peer @edge force-encapsulation 'enable'
set vpn ipsec site-to-site peer @edge ike-group 'ike-vpn'
set vpn ipsec site-to-site peer @edge tunnel 0 local prefix '10.255.255.1/32'
set vpn ipsec site-to-site peer @edge tunnel 0 remote prefix '10.255.255.2/32'
set vrf name INNER protocols static route 172.17.0.0/16 next-hop 10.255.254.2
set vrf name INNER table '100'
SPOKE (edge):
set interfaces dummy dum0 address '10.255.255.2/32'
set interfaces tunnel tun11 address '10.255.254.2/30'
set interfaces tunnel tun11 encapsulation 'gre'
set interfaces tunnel tun11 remote '10.255.255.1'
set interfaces tunnel tun11 source-address '10.255.255.2'
set interfaces tunnel tun11 vrf 'INNER'
set vpn ipsec disable-uniqreqids
set vpn ipsec esp-group esp-vpn proposal 1 encryption 'aes128'
set vpn ipsec esp-group esp-vpn proposal 1 hash 'sha1'
set vpn ipsec ike-group ike-vpn close-action 'hold'
set vpn ipsec ike-group ike-vpn dead-peer-detection action 'restart'
set vpn ipsec ike-group ike-vpn dead-peer-detection interval '10'
set vpn ipsec ike-group ike-vpn ikev2-reauth 'no'
set vpn ipsec ike-group ike-vpn key-exchange 'ikev2'
set vpn ipsec ike-group ike-vpn proposal 1 encryption 'aes128'
set vpn ipsec ike-group ike-vpn proposal 1 hash 'sha1'
set vpn ipsec interface 'eth0.6'
set vpn ipsec site-to-site peer 13.51.XX.XX authentication id '@edge'
set vpn ipsec site-to-site peer 13.51.XX.XX authentication mode 'pre-shared-secret'
set vpn ipsec site-to-site peer 13.51.XX.XX authentication pre-shared-secret 'supersecret'
set vpn ipsec site-to-site peer 13.51.XX.XX authentication remote-id '@cloud'
set vpn ipsec site-to-site peer 13.51.XX.XX connection-type 'initiate'
set vpn ipsec site-to-site peer 13.51.XX.XX default-esp-group 'esp-vpn'
set vpn ipsec site-to-site peer 13.51.XX.XX force-encapsulation 'enable'
set vpn ipsec site-to-site peer 13.51.XX.XX ike-group 'ike-vpn'
set vpn ipsec site-to-site peer 13.51.XX.XX local-address '172.24.1.2'
set vpn ipsec site-to-site peer 13.51.XX.XX tunnel 0 local prefix '10.255.255.2/32'
set vpn ipsec site-to-site peer 13.51.XX.XX tunnel 0 remote prefix '10.255.255.1/32'
set vrf name INNER protocols static route 172.24.2.0/24 next-hop 10.255.254.1
set vrf name INNER table '100'
I’m running VyOS 1.4 rolling 202207061206.