Multiple IPSec Tunnels not coming up

I’m having an issue with IPsec policy routing. If I have two policy tunnels configured for one peer, the second tunnel is ignored and never initialized. The first tunnel also fails to renew. I basically copied a working configuration from EdgeOS, which shares a common ancestor with VyOS, so I don’t think the configuration is at fault. If I delete the first tunnel, then the second tunnel is brought up. I use a couple of source NAT rules to trigger each tunnel, so I’m able to test each tunnel individually. I found the following thread, a couple of years old, that looks like the same problem and was fixed for VyOS 1.3 and 1.4. Since I’m using the latest rolling release and it’s now 2-3 years later, I decided to start a new thread.

sam@cr:~$ show system image
Name                      Default boot    Running
------------------------  --------------  ---------
2025.03.09-0613-rolling   Yes             Yes
1.5-rolling-202502190007

sam@cr# show nat source
 rule 10 {
     description IPSEC1
     destination {
         address <NET1>
     }
     outbound-interface {
         group WAN
     }
     protocol all
     translation {
         address 172.16.20.3
     }
 }
 rule 20 {
     description IPSEC2
     destination {
         address <NET2>
     }
     outbound-interface {
         group WAN
     }
     protocol all
     translation {
         address 172.16.20.3
     }
 }
 rule 100 {
     outbound-interface {
         group WAN
     }
     translation {
         address masquerade
     }
 }


sam@cr# show vpn
 ipsec {
     authentication {
         psk peer1 {
             id <WANIP>
             secret XXXX
         }
     }
     esp-group FOO1 {
         lifetime 28800
         mode tunnel
         pfs dh-group14
         proposal 1 {
             encryption aes256
             hash sha256
         }
     }
     ike-group FOO1 {
         key-exchange ikev2
         lifetime 28800
         proposal 1 {
             dh-group 14
             encryption aes256
             hash sha256
         }
     }
     site-to-site {
         peer peer1 {
             authentication {
                 local-id <WANIP>
                 mode pre-shared-secret
             }
             connection-type initiate
             ike-group FOO1
             ikev2-reauth inherit
             local-address <WANIP>
             remote-address <PEER1 IP>
             tunnel 1 {
                 esp-group FOO1
                 local {
                     prefix 172.16.20.0/28
                 }
                 remote {
                     prefix <NET1>
                 }
             }
             tunnel 2 {
                 esp-group FOO1
                 local {
                     prefix 172.16.20.0/28
                 }
                 remote {
                     prefix <NET2>
                 }
             }
         }
     }
 }

I’ve been doing some more experimenting, and it seems that a restart is necessary after changing the vpn ipsec site-to-site configuration. The 1st tunnel, in the order it appears in the configuration, is the only one that works, and only until it needs to be renewed; after that it stops working. If I delete the 1st tunnel, save, and reboot, the second one will work since it is now the 1st. The swanctl configuration looks correct, so it seems the error is in strongSwan.

sam@cr:~$ sudo swanctl -L
t38fax: IKEv2, no reauthentication, rekeying every 28800s
  local:  <WANIP>
  remote: <PEER1>
  local pre-shared key authentication:
    id: <WANIP>
  remote pre-shared key authentication:
    id: %any
  peer1-tunnel-1: TUNNEL, rekeying every 26181s
    local:  172.16.20.0/28
    remote: <NET1>
  peer1-tunnel-2: TUNNEL, rekeying every 26181s
    local:  172.16.20.0/28
    remote: <NET2>

Please check the ipsec logs for any errors related to traffic selectors.
$ show log ipsec
Also check the output of swanctl -l or show vpn ipsec sa for the tunnel status.

So the router (VyOS) has not been restarted in a couple of days, and if I ping a server on the tunnel, which should prompt a new SA, nothing happens anywhere. Nothing is logged either. Here is the tail end of the ipsec log from yesterday:

Mar 13 05:21:26 charon-systemd[2831]: sending DELETE for ESP CHILD_SA with SPI cf090a6d
Mar 13 05:21:26 charon-systemd[2831]: CHILD_SA closed
Mar 13 05:21:26 charon-systemd[2831]: generating INFORMATIONAL response 2413 [ D ]
Mar 13 05:21:26 charon-systemd[2831]: sending packet: from <WANIP>[4500] to <IPSEC_PEER>[4500] (80 bytes)
Mar 13 05:21:26 charon-systemd[2831]: received packet: from <IPSEC_PEER>[4500] to <WANIP>[4500] (80 bytes)
Mar 13 05:21:26 charon-systemd[2831]: parsed INFORMATIONAL request 2414 [ D ]
Mar 13 05:21:26 charon-systemd[2831]: received DELETE for unknown ESP CHILD_SA with SPI 35456e77
Mar 13 05:21:26 charon-systemd[2831]: CHILD_SA closed
Mar 13 05:21:26 charon-systemd[2831]: generating INFORMATIONAL response 2414 [ ]
Mar 13 05:21:26 charon-systemd[2831]: sending packet: from <WANIP>[4500] to <IPSEC_PEER>[4500] (80 bytes)
Mar 13 05:21:26 charon-systemd[2831]: received packet: from <IPSEC_PEER>[4500] to <WANIP>[4500] (80 bytes)
Mar 13 05:21:26 charon-systemd[2831]: parsed INFORMATIONAL request 2415 [ D ]
Mar 13 05:21:26 charon-systemd[2831]: received DELETE for IKE_SA peer1[1]
Mar 13 05:21:26 charon-systemd[2831]: deleting IKE_SA peer1[1] between <WANIP>[<WANIP>]...<IPSEC_PEER>[<IPSEC_PEER>]
Mar 13 05:21:26 charon-systemd[2831]: IKE_SA deleted
Mar 13 05:21:26 charon-systemd[2831]: generating INFORMATIONAL response 2415 [ ]
Mar 13 05:21:26 charon-systemd[2831]: sending packet: from <WANIP>[4500] to <IPSEC_PEER>[4500] (80 bytes)
Mar 13 05:21:32 charon-systemd[2831]: CHILD_SA ESP/0xcf090a6d/<WANIP> not found for rekey
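On a longer log, the same teardown and failed-rekey events can be counted instead of read by eye. The following is only a sketch: `ipsec_log_summary` is a hypothetical helper name, and the three patterns are simply the message strings that appear in the excerpt above, so they may need adjusting for other strongSwan versions.

```shell
# Hypothetical helper: summarize teardown and rekey-failure events
# from an ipsec log supplied on stdin.
ipsec_log_summary() {
    awk '
        /received DELETE for IKE_SA/ { ike_deleted++ }   # peer tore down the IKE SA
        /CHILD_SA closed/            { child_closed++ }  # a child SA was removed
        /not found for rekey/        { rekey_failed++ }  # rekey attempted on a missing SA
        END {
            printf "ike_deleted=%d child_closed=%d rekey_failed=%d\n",
                   ike_deleted, child_closed, rekey_failed
        }
    '
}
```

Run against a saved copy of the log, for example `ipsec_log_summary < ipsec.log` (the file name is an assumption). For the excerpt above it reports one IKE SA delete, two closed CHILD_SAs, and one failed rekey.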

swanctl -l has no output
swanctl -L still does

sam@cr:~$ sudo swanctl -l
sam@cr:~$ sudo swanctl -L
peer1: IKEv2, no reauthentication, rekeying every 28800s
  local:  <WANIP>
  remote: <PEER>
  local pre-shared key authentication:
    id: <WANIP>
  remote pre-shared key authentication:
    id: %any
  peer1-tunnel-1: TUNNEL, rekeying every 26181s
    local:  172.16.20.0/28
    remote: <NET1>
  peer1-tunnel-2: TUNNEL, rekeying every 26181s
    local:  172.16.20.0/28
    remote: <NET2>

show vpn ipsec sa has no output as well

sam@cr:~$ show vpn ipsec sa
Connection    State    Uptime    Bytes In/Out    Packets In/Out    Remote address    Remote ID    Proposal
------------  -------  --------  --------------  ----------------  ----------------  -----------  ----------

A reboot of the system will give output for swanctl -l and show vpn ipsec sa

sam@cr:~$ sudo swanctl -l
t38fax: #1, ESTABLISHED, IKEv2, ba979f4d572e66de_i* 97c41e5883903bee_r
  local  'WANIP' @ <WANIP>[4500]
  remote '<PEERIP>' @ <PEERIP>[4500]
  AES_CBC-256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_2048
  established 186s ago, rekeying in 26726s
  peer1-tunnel-1: #1, reqid 1, INSTALLED, TUNNEL, ESP:AES_CBC-256/HMAC_SHA2_256_128
    installed 186s ago, rekeying in 23983s, expires in 28614s
    in  caf83dff,    336 bytes,     4 packets,    11s ago
    out 06a01bb1,    336 bytes,     4 packets,    11s ago
    local  172.16.20.0/28
    remote <NET1>
sam@cr:~$ show vpn ipsec sa
Connection       State    Uptime    Bytes In/Out    Packets In/Out    Remote address    Remote ID    Proposal
---------------  -------  --------  --------------  ----------------  ----------------  -----------  -----------------------------
peer1-tunnel-1  up       3m27s     336B/336B       4/4               <PEERIP>        <PEERIP>    AES_CBC_256/HMAC_SHA2_256_128

In summary there are 2 issues that may be related:

  1. Only 1 of 2 tunnels is ever able to come up
  2. After a period of time no tunnels or SAs will come up unless the strongSwan service is restarted using: sudo systemctl restart strongswan

I guess I would hypothesize that the strongswan service internally crashes after 1 tunnel is created, therefore it is not able to create another or renew that 1 tunnel.

I submitted a bug request here:
T7257 Multiple IPSEC SA not initiating/IPSec SA not renewing

Finally figured this one out: I had to disable PFS in the ESP group, because it prevented any further SAs from being created, including additional tunnels and tunnel renewals.
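For reference, the change amounts to roughly the following in configuration mode. This is a sketch that assumes the ESP group is named FOO1, as in the configuration posted earlier in the thread; check which `pfs` values your image accepts before committing.

```
configure
# Turn off PFS for the ESP (child SA) proposals only;
# the IKE group keeps its own DH group setting
set vpn ipsec esp-group FOO1 pfs disable
commit
save
# Check which tunnels are up after the change
run show vpn ipsec sa
exit
```

This is consistent with how IKEv2 behaves: the first CHILD_SA is negotiated as part of IKE_AUTH without a separate DH exchange, so a PFS mismatch with the peer only shows up on later CREATE_CHILD_SA exchanges, such as a second tunnel or a rekey.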

However, that doesn’t sound like a solution to me.

Why wouldn’t PFS (perfect forward secrecy) work?

I wouldn’t use an encrypted VPN tunnel without it…

I don’t control the other end of the tunnel. Apparently it’s off by default in EdgeOS. I’m not sure how important it is, but your comment makes me want to see if I can get the administrator on the other side of the tunnel to enable it.

I tend to aim at the highest available common settings rather than lowest defaults.

That means the strongest hash, encryption algorithm, and Diffie-Hellman group that both ends support.

Also use tunnel mode, which wastes some bytes on an extra outer header (plus a UDP header when NAT-T, NAT traversal, is in use) but also encrypts the original IP headers. Someone intercepting the traffic in between will only see the VPN-GW to VPN-GW IP addresses, not the actual client/server using the encrypted VPN.

The DH group is the “PFS” (perfect forward secrecy) part, so use the highest DH group that both ends support. Instead of using the actual passphrase or certificate as the encryption key, DH derives a session key, so the passphrase or certificate is only used to authenticate the other side. In short, even if someone later figures out the passphrase or certificate, previously captured encrypted traffic cannot be decrypted (short of brute force, which is always possible in principle).
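In the configuration posted earlier in this thread, a DH group appears in two places. In set-command form they would look roughly like this (reconstructed from the `show vpn` output above, so treat it as a sketch rather than a copy of the running config):

```
# IKE (phase 1) DH group, used when the IKE SA itself is negotiated
set vpn ipsec ike-group FOO1 proposal 1 dh-group 14
# ESP (phase 2) PFS: a fresh DH exchange for additional or rekeyed CHILD_SAs
set vpn ipsec esp-group FOO1 pfs dh-group14
```

The second line is the PFS setting discussed here, and it only works if the peer also has PFS enabled with a matching group.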

This also depends on what you are trying to protect and why, but I assume you want to protect something; otherwise you wouldn’t be using IPsec/OpenVPN/WireGuard to begin with, but something in cleartext such as GRE or VXLAN.

This site has a great summary of various settings:


@Apachez Thank you for that explanation; I was able to get PFS enabled on the other side after you made a good case for it.

Cheers!