Cluster Configuration Disappears After Reload

m0j0 · October 8, 2015, 3:19pm

I’ve encountered a strange issue on three different physical servers that doesn’t occur with the same configuration on virtual machines.

I have two servers in a cluster, and when I reboot either of them it comes back up with no cluster configuration, so I have to manually enter the cluster config again and commit. I’m definitely saving the configuration after commit, and I’ve replicated the issue a number of times on different machines. I don’t know if it’s relevant, but the servers in question are Dell R610s with hardware RAID. All other configuration changes I make are preserved across reboots. Performing the same tests with the same cluster configuration on virtual machines does not have this outcome.

Any ideas?

biedron · October 9, 2015, 6:08am

Hi,

I have the same issue, but with OpenVPN. I configure it and the commit goes through fine, but after a reboot the config disappears.
Check /var/log/vyatta/vyatta-commit.log for commit errors when the system boots. You probably have a commit error.

m0j0 · October 12, 2015, 2:16pm

You are indeed correct. If I look at my vyatta-commit.log I see the following…

[code]
[[cluster]] failed
[ service conntrack-sync ]
conntrack-sync error: Clustering isn’t running

[[service conntrack-sync]] failed
Commit failed
[ system host-name vyos-sv4-sec ]
Stopping enhanced syslogd: rsyslogd.
Starting enhanced syslogd: rsyslogd.

[ system time-zone UTC ]
Stopping enhanced syslogd: rsyslogd.
Starting enhanced syslogd: rsyslogd.

[ zone-policy zone dmz interface eth0.500 ]
interface eth0.500 does not exist on system

[ zone-policy zone trust interface eth1.100 ]
interface eth1.100 does not exist on system

[ system package repository community ]
Adding new entry to /etc/apt/sources.list…

[ system ntp ]
Starting NTP server: ntpd.

[ service ssh ]
Restarting OpenBSD Secure Shell server: sshd.

[ vpn ]
Clustering configured - not restarting ipsec

[ cluster ]
Cluster configuration error: interface eth1.100 is not connected

[[cluster]] failed
Commit failed
[/code]

It appears that clustering tries to come up before eth1.100 is up and ready, which causes the cluster commit to fail. Would this be considered a bug? I’ve managed to replicate this on a few different machines, and I’m bringing up another cluster pair now which I’m sure will exhibit the same issue. Is there a way to delay the cluster commit to ensure the interfaces are up (assuming this is the problem)?
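
For anyone digging through a long commit log, the failed sections can be pulled out with a small script. This is only a rough sketch in Python: it assumes the log follows the layout pasted above (a `[ section ]` header, then its messages, then `[[section]] failed`), and the function name `failed_sections` is made up for illustration.

```python
import re

# Header of a config section, e.g. "[ cluster ]"
SECTION = re.compile(r"^\[ (.+?) \]$")
# Failure marker for a section, e.g. "[[cluster]] failed"
FAILED = re.compile(r"^\[\[(.+?)\]\] failed$")


def failed_sections(log_text):
    """Return {section: [messages]} for every section marked as failed."""
    messages = {}   # messages seen under each section header
    failures = {}   # sections that were later marked as failed
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = FAILED.match(line)
        if m:
            name = m.group(1)
            failures[name] = messages.get(name, [])
            current = None
            continue
        m = SECTION.match(line)
        if m:
            current = m.group(1)
            messages[current] = []
            continue
        if current is not None:
            messages[current].append(line)
    return failures


# Excerpt of the log from this post, used as sample input.
SAMPLE = """\
[ vpn ]
Clustering configured - not restarting ipsec

[ cluster ]
Cluster configuration error: interface eth1.100 is not connected

[[cluster]] failed
Commit failed
"""

if __name__ == "__main__":
    for section, msgs in failed_sections(SAMPLE).items():
        print(section, "->", "; ".join(msgs))
```

To run it against a real system, read the contents of /var/log/vyatta/vyatta-commit.log and pass the text to `failed_sections` in place of `SAMPLE`.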

m0j0 · October 13, 2015, 12:16pm

I think I’ve found a workaround that appears to do the trick so far. I’ve allocated my unused eth3 interface to connect the two cluster members directly via crossover cable. I’ve rebooted a couple of times and I’m not seeing any cluster errors and the cluster configuration is loading on boot.

lawrence.pan · October 22, 2015, 12:14pm

I’m using VyOS 1.1.6.

I commit and save the configuration below, but after rebooting the server the configuration is gone.

How can I fix this?

–
configure
set cluster dead-interval 1000
set cluster group CLUSTER auto-failback true
set cluster group CLUSTER primary VR-1
set cluster group CLUSTER secondary VR-2
set cluster group CLUSTER service 10.1.1.254/24/eth1.101
set cluster group CLUSTER service 10.1.2.254/24/eth1.102
set cluster interface eth0
set cluster interface eth1.101
set cluster interface eth1.102
set cluster keepalive-interval 200
set cluster mcast-group 239.1.0.254
set cluster monitor-dead-interval 1000
set cluster pre-shared-secret !secret!
set interfaces ethernet eth0 address 12.34.45.1/24
set interfaces ethernet eth1 vif 911 address 10.1.1.252/24
set interfaces ethernet eth1 vif 912 address 10.1.2.252/24
set system host-name VR-1
commit
save

Temporary workaround: re-enter the configuration after boot.
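
One thing worth checking in a configuration like this: the cluster lines reference eth1.101 and eth1.102, while the vif lines define 911 and 912. That may only be a slip made while writing up the post, and it is not confirmed as the cause here. A rough Python sketch of such a check follows; the function name `undefined_cluster_vifs` is hypothetical and the patterns only cover the command forms shown in this thread.

```python
import re

# "set interface(s) ethernet eth1 vif 911 ..." defines the interface eth1.911
VIF = re.compile(r"^set interfaces? ethernet (\S+) vif (\d+)\b")
# "set cluster interface eth1.101"
CLUSTER_IF = re.compile(r"^set cluster interface (\S+)$")
# "set cluster group NAME service 10.1.1.254/24/eth1.101"
SERVICE = re.compile(r"^set cluster group \S+ service \S+/([^/\s]+)$")


def undefined_cluster_vifs(commands):
    """Return VLAN interfaces used by the cluster but not defined as a vif."""
    defined = set()
    referenced = set()
    for raw in commands.splitlines():
        line = raw.strip()
        m = VIF.match(line)
        if m:
            defined.add(f"{m.group(1)}.{m.group(2)}")
            continue
        m = CLUSTER_IF.match(line) or SERVICE.match(line)
        # Only VLAN sub-interfaces (name contains a dot) are checked here.
        if m and "." in m.group(1):
            referenced.add(m.group(1))
    return sorted(referenced - defined)


# Subset of the commands from this post, used as sample input.
SAMPLE = """\
set cluster group CLUSTER service 10.1.1.254/24/eth1.101
set cluster group CLUSTER service 10.1.2.254/24/eth1.102
set cluster interface eth0
set cluster interface eth1.101
set cluster interface eth1.102
set interface ethernet eth1 vif 911 addr 10.1.1.252/24
set interface ethernet eth1 vif 912 addr 10.1.2.252/24
"""

if __name__ == "__main__":
    for name in undefined_cluster_vifs(SAMPLE):
        print("cluster references undefined interface:", name)
```

An empty result means every VLAN interface named in the cluster section has a matching vif definition in the same set of commands.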

lawrence.pan · October 24, 2015, 10:19am

http://bugzilla.vyos.net/show_bug.cgi?id=608