Randomly unresponsive system


#1

I moved a Vyatta 6.3 setup to new hardware about 6 months ago and took the opportunity to upgrade to VyOS. Everything went well.

Since then I have had a few reports of non-responsive VLANs (no DHCP and/or DNS), and in the last 2 weeks it has gotten much worse.

The environment is a Proxmox server on a decent HP server, currently with no other VMs active and plenty of resources, connected to VLAN-tagged ports on a DrayTek G2260, then to dumb switches and room ports. The very similar Vyatta install has been running for 3 years.

On boot everything works as expected, then SSH and FTP stop responding and the client networks lose DNS and/or their connection. I have seen the system become unresponsive 7 times so far today, each time after between 10 and 84 minutes.

I can’t see anything of interest in the logs other than: EXT4-fs (vda1): warning: checktime reached, running e2fsck is recommended.

I was running 1.1.6 and updated to 1.1.7 today. I also changed the primary DNS from an ISP server to 8.8.4.4.

Any advice on where to look next?

config.boot attached


#2

Hello,
the file was not attached for some reason,
so it's hard to tell what might be wrong.
You may want to set up VRRP for failover and try to capture dumps when the issue occurs.
This is not correct behaviour.
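
For the VRRP part, a minimal sketch of what the configuration could look like on the primary router, assuming the Vyatta-style VRRP syntax still applies in VyOS 1.1.x. The interface (eth1), group number (10), virtual address (192.0.2.1) and priority are placeholders, not values from this setup; a second router would need the same group and virtual address with a lower priority.

```
configure
set interfaces ethernet eth1 vrrp vrrp-group 10 virtual-address 192.0.2.1
set interfaces ethernet eth1 vrrp vrrp-group 10 priority 150
set interfaces ethernet eth1 vrrp vrrp-group 10 preempt true
commit
save
```

On tagged VLANs the same vrrp-group settings would presumably go under the matching vif of the interface rather than on the physical interface itself.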


#3

Thank you, I’m looking into VRRP options now. I have edited the original post to include the boot.config, apologies.


#4

The problem is still affecting my installation, so I thought I would post an update on what I have tried, in the hope that someone will have some suggestions.

- Original VM of Vyatta 6.3, cloned and upgraded in place to VyOS 1.1.6: periodic lockups, very frequent in the 48 hours prior to the original ticket.
- Built a new VM using the 1.1.7 ISO, configured using boot.config: randomly locked up once in 12 hours.
- Removed the VM environment and installed on an HP 360 G6, configured using /opt/vyatta/sbin/vyatta-config-gen-sets.pl >setcmds.txt (to allow for hardware ID changes): same behaviour (locked up after 20 min, rebooted and locked up again).

I’ve gone back to the newest VM, which is the most stable, as I have better remote control over it. The only configuration change I made was: set service dns forwarding listen-on 'eth1'

Any suggestions would be appreciated.


#5

Pretty simple setup.
The amount of DHCP and so on leads me to think it could be a memory leak,
but that's just an idea, which needs to be proved or discarded.
You will likely want to set up some SNMP monitoring and syslog redirection to collect more data.
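
A rough sketch of what that could look like in configuration mode, assuming a collector at 192.0.2.10 (a placeholder address) and a hypothetical read-only community called monitoring; both would need adjusting to the local network.

```
configure
set system syslog host 192.0.2.10 facility all level info
set service snmp community monitoring authorization ro
set service snmp community monitoring client 192.0.2.10
commit
save
```

Sending the logs to another machine means the last messages before a lockup should survive even if the router has to be power-cycled.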

Thanks for the config.
How much traffic do you have, by the way?


#6

Thank you, I’ve set up syslog redirection and will post a log tomorrow. So far the only thing of interest is some failed logins from what looks like a brute-force attempt.

As for traffic, it varies from only 100 MB total on a weekend to 3 GB down and 1 GB up on a weekday, with 5-10 DHCP users split over 3 VLANs (the majority of my setup is still serviced by the older Vyatta box).


#7

It would be good to monitor memory/CPU consumption as well, so we can get an idea of what is happening.
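
One way to collect this, as a minimal sketch: a small shell script that reads /proc/loadavg and /proc/meminfo and hands one line per minute to logger, so the samples end up in syslog (and on the remote syslog host, if redirection covers the user facility). The function names, the resmon tag and the 60-second interval are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: sample load and memory figures and send them to syslog.
# The names, the "resmon" tag and the interval are assumptions, not VyOS defaults.

# Print the value (in kB) of one field from a meminfo-style file.
# $1 = field name (e.g. MemFree), $2 = file (defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Print one sample line: 1-minute load average plus free, buffer and cache memory.
# $1 = meminfo file, $2 = loadavg file (both default to the real /proc files)
snapshot() {
    meminfo="${1:-/proc/meminfo}"
    loadavg="${2:-/proc/loadavg}"
    load1=$(cut -d' ' -f1 "$loadavg")
    printf 'load1=%s memfree_kb=%s buffers_kb=%s cached_kb=%s\n' \
        "$load1" \
        "$(meminfo_kb MemFree "$meminfo")" \
        "$(meminfo_kb Buffers "$meminfo")" \
        "$(meminfo_kb Cached "$meminfo")"
}

# Sample forever only when started with --loop, so the functions can be reused.
if [ "${1:-}" = "--loop" ]; then
    while true; do
        logger -t resmon "$(snapshot)"
        sleep 60
    done
fi
```

Saved as, say, resmon.sh (a hypothetical name), it could be started in the background with: sh resmon.sh --loop &
If memfree_kb, buffers_kb and cached_kb all keep shrinking right up to a lockup, that would support the memory leak idea; if they stay flat, the cause is more likely elsewhere.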