I have a bunch of servers sitting in a DMZ that all connect continuously to backend MySQL servers. I’ve started seeing occasional errors in these servers’ logs, reporting either that MySQL has gone away or that they cannot connect.
The firewall between the DMZ and backend networks consists of two clustered VyOS 1.1.6 boxes. They are dual hex-core Xeon Dells with 64GB RAM each. I was having issues with one of these boxes a couple of weeks back, so we failed over to the other and the errors went away for a while. They have now started again on all our servers. The only thing I can see in our firewall logs that looks like it might match up is the following:
Mar 1 23:34:42 ukld5fw02 kernel: [1039678.461023] nf_queue: full at 1024 entries, dropping packets(s)
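To check whether these messages really line up with the MySQL errors, the firewall log could be scanned for every nf_queue drop and the timestamps compared against the server logs. This is only a rough sketch: the function name `find_nf_queue_drops` is made up for illustration, and the pattern assumes every entry has the same layout as the single line quoted above.

```python
import re

# Pattern for syslog lines shaped like the nf_queue message quoted above.
# Assumption: all such entries follow this exact layout.
NF_QUEUE_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+kernel:\s+\[\s*(?P<uptime>[\d.]+)\]\s+"
    r"nf_queue: full at (?P<limit>\d+) entries"
)


def find_nf_queue_drops(lines):
    """Return (timestamp, host, queue limit) for each nf_queue-full line."""
    drops = []
    for line in lines:
        match = NF_QUEUE_RE.match(line.strip())
        if match:
            drops.append(
                (match.group("when"), match.group("host"), int(match.group("limit")))
            )
    return drops


sample = (
    "Mar 1 23:34:42 ukld5fw02 kernel: [1039678.461023] "
    "nf_queue: full at 1024 entries, dropping packets(s)"
)
print(find_nf_queue_drops([sample]))
# [('Mar 1 23:34:42', 'ukld5fw02', 1024)]
```

In practice the list of lines would come from the firewall's syslog file rather than a hard-coded sample, and the resulting timestamps could then be matched by hand against the times of the MySQL errors.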
A bit of reading around seems to suggest it has something to do with conntrack table settings, but I’m getting lost as to what might need doing here. Since the boxes are clustered, I’m running conntrack-sync between them. Here are all my conntrack settings:
set service conntrack-sync event-listen-queue-size '64'
set service conntrack-sync failover-mechanism cluster group 'ld5-cluster'
set service conntrack-sync interface 'eth1.100'
set service conntrack-sync mcast-group '225.0.0.50'
set service conntrack-sync sync-queue-size '8'
set system conntrack expect-table-size '64000'
set system conntrack hash-size '256000'
set system conntrack modules nfs 'disable'
set system conntrack table-size '2000000'
This is of course set the same on both boxes.
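One way to rule the conntrack table in or out would be to compare the live connection count against the configured table size on each firewall. The sketch below is a guess at how that might be done: the two `/proc` paths are the usual locations for the kernel's conntrack counters but are an assumption for this VyOS release, and the helper names are invented for illustration. It also works out the average entries per hash bucket implied by the settings above. It says nothing about the nf_queue limit of 1024 itself.

```python
from pathlib import Path

# Assumed locations of the kernel's conntrack counters; they may differ
# depending on kernel version.
COUNT_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_count")
MAX_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_max")


def table_usage(count, table_size):
    """Fraction of the conntrack table currently in use."""
    return count / table_size


def entries_per_bucket(table_size, hash_size):
    """Average entries per hash bucket if the table were completely full."""
    return table_size / hash_size


def read_counter(path):
    """Read an integer counter from a file, or return None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


# Values taken from the configuration above: table-size / hash-size.
print(entries_per_bucket(2_000_000, 256_000))  # 7.8125

count = read_counter(COUNT_PATH)
maximum = read_counter(MAX_PATH)
if count is not None and maximum:
    print(f"conntrack usage: {count}/{maximum} ({table_usage(count, maximum):.1%})")
else:
    print("conntrack counters not readable on this machine")
```

If the usage figure stays far below 100% while the nf_queue messages keep appearing, that would point away from the table size as the cause.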
I can’t work out the correlation between the nf_queue limit of 1024 entries and any of the above settings, so I’m not completely sure this is a conntrack issue. If it is, I’m also not sure how to resolve it.