Nf_queue full messages - breaking connections?

m0j072 · March 2, 2020, 9:13pm

I have a bunch of servers sitting in a DMZ that are all continuously connecting to backend MySQL servers. I’ve started seeing occasional errors in these servers’ logs: either that MySQL has gone away, or that it cannot connect.
The firewall configuration between the DMZ and backend networks is two clustered VyOS 1.1.6 boxes. They are dual hex-core Xeon Dells with 64GB RAM each. I was having issues with one of these boxes a couple of weeks back and we failed over to the other and the errors went away for a while, but they’ve just started again on all our servers. The only thing I can see in our firewall logs that looks like it might match up is the following:

Mar 1 23:34:42 ukld5fw02 kernel: [1039678.461023] nf_queue: full at 1024 entries, dropping packets(s)
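To check whether these kernel messages really line up with the MySQL errors, the timestamps can be pulled out of the firewall log and compared against the application logs. The following is only a rough sketch: the function name `nfqueue_drop_events` is made up for illustration, the year has to be supplied because syslog lines of this form do not carry one, and month parsing assumes an English/C locale.

```python
import re
from datetime import datetime

# Matches syslog lines of the form shown above. The pattern is an assumption
# based on that single sample line; adjust it if your syslog format differs.
NFQ_FULL = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:.*"
    r"nf_queue: full at (?P<limit>\d+) entries"
)

def nfqueue_drop_events(lines, year):
    """Yield (timestamp, host, queue_limit) for each 'nf_queue: full' line."""
    for line in lines:
        m = NFQ_FULL.search(line)
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            yield ts, m["host"], int(m["limit"])

sample = [
    "Mar 1 23:34:42 ukld5fw02 kernel: [1039678.461023] "
    "nf_queue: full at 1024 entries, dropping packets(s)",
    "Mar 1 23:34:43 ukld5fw02 sshd[123]: unrelated line",  # ignored
]
events = list(nfqueue_drop_events(sample, 2020))
for ts, host, limit in events:
    print(ts.isoformat(), host, limit)
```

Feeding it the real log file instead of `sample` gives a list of drop times that can be matched by hand, or with a small time window, against the times of the MySQL errors.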

A bit of reading around seems to suggest this has something to do with conntrack table settings, but I’m getting lost as to what might need doing here. Since the boxes are clustered, I’m running conntrack-sync between them. Here are all my conntrack settings:

set service conntrack-sync event-listen-queue-size '64'
set service conntrack-sync failover-mechanism cluster group 'ld5-cluster'
set service conntrack-sync interface 'eth1.100'
set service conntrack-sync mcast-group '225.0.0.50'
set service conntrack-sync sync-queue-size '8'

set system conntrack expect-table-size '64000'
set system conntrack hash-size '256000'
set system conntrack modules nfs 'disable'
set system conntrack table-size '2000000'

This is of course set the same on both boxes.

I can’t work out any correlation between the nf_queue limit of 1024 entries and the settings above, so I’m not sure this is a conntrack issue at all or, if it is, how to resolve it.

lasseoe · March 3, 2020, 7:42am

While not a solution or an explanation, Red Hat says:

NFQUEUE or “netfilter queue” is a method to inspect and allow/deny packets with a custom application.
The iptables firewall is configured to jump packets into an NFQUEUE queue.
There should then be a userspace program which uses the NFQUEUE API to attach to the above queue.
The userspace program receives packets, and returns allow/drop to the firewall.
The message nf_queue: full at 1024 entries, dropping packets(s) means the userspace program is not drawing packets from the NFQUEUE queue fast enough, or at all.

Judging by the fact the queue increases one-by-one, it would seem the userspace process has stopped receiving from the queue.

The userspace queue processing application is usually the customer’s responsibility to troubleshoot, Red Hat do not supply or support custom applications which perform this action.

So, yeah, a userspace daemon; perhaps conntrackd?
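As far as I know, the 1024 in the message is the kernel's default maximum length for an NFQUEUE queue (the userspace program can raise it), so it would not be expected to match any of the conntrack sizes. One way to see which process is attached to a queue and how far behind it is would be to read /proc/net/netfilter/nfnetlink_queue on the firewall. Below is a rough sketch of parsing it. The field order is my understanding of what the kernel prints and should be checked against your kernel version, the sample line is invented for illustration, and `parse_nfqueue_status` is a hypothetical helper name.

```python
from pathlib import Path

# Path of the kernel's NFQUEUE status file; only present when the
# nfnetlink_queue module is loaded.
PROC_FILE = Path("/proc/net/netfilter/nfnetlink_queue")

# Assumed column order (verify against your kernel); any extra trailing
# column is ignored.
FIELDS = ("queue_number", "peer_portid", "queue_total", "copy_mode",
          "copy_range", "queue_dropped", "user_dropped", "id_sequence")

def parse_nfqueue_status(text):
    """Return one dict per queue line, keyed by FIELDS."""
    queues = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < len(FIELDS):
            continue  # skip blank or malformed lines
        queues.append(dict(zip(FIELDS, map(int, parts))))
    return queues

# Made-up sample line, NOT taken from the firewalls in this thread.
SAMPLE = "    0   2345  1024 2 65535  5123     0  9876543  1\n"

# On the firewall itself, pass PROC_FILE.read_text() instead of SAMPLE.
queues = parse_nfqueue_status(SAMPLE)
for q in queues:
    print(f"queue {q['queue_number']}: listener portid {q['peer_portid']}, "
          f"{q['queue_total']} packets waiting, "
          f"{q['queue_dropped']} dropped because the queue was full")
```

If `queue_total` sits at the limit and `queue_dropped` keeps climbing, the process behind `peer_portid` is the one that is not keeping up; the portid is usually, though not always, that process's PID.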