Summary
Add a new collector to expose metrics from /proc/net/netfilter/nfnetlink_queue, enabling monitoring of userspace packet processing queues.
Motivation
NFQUEUE is a critical component for systems running userspace packet processing applications such as:
- fail2ban - Intrusion prevention system
- Suricata - Network threat detection engine (IPS mode)
- Snort - Network intrusion detection system
- Custom packet filtering applications - Any application using libnetfilter_queue
When an NFQUEUE queue becomes full, packets are dropped silently, causing hard-to-diagnose network issues. Currently, there is no way to monitor these queues through node_exporter, which forces operators to inspect /proc/net/netfilter/nfnetlink_queue manually or rely on custom scripts.
Use Cases
1. Detecting Queue Overflows
When userspace applications cannot process packets fast enough, the queue fills up and packets are dropped. Monitoring queue length and drop counters allows operators to:
- Alert before queues overflow
- Identify performance bottlenecks in packet processing applications
- Right-size queue limits based on actual usage
2. Capacity Planning
Understanding queue utilization patterns helps with:
- Scaling decisions for packet processing infrastructure
- Tuning --queue-num and --queue-balance iptables parameters
- Identifying peak traffic periods that stress packet processing
3. Troubleshooting Network Issues
Silent packet drops from full NFQUEUE queues are notoriously difficult to diagnose. Metrics would provide:
- Visibility into drop rates per queue
- Correlation between application performance and queue backlogs
- Historical data for post-incident analysis
Technical Details
Data Source
The data is available in /proc/net/netfilter/nfnetlink_queue:
0 31621 0 2 65531 0 0 50 1
Fields (left to right):
1. queue_number - Queue ID (label)
2. peer_portid - Netlink port ID of userspace application
3. queue_total - Current number of packets in queue
4. copy_mode - Copy mode (0=none, 1=meta, 2=packet)
5. copy_range - Copy range (bytes)
6. queue_dropped - Packets dropped due to full queue
7. user_dropped - Packets dropped in userspace
8. id_sequence - Packet ID sequence number
9. unknown - Reserved field
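The field layout above can be turned into a small parser. The following is a minimal sketch that uses only the standard library; the `queueStats` type and `parseQueueLine` function are hypothetical names chosen for illustration and are not the API of github.com/prometheus/procfs. It assumes whitespace-separated unsigned decimal columns and ignores the trailing reserved field.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// queueStats mirrors the first eight columns described above.
// The type and field names are illustrative, not the procfs library's API.
type queueStats struct {
	QueueNumber  uint64
	PeerPortID   uint64
	QueueTotal   uint64
	CopyMode     uint64
	CopyRange    uint64
	QueueDropped uint64
	UserDropped  uint64
	IDSequence   uint64
}

// parseQueueLine parses one line of /proc/net/netfilter/nfnetlink_queue.
// Any columns after the eighth (the reserved field) are ignored.
func parseQueueLine(line string) (queueStats, error) {
	fields := strings.Fields(line)
	if len(fields) < 8 {
		return queueStats{}, fmt.Errorf("expected at least 8 fields, got %d", len(fields))
	}
	vals := make([]uint64, 8)
	for i := range vals {
		v, err := strconv.ParseUint(fields[i], 10, 64)
		if err != nil {
			return queueStats{}, fmt.Errorf("field %d (%q): %w", i+1, fields[i], err)
		}
		vals[i] = v
	}
	return queueStats{
		QueueNumber:  vals[0],
		PeerPortID:   vals[1],
		QueueTotal:   vals[2],
		CopyMode:     vals[3],
		CopyRange:    vals[4],
		QueueDropped: vals[5],
		UserDropped:  vals[6],
		IDSequence:   vals[7],
	}, nil
}

func main() {
	// Sample row from the Data Source section.
	q, err := parseQueueLine("0 31621 0 2 65531 0 0 50 1")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", q)
}
```

A real collector would most likely call the existing procfs parser instead; the sketch only makes the column order concrete.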
Proposed Metrics
# HELP node_nfqueue_queue_total Current number of packets waiting in queue.
# TYPE node_nfqueue_queue_total gauge
node_nfqueue_queue_total{queue="0"} 0
# HELP node_nfqueue_queue_dropped_total Packets dropped due to full queue.
# TYPE node_nfqueue_queue_dropped_total counter
node_nfqueue_queue_dropped_total{queue="0"} 0
# HELP node_nfqueue_user_dropped_total Packets dropped in userspace.
# TYPE node_nfqueue_user_dropped_total counter
node_nfqueue_user_dropped_total{queue="0"} 0
# HELP node_nfqueue_id_sequence_total Packet ID sequence number.
# TYPE node_nfqueue_id_sequence_total counter
node_nfqueue_id_sequence_total{queue="0"} 50
# HELP node_nfqueue_info Non-numeric metadata about the queue (value is always 1).
# TYPE node_nfqueue_info gauge
node_nfqueue_info{queue="0",peer_portid="31621",copy_mode="packet",copy_range="65531"} 1
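To make the column-to-metric mapping explicit, here is a rough sketch that renders the sample lines for one queue, following the field order listed under Data Source. `copyModeName` and `renderSamples` are hypothetical helpers written for this illustration; an actual collector would emit metrics through the Prometheus client library rather than format text by hand.

```go
package main

import (
	"fmt"
	"strings"
)

// copyModeName maps the numeric copy_mode column to the label value
// used by the proposed info metric (0=none, 1=meta, 2=packet).
func copyModeName(mode uint64) string {
	switch mode {
	case 0:
		return "none"
	case 1:
		return "meta"
	case 2:
		return "packet"
	default:
		return "unknown"
	}
}

// renderSamples formats the proposed sample lines for a single queue.
// Arguments follow the column order of the proc file.
func renderSamples(queue, peerPortID, queueTotal, copyMode, copyRange,
	queueDropped, userDropped, idSequence uint64) string {
	var b strings.Builder
	fmt.Fprintf(&b, `node_nfqueue_queue_total{queue="%d"} %d`+"\n", queue, queueTotal)
	fmt.Fprintf(&b, `node_nfqueue_queue_dropped_total{queue="%d"} %d`+"\n", queue, queueDropped)
	fmt.Fprintf(&b, `node_nfqueue_user_dropped_total{queue="%d"} %d`+"\n", queue, userDropped)
	fmt.Fprintf(&b, `node_nfqueue_id_sequence_total{queue="%d"} %d`+"\n", queue, idSequence)
	fmt.Fprintf(&b, `node_nfqueue_info{queue="%d",peer_portid="%d",copy_mode="%s",copy_range="%d"} 1`+"\n",
		queue, peerPortID, copyModeName(copyMode), copyRange)
	return b.String()
}

func main() {
	// Columns of the sample row "0 31621 0 2 65531 0 0 50 1", minus the reserved field.
	fmt.Print(renderSamples(0, 31621, 0, 2, 65531, 0, 0, 50))
}
```

With the sample row, the second column (31621) becomes the `peer_portid` label and the third column (0) becomes the queue length.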
procfs Support
The parsing logic is already implemented in github.com/prometheus/procfs as of v0.18.0 (see procfs#677 under References).
Node exporter already uses procfs v0.19.2, so no dependency update is required.
Implementation Notes
- Default state: Disabled (similar to other specialized collectors)
- Build tag: !nonf_queue or similar
- No special permissions required: The /proc/net/netfilter/nfnetlink_queue file is world-readable
- Graceful degradation: Return ErrNoData if the file doesn't exist (NFQUEUE not in use)
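The graceful-degradation note could look roughly like the sketch below. It is standalone, so `errNoData` is a local stand-in for the collector package's ErrNoData sentinel, and `readQueueFile` is a hypothetical helper, not existing node_exporter code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoData is a local stand-in for the collector's ErrNoData sentinel,
// defined here only so the sketch compiles on its own.
var errNoData = errors.New("collector returned no data")

// readQueueFile returns the raw file contents, or errNoData when the
// file is absent (for example, NFQUEUE is not in use on this host).
func readQueueFile(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return nil, errNoData
	}
	if err != nil {
		return nil, fmt.Errorf("reading %s: %w", path, err)
	}
	return data, nil
}

func main() {
	_, err := readQueueFile("/proc/net/netfilter/nfnetlink_queue")
	switch {
	case err == errNoData:
		fmt.Println("NFQUEUE not in use; skipping collection")
	case err != nil:
		fmt.Println("error:", err)
	default:
		fmt.Println("queue file present")
	}
}
```

Returning a dedicated sentinel for the missing-file case lets the caller treat "NFQUEUE not in use" as a non-error while still surfacing genuine read failures such as permission problems.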
Example Alert Rules
groups:
  - name: nfqueue
    rules:
      - alert: NFQueueDropping
        expr: rate(node_nfqueue_queue_dropped_total[5m]) > 0
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "NFQUEUE {{ $labels.queue }} is dropping packets"
          description: "Queue {{ $labels.queue }} on {{ $labels.instance }} has dropped {{ $value }} packets/sec"
      - alert: NFQueueBacklog
        expr: node_nfqueue_queue_total > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "NFQUEUE {{ $labels.queue }} has high backlog"
          description: "Queue {{ $labels.queue }} on {{ $labels.instance }} has {{ $value }} packets waiting"
References
- procfs implementation: add netfilter queue support procfs#677
- Netfilter NFQUEUE documentation: https://netfilter.org/projects/libnetfilter_queue/
- Kernel source: net/netfilter/nfnetlink_queue.c