Skip to content

Conversation

@nfebe
Copy link
Contributor

@nfebe nfebe commented Dec 23, 2025

No description provided.

…dict

Previously, blocked IPs were stored in the database but never applied
to nginx. Now IPs are enforced instantly via nginx's access_by_lua
checking the shared_dict, with immediate notification on block/unblock.

Signed-off-by: nfebe <fenn25.fn@gmail.com>
Autoblock now triggers on:
- 10+ 404 responses in 2 min (path probing)
- 5+ auth failures (401/403) in 2 min
- 20+ unique paths tried in 2 min (scanning)
- 30+ hits to same path in 2 min (hammering)
- Known scanner user agents
- High request rate (60+ req/2min)

Signed-off-by: nfebe <fenn25.fn@gmail.com>
@sourceant
Copy link

sourceant bot commented Dec 23, 2025

Code Review Summary

✨ This pull request introduces a comprehensive real-time IP blocking and security event detection system, significantly enhancing the agent's ability to protect deployed services. The changes involve substantial updates to the security manager, API handlers, Nginx Lua scripts, and health checks to facilitate granular threat detection and immediate mitigation. The new Detector logic, incorporating various behavioral heuristics, is a major improvement, allowing for more sophisticated identification of malicious activity.

🚀 Key Improvements

  • Real-time IP blocking and unblocking capabilities integrated with Nginx Lua for immediate enforcement.
  • Enhanced security event classification and auto-blocking logic in security/detector.go with granular thresholds for identifying scanners, high request rates, and suspicious access patterns.
  • Updated API endpoints in internal/api/security_handlers.go for managing security events, blocked IPs, and protected routes.
  • Comprehensive health checks (internal/infra/manager.go) now validate the Nginx security setup, including shared dictionaries and internal API reachability.
  • Centralized configuration for AutoBlockDuration in the security manager allows for flexible auto-blocking policies.
  • Agent communication with Nginx's internal API for instant security actions, improving response times to threats.

💡 Minor Suggestions

  • Define magic numbers (e.g., default TTLs, 1-year block duration) as named constants in both Go and Lua for improved readability and maintainability.
  • Ensure consistent logging messages across handlers, especially for errors, to aid debugging and troubleshooting.
  • Centralize internal API paths as constants to reduce redundancy and potential for typos across the codebase.

🚨 Critical Issues

  • The current implementation of inter-process communication using docker exec curl is inefficient and introduces brittle dependencies. This should be refactored to use native Go HTTP clients for notifyNginxBlockIP, notifyNginxUnblockIP, and checkNginxInternalAPIReachable to improve robustness, performance, and reduce external dependencies.
  • Address the inconsistency in how ttl=0 (intended for 'permanent' blocks or 'no expiration') is handled between the Go agent and the Nginx Lua security.lua script. The Go agent currently translates 0 into a 1-year duration, which might conflict with ngx.shared.dict's interpretation of 0 as 'no expiration'. Clarify the intended behavior for permanent blocks and align the implementation on both sides.
  • Implement logging for errors that occur during m.db.BlockIP calls within the IngestEvent function in internal/security/manager.go to prevent silent failures in the auto-blocking mechanism.

@sourceant
Copy link

sourceant bot commented Dec 23, 2025

🔍 Code Review

💡 1. **templates/infra/nginx/lua/security.lua** (Lines 280-281) - STYLE

Adding local to functions json_response, handle_block_ip_request, handle_unblock_ip_request, and handle_refresh_request ensures they are scoped to the current file and do not pollute the global namespace. This is a good practice for Lua module development.

Suggested Code:

 local function json_response(status, data)
     ngx.status = status
 local function handle_block_ip_request()
     if ngx.req.get_method() ~= "POST" then

Current Code:

 local function json_response(status, data)
     ngx.status = status
 function _M.handle_block_ip_request()
     if ngx.req.get_method() ~= "POST" then
💡 2. **templates/infra/nginx/lua/security.lua** (Line 285) - CLARITY

It's a good practice to log errors with more context, including the ngx.var.remote_addr when handling requests to internal APIs, especially if there's a problem decoding the JSON. This helps in debugging potential malformed requests.

Suggested Code:

         json_response(400, {error = "Invalid JSON: " .. (err or "unknown")})
         ngx.log(ngx.ERR, "Invalid JSON received from ", ngx.var.remote_addr, ": ", err or "unknown")

Current Code:

         json_response(400, {error = "Invalid JSON: " .. (err or "unknown")})
💡 3. **templates/infra/nginx/lua/security.lua** (Line 288) - CLARITY

The default TTL for blocking in the Lua script is 24 hours (86400 seconds) if not provided by the agent. This is reasonable. However, the agent's notifyNginxBlockIP function uses 86400 * 365 (1 year) if ttlSeconds is 0 or less. Explicitly documenting this default in the Lua code and perhaps making it align with the agent's internal 'permanent' duration (or making the agent send a specific 'permanent' flag) would improve clarity.

Suggested Code:

    local ip = data.ip
    local ttl = data.ttl or (365 * 86400) -- Default to 1 year if not specified, matching agent's "permanent" block

Current Code:

    local ip = data.ip
    local ttl = data.ttl or 86400  -- default 24 hours
💡 4. **templates/infra/nginx/lua/security.lua** (Line 303) - CLARITY

Consistent error logging for internal API requests is beneficial for debugging. If a shared dictionary is unexpectedly unavailable, it's a critical error that should be logged server-side.

Suggested Code:

        json_response(500, {error = "Shared dict not available"})
        ngx.log(ngx.ERR, "Shared dict 'blocked_ips' not available for block IP request from ", ngx.var.remote_addr)

Current Code:

        json_response(500, {error = "Shared dict not available"})
💡 5. **templates/infra/nginx/lua/security.lua** (Lines 310-311) - CLARITY

Consistent error logging is crucial. If dict:set fails, it's important to log this server-side with context.

Suggested Code:

    if not ok then
        ngx.log(ngx.ERR, "Failed to set IP ", ip, " in blocked_ips dict for block request from ", ngx.var.remote_addr, ": ", set_err or "unknown")
        json_response(500, {error = "Failed to block IP: " .. (set_err or "unknown")})
        return
    end

Current Code:

    if not ok then
        json_response(500, {error = "Failed to block IP: " .. (set_err or "unknown")})
        return
    end
💡 6. **templates/infra/nginx/lua/security.lua** (Line 315) - STYLE

Adding local to functions json_response, handle_block_ip_request, handle_unblock_ip_request, and handle_refresh_request ensures they are scoped to the current file and do not pollute the global namespace. This is a good practice for Lua module development.

Suggested Code:

 local function handle_unblock_ip_request()
     if ngx.req.get_method() ~= "POST" then

Current Code:

 function _M.handle_unblock_ip_request()
     if ngx.req.get_method() ~= "POST" then
💡 7. **templates/infra/nginx/lua/security.lua** (Line 319) - CLARITY

Log internal API request errors with context. If JSON decoding fails, log it with the source IP.

Suggested Code:

         json_response(400, {error = "Invalid JSON: " .. (err or "unknown")})
         ngx.log(ngx.ERR, "Invalid JSON received for unblock IP request from ", ngx.var.remote_addr, ": ", err or "unknown")

Current Code:

         json_response(400, {error = "Invalid JSON: " .. (err or "unknown")})
💡 8. **templates/infra/nginx/lua/security.lua** (Line 331) - CLARITY

Consistent error logging for internal API requests. If the shared dictionary is unavailable, it's a critical error.

Suggested Code:

        json_response(500, {error = "Shared dict not available"})
        ngx.log(ngx.ERR, "Shared dict 'blocked_ips' not available for unblock IP request from ", ngx.var.remote_addr)

Current Code:

        json_response(500, {error = "Shared dict not available"})
💡 9. **templates/infra/nginx/lua/security.lua** (Line 336) - STYLE

Adding local to functions json_response, handle_block_ip_request, handle_unblock_ip_request, and handle_refresh_request ensures they are scoped to the current file and do not pollute the global namespace. This is a good practice for Lua module development.

Suggested Code:

 local function handle_refresh_request()
     if ngx.req.get_method() ~= "POST" then

Current Code:

 function _M.handle_refresh_request()
     if ngx.req.get_method() ~= "POST" then

Verdict: APPROVE

Posted as a comment because posting a review failed.

- Use configurable AutoBlockDuration instead of hardcoded 24h
- Add sync notice comment to e2e test Lua file
- Use cjson.encode for all Lua JSON responses

Signed-off-by: nfebe <fenn25.fn@gmail.com>
- Add nginx_internal_api_reachable health check to verify port 8081
- Add nginx_conf_has_blocked_ips_dict, nginx_conf_has_ip_blocking,
  nginx_conf_has_internal_api to critical health checks
- Enhance Go ingestSecurityEvent logging with client IP and event context
- Enhance Lua handlers with detailed logging (client IP, body preview,
  operation context) for better debuggability
- Sync e2e test security.lua with template changes

Signed-off-by: nfebe <fenn25.fn@gmail.com>
Copy link

@sourceant sourceant bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review complete. See the overview comment for a summary.

// notifyNginxBlockIP notifies nginx to immediately add an IP to its blocked list
func (s *Server) notifyNginxBlockIP(ip string, ttlSeconds int) error {
if s.config.Nginx.ContainerName == "" {
return fmt.Errorf("nginx container name not configured")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value 86400 * 365 represents 1 year in seconds. It's a 'magic number' that would be clearer if defined as a named constant (e.g., security.PermanentBlockDurationSeconds). This improves readability and makes future modifications easier.

Suggested change
return fmt.Errorf("nginx container name not configured")
ttlSeconds = security.PermanentBlockDurationSeconds // Defined as 86400 * 365 elsewhere

end

local ip = data.ip
local ttl = data.ttl or 86400 -- default 24 hours
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ttl default of 86400 (24 hours) is a magic number. It should be defined as a named constant for clarity and easier maintenance. More critically, clarify the handling of ttl=0:

  • In Go's notifyNginxBlockIP, ttlSeconds <= 0 results in a 1-year block. This means the Lua script will never receive ttl=0 from the agent.
  • If ngx.shared.dict:set with exptime=0 means 'no expiration' (permanent), this is currently only achievable by calling the Lua internal API directly with ttl=0.

To ensure consistency, consider changing the Go agent to send 0 for permanent blocks (if that's the desired behavior for ngx.shared.dict), and then the Lua script should explicitly handle 0 as 'no expiration'. If the agent's definition of a 'permanent' block is truly 1 year, then the Lua default for nil or negative values should align, and 86400 should be a constant.

Suggested change
local ttl = data.ttl or 86400 -- default 24 hours
local DEFAULT_TTL = 86400 -- Default 24 hours (define as a constant)
local ttl = data.ttl
if ttl == nil or ttl < 0 then
ttl = DEFAULT_TTL -- Default if not provided or invalid
end
-- If ttl is explicitly 0, it means 'no expiration' for ngx.shared.dict:set
-- Ensure this aligns with the agent's definition of a permanent block.

local ip = data.ip
local ttl = data.ttl or 86400 -- default 24 hours

if not ip then
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good practice to use named constants for API endpoints. Defining _internal/security/block-ip as a constant would improve maintainability across both Go and Lua codebases.

Suggested change
if not ip then
local INTERNAL_BLOCK_IP_ENDPOINT = "_internal/security/block-ip" -- Define as a global constant
if not ip then
ngx.log(ngx.ERR, "block-ip: missing IP in request from ", client_ip)

Comment on lines +743 to +759
"ttl": ttlSeconds,
}
jsonPayload, err := json.Marshal(payload)
if err != nil {
return fmt.Errorf("failed to marshal payload: %w", err)
}

curlCmd := fmt.Sprintf(
`curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/block-ip`,
string(jsonPayload),
)

cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using docker exec curl for inter-container communication is generally discouraged. It introduces overhead, relies on curl and sh being available in the Nginx container, and can have security implications if the curlCmd is not properly sanitized. A more robust and performant solution would be to use a native Go HTTP client to send requests to Nginx's internal API, leveraging Docker's internal networking (e.g., http://nginx_container_name:8081). This would remove external dependencies and simplify error handling.

Additionally, consider defining the internal API path _internal/security/block-ip as a constant.

Suggested change
"ttl": ttlSeconds,
}
jsonPayload, err := json.Marshal(payload)
if err != nil {
return fmt.Errorf("failed to marshal payload: %w", err)
}
curlCmd := fmt.Sprintf(
`curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/block-ip`,
string(jsonPayload),
)
cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err)
}
// Recommended: Use a native Go HTTP client for robustness and performance.
// Example (conceptual, actual implementation might vary based on network setup):
// client := &http.Client{Timeout: 5 * time.Second}
// req, err := http.NewRequest("POST", fmt.Sprintf("http://%s:8081/_internal/security/block-ip", s.config.Nginx.ContainerName), bytes.NewBuffer(jsonPayload))
// if err != nil { return fmt.Errorf("failed to create request: %w", err) }
// req.Header.Set("Content-Type", "application/json")
// resp, err := client.Do(req)
// if err != nil { return fmt.Errorf("failed to notify nginx: %w", err) }
// defer resp.Body.Close()
// if resp.StatusCode != http.StatusOK { /* handle error */ }
// Current implementation using docker exec curl:
curlCmd := fmt.Sprintf(
`curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/block-ip`,
string(jsonPayload),
)
cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err)
}

Comment on lines +776 to +788
return fmt.Errorf("failed to marshal payload: %w", err)
}

curlCmd := fmt.Sprintf(
`curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/unblock-ip`,
string(jsonPayload),
)

cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to notifyNginxBlockIP, using docker exec curl for notifyNginxUnblockIP introduces unnecessary dependencies and overhead. A native Go HTTP client is preferred for inter-process communication within a Docker environment. Also, consider defining the _internal/security/unblock-ip path as a constant.

Suggested change
return fmt.Errorf("failed to marshal payload: %w", err)
}
curlCmd := fmt.Sprintf(
`curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/unblock-ip`,
string(jsonPayload),
)
cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err)
}
// Recommended: Use a native Go HTTP client.
// (Similar conceptual implementation as notifyNginxBlockIP)
curlCmd := fmt.Sprintf(
`curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/unblock-ip`,
string(jsonPayload),
)
cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err)
}

Comment on lines +1037 to +1046
func (m *Manager) checkNginxInternalAPIReachable() bool {
testCmd := "curl -s --connect-timeout 2 --max-time 5 -X POST http://127.0.0.1:8081/_internal/security/refresh-blocked-ips 2>/dev/null | grep -q success && echo yes || echo no"

cmd := exec.Command("docker", "exec", m.config.Nginx.ContainerName, "sh", "-c", testCmd)
output, err := cmd.Output()
if err != nil {
return false
}
return strings.Contains(string(output), "yes")
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checkNginxInternalAPIReachable function also uses docker exec curl. While acceptable for a health check, this still carries the overhead and dependency concerns mentioned for the notifyNginx*IP functions. If curl is not available or its behavior changes in the Nginx container, this health check might fail unexpectedly. Consider a more direct Go-based health check if possible within your container networking setup.

Suggested change
func (m *Manager) checkNginxInternalAPIReachable() bool {
testCmd := "curl -s --connect-timeout 2 --max-time 5 -X POST http://127.0.0.1:8081/_internal/security/refresh-blocked-ips 2>/dev/null | grep -q success && echo yes || echo no"
cmd := exec.Command("docker", "exec", m.config.Nginx.ContainerName, "sh", "-c", testCmd)
output, err := cmd.Output()
if err != nil {
return false
}
return strings.Contains(string(output), "yes")
}
func (m *Manager) checkNginxInternalAPIReachable() bool {
// Recommended: Use a native Go HTTP client for a more reliable health check.
// Example (conceptual):
// client := &http.Client{Timeout: 2 * time.Second}
// resp, err := client.Post(fmt.Sprintf("http://%s:8081/_internal/security/refresh-blocked-ips", m.config.Nginx.ContainerName), "application/json", nil)
// if err != nil { return false }
// defer resp.Body.Close()
// return resp.StatusCode == http.StatusOK // Or parse body for a specific success message
// Current implementation:
testCmd := "curl -s --connect-timeout 2 --max-time 5 -X POST http://127.0.0.1:8081/_internal/security/refresh-blocked-ips 2>/dev/null | grep -q success && echo yes || echo no"
cmd := exec.Command("docker", "exec", m.config.Nginx.ContainerName, "sh", "-c", testCmd)
output, err := cmd.Output()
if err != nil {
return false
}
return strings.Contains(string(output), "yes")
}

Comment on lines +81 to +85
_, err := m.db.BlockIP(event.SourceIP, "Auto-blocked due to suspicious activity", &expiresAt, true)
if err == nil {
result.AutoBlocked = true
result.BlockedIP = event.SourceIP
result.BlockTTL = int(autoBlockDuration.Seconds())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If m.db.BlockIP returns an error during auto-blocking, it's currently suppressed (if err == nil). While the overall IngestEvent might still return a result, this specific error should ideally be logged to ensure visibility of failures in the auto-blocking mechanism. Silent failures can lead to IPs not being blocked as expected.

Suggested change
_, err := m.db.BlockIP(event.SourceIP, "Auto-blocked due to suspicious activity", &expiresAt, true)
if err == nil {
result.AutoBlocked = true
result.BlockedIP = event.SourceIP
result.BlockTTL = int(autoBlockDuration.Seconds())
_, err := m.db.BlockIP(event.SourceIP, "Auto-blocked due to suspicious activity", &expiresAt, true)
if err != nil {
// Log the error but don't prevent the event from being returned
log.Printf("Security manager: failed to auto-block IP %s: %v", event.SourceIP, err)
} else {
result.AutoBlocked = true
result.BlockedIP = event.SourceIP
result.BlockTTL = int(autoBlockDuration.Seconds())
}

@nfebe nfebe merged commit ef5aff1 into main Dec 23, 2025
5 checks passed
@nfebe nfebe deleted the fix/ip-blocking-lua-shared-dict branch December 23, 2025 13:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants