-
Notifications
You must be signed in to change notification settings - Fork 0
fix(security): Enforce blocks, harden auto blocking #46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…dict Previously, blocked IPs were stored in the database but never applied to nginx. Now IPs are enforced instantly via nginx's access_by_lua checking the shared_dict, with immediate notification on block/unblock. Signed-off-by: nfebe <fenn25.fn@gmail.com>
Autoblock now triggers on: - 10+ 404 responses in 2 min (path probing) - 5+ auth failures (401/403) in 2 min - 20+ unique paths tried in 2 min (scanning) - 30+ hits to same path in 2 min (hammering) - Known scanner user agents - High request rate (60+ req/2min) Signed-off-by: nfebe <fenn25.fn@gmail.com>
Code Review Summary✨ This pull request introduces a comprehensive real-time IP blocking and security event detection system, significantly enhancing the agent's ability to protect deployed services. The changes involve substantial updates to the security manager, API handlers, Nginx Lua scripts, and health checks to facilitate granular threat detection and immediate mitigation. The new 🚀 Key Improvements
💡 Minor Suggestions
🚨 Critical Issues
|
🔍 Code Review💡 1. **templates/infra/nginx/lua/security.lua** (Lines 280-281) - STYLEAdding Suggested Code: Current Code: local function json_response(status, data)
ngx.status = status
function _M.handle_block_ip_request()
if ngx.req.get_method() ~= "POST" then💡 2. **templates/infra/nginx/lua/security.lua** (Line 285) - CLARITYIt's a good practice to log errors with more context, including the Suggested Code: Current Code: json_response(400, {error = "Invalid JSON: " .. (err or "unknown")})💡 3. **templates/infra/nginx/lua/security.lua** (Line 288) - CLARITYThe default TTL for blocking in the Lua script is 24 hours (86400 seconds) if not provided by the agent. This is reasonable. However, the agent's Suggested Code: Current Code: local ip = data.ip
local ttl = data.ttl or 86400 -- default 24 hours💡 4. **templates/infra/nginx/lua/security.lua** (Line 303) - CLARITYConsistent error logging for internal API requests is beneficial for debugging. If a shared dictionary is unexpectedly unavailable, it's a critical error that should be logged server-side. Suggested Code: Current Code: json_response(500, {error = "Shared dict not available"})💡 5. **templates/infra/nginx/lua/security.lua** (Lines 310-311) - CLARITYConsistent error logging is crucial. If Suggested Code: Current Code: if not ok then
json_response(500, {error = "Failed to block IP: " .. (set_err or "unknown")})
return
end💡 6. **templates/infra/nginx/lua/security.lua** (Line 315) - STYLEAdding Suggested Code: Current Code: function _M.handle_unblock_ip_request()
if ngx.req.get_method() ~= "POST" then💡 7. **templates/infra/nginx/lua/security.lua** (Line 319) - CLARITYLog internal API request errors with context. If JSON decoding fails, log it with the source IP. Suggested Code: Current Code: json_response(400, {error = "Invalid JSON: " .. (err or "unknown")})💡 8. **templates/infra/nginx/lua/security.lua** (Line 331) - CLARITYConsistent error logging for internal API requests. If the shared dictionary is unavailable, it's a critical error. Suggested Code: Current Code: json_response(500, {error = "Shared dict not available"})💡 9. **templates/infra/nginx/lua/security.lua** (Line 336) - STYLEAdding Suggested Code: Current Code: function _M.handle_refresh_request()
if ngx.req.get_method() ~= "POST" thenVerdict: APPROVE Posted as a comment because posting a review failed. |
- Use configurable AutoBlockDuration instead of hardcoded 24h - Add sync notice comment to e2e test Lua file - Use cjson.encode for all Lua JSON responses Signed-off-by: nfebe <fenn25.fn@gmail.com>
- Add nginx_internal_api_reachable health check to verify port 8081 - Add nginx_conf_has_blocked_ips_dict, nginx_conf_has_ip_blocking, nginx_conf_has_internal_api to critical health checks - Enhance Go ingestSecurityEvent logging with client IP and event context - Enhance Lua handlers with detailed logging (client IP, body preview, operation context) for better debuggability - Sync e2e test security.lua with template changes Signed-off-by: nfebe <fenn25.fn@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review complete. See the overview comment for a summary.
| // notifyNginxBlockIP notifies nginx to immediately add an IP to its blocked list | ||
| func (s *Server) notifyNginxBlockIP(ip string, ttlSeconds int) error { | ||
| if s.config.Nginx.ContainerName == "" { | ||
| return fmt.Errorf("nginx container name not configured") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value 86400 * 365 represents 1 year in seconds. It's a 'magic number' that would be clearer if defined as a named constant (e.g., security.PermanentBlockDurationSeconds). This improves readability and makes future modifications easier.
| return fmt.Errorf("nginx container name not configured") | |
| ttlSeconds = security.PermanentBlockDurationSeconds // Defined as 86400 * 365 elsewhere |
| end | ||
|
|
||
| local ip = data.ip | ||
| local ttl = data.ttl or 86400 -- default 24 hours |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ttl default of 86400 (24 hours) is a magic number. It should be defined as a named constant for clarity and easier maintenance. More critically, clarify the handling of ttl=0:
- In Go's
notifyNginxBlockIP,ttlSeconds <= 0results in a 1-year block. This means the Lua script will never receivettl=0from the agent. - If
ngx.shared.dict:setwithexptime=0means 'no expiration' (permanent), this is currently only achievable by calling the Lua internal API directly withttl=0.
To ensure consistency, consider changing the Go agent to send 0 for permanent blocks (if that's the desired behavior for ngx.shared.dict), and then the Lua script should explicitly handle 0 as 'no expiration'. If the agent's definition of a 'permanent' block is truly 1 year, then the Lua default for nil or negative values should align, and 86400 should be a constant.
| local ttl = data.ttl or 86400 -- default 24 hours | |
| local DEFAULT_TTL = 86400 -- Default 24 hours (define as a constant) | |
| local ttl = data.ttl | |
| if ttl == nil or ttl < 0 then | |
| ttl = DEFAULT_TTL -- Default if not provided or invalid | |
| end | |
| -- If ttl is explicitly 0, it means 'no expiration' for ngx.shared.dict:set | |
| -- Ensure this aligns with the agent's definition of a permanent block. |
| local ip = data.ip | ||
| local ttl = data.ttl or 86400 -- default 24 hours | ||
|
|
||
| if not ip then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good practice to use named constants for API endpoints. Defining _internal/security/block-ip as a constant would improve maintainability across both Go and Lua codebases.
| if not ip then | |
| local INTERNAL_BLOCK_IP_ENDPOINT = "_internal/security/block-ip" -- Define as a global constant | |
| if not ip then | |
| ngx.log(ngx.ERR, "block-ip: missing IP in request from ", client_ip) |
| "ttl": ttlSeconds, | ||
| } | ||
| jsonPayload, err := json.Marshal(payload) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to marshal payload: %w", err) | ||
| } | ||
|
|
||
| curlCmd := fmt.Sprintf( | ||
| `curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/block-ip`, | ||
| string(jsonPayload), | ||
| ) | ||
|
|
||
| cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd) | ||
| output, err := cmd.CombinedOutput() | ||
| if err != nil { | ||
| return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using docker exec curl for inter-container communication is generally discouraged. It introduces overhead, relies on curl and sh being available in the Nginx container, and can have security implications if the curlCmd is not properly sanitized. A more robust and performant solution would be to use a native Go HTTP client to send requests to Nginx's internal API, leveraging Docker's internal networking (e.g., http://nginx_container_name:8081). This would remove external dependencies and simplify error handling.
Additionally, consider defining the internal API path _internal/security/block-ip as a constant.
| "ttl": ttlSeconds, | |
| } | |
| jsonPayload, err := json.Marshal(payload) | |
| if err != nil { | |
| return fmt.Errorf("failed to marshal payload: %w", err) | |
| } | |
| curlCmd := fmt.Sprintf( | |
| `curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/block-ip`, | |
| string(jsonPayload), | |
| ) | |
| cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd) | |
| output, err := cmd.CombinedOutput() | |
| if err != nil { | |
| return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err) | |
| } | |
| // Recommended: Use a native Go HTTP client for robustness and performance. | |
| // Example (conceptual, actual implementation might vary based on network setup): | |
| // client := &http.Client{Timeout: 5 * time.Second} | |
| // req, err := http.NewRequest("POST", fmt.Sprintf("http://%s:8081/_internal/security/block-ip", s.config.Nginx.ContainerName), bytes.NewBuffer(jsonPayload)) | |
| // if err != nil { return fmt.Errorf("failed to create request: %w", err) } | |
| // req.Header.Set("Content-Type", "application/json") | |
| // resp, err := client.Do(req) | |
| // if err != nil { return fmt.Errorf("failed to notify nginx: %w", err) } | |
| // defer resp.Body.Close() | |
| // if resp.StatusCode != http.StatusOK { /* handle error */ } | |
| // Current implementation using docker exec curl: | |
| curlCmd := fmt.Sprintf( | |
| `curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/block-ip`, | |
| string(jsonPayload), | |
| ) | |
| cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd) | |
| output, err := cmd.CombinedOutput() | |
| if err != nil { | |
| return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err) | |
| } |
| return fmt.Errorf("failed to marshal payload: %w", err) | ||
| } | ||
|
|
||
| curlCmd := fmt.Sprintf( | ||
| `curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/unblock-ip`, | ||
| string(jsonPayload), | ||
| ) | ||
|
|
||
| cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd) | ||
| output, err := cmd.CombinedOutput() | ||
| if err != nil { | ||
| return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to notifyNginxBlockIP, using docker exec curl for notifyNginxUnblockIP introduces unnecessary dependencies and overhead. A native Go HTTP client is preferred for inter-process communication within a Docker environment. Also, consider defining the _internal/security/unblock-ip path as a constant.
| return fmt.Errorf("failed to marshal payload: %w", err) | |
| } | |
| curlCmd := fmt.Sprintf( | |
| `curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/unblock-ip`, | |
| string(jsonPayload), | |
| ) | |
| cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd) | |
| output, err := cmd.CombinedOutput() | |
| if err != nil { | |
| return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err) | |
| } | |
| // Recommended: Use a native Go HTTP client. | |
| // (Similar conceptual implementation as notifyNginxBlockIP) | |
| curlCmd := fmt.Sprintf( | |
| `curl -s -X POST -H "Content-Type: application/json" -d '%s' http://127.0.0.1:8081/_internal/security/unblock-ip`, | |
| string(jsonPayload), | |
| ) | |
| cmd := exec.Command("docker", "exec", s.config.Nginx.ContainerName, "sh", "-c", curlCmd) | |
| output, err := cmd.CombinedOutput() | |
| if err != nil { | |
| return fmt.Errorf("failed to notify nginx: %s - %w", string(output), err) | |
| } |
| func (m *Manager) checkNginxInternalAPIReachable() bool { | ||
| testCmd := "curl -s --connect-timeout 2 --max-time 5 -X POST http://127.0.0.1:8081/_internal/security/refresh-blocked-ips 2>/dev/null | grep -q success && echo yes || echo no" | ||
|
|
||
| cmd := exec.Command("docker", "exec", m.config.Nginx.ContainerName, "sh", "-c", testCmd) | ||
| output, err := cmd.Output() | ||
| if err != nil { | ||
| return false | ||
| } | ||
| return strings.Contains(string(output), "yes") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checkNginxInternalAPIReachable function also uses docker exec curl. While acceptable for a health check, this still carries the overhead and dependency concerns mentioned for the notifyNginx*IP functions. If curl is not available or its behavior changes in the Nginx container, this health check might fail unexpectedly. Consider a more direct Go-based health check if possible within your container networking setup.
| func (m *Manager) checkNginxInternalAPIReachable() bool { | |
| testCmd := "curl -s --connect-timeout 2 --max-time 5 -X POST http://127.0.0.1:8081/_internal/security/refresh-blocked-ips 2>/dev/null | grep -q success && echo yes || echo no" | |
| cmd := exec.Command("docker", "exec", m.config.Nginx.ContainerName, "sh", "-c", testCmd) | |
| output, err := cmd.Output() | |
| if err != nil { | |
| return false | |
| } | |
| return strings.Contains(string(output), "yes") | |
| } | |
| func (m *Manager) checkNginxInternalAPIReachable() bool { | |
| // Recommended: Use a native Go HTTP client for a more reliable health check. | |
| // Example (conceptual): | |
| // client := &http.Client{Timeout: 2 * time.Second} | |
| // resp, err := client.Post(fmt.Sprintf("http://%s:8081/_internal/security/refresh-blocked-ips", m.config.Nginx.ContainerName), "application/json", nil) | |
| // if err != nil { return false } | |
| // defer resp.Body.Close() | |
| // return resp.StatusCode == http.StatusOK // Or parse body for a specific success message | |
| // Current implementation: | |
| testCmd := "curl -s --connect-timeout 2 --max-time 5 -X POST http://127.0.0.1:8081/_internal/security/refresh-blocked-ips 2>/dev/null | grep -q success && echo yes || echo no" | |
| cmd := exec.Command("docker", "exec", m.config.Nginx.ContainerName, "sh", "-c", testCmd) | |
| output, err := cmd.Output() | |
| if err != nil { | |
| return false | |
| } | |
| return strings.Contains(string(output), "yes") | |
| } |
| _, err := m.db.BlockIP(event.SourceIP, "Auto-blocked due to suspicious activity", &expiresAt, true) | ||
| if err == nil { | ||
| result.AutoBlocked = true | ||
| result.BlockedIP = event.SourceIP | ||
| result.BlockTTL = int(autoBlockDuration.Seconds()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If m.db.BlockIP returns an error during auto-blocking, it's currently suppressed (if err == nil). While the overall IngestEvent might still return a result, this specific error should ideally be logged to ensure visibility of failures in the auto-blocking mechanism. Silent failures can lead to IPs not being blocked as expected.
| _, err := m.db.BlockIP(event.SourceIP, "Auto-blocked due to suspicious activity", &expiresAt, true) | |
| if err == nil { | |
| result.AutoBlocked = true | |
| result.BlockedIP = event.SourceIP | |
| result.BlockTTL = int(autoBlockDuration.Seconds()) | |
| _, err := m.db.BlockIP(event.SourceIP, "Auto-blocked due to suspicious activity", &expiresAt, true) | |
| if err != nil { | |
| // Log the error but don't prevent the event from being returned | |
| log.Printf("Security manager: failed to auto-block IP %s: %v", event.SourceIP, err) | |
| } else { | |
| result.AutoBlocked = true | |
| result.BlockedIP = event.SourceIP | |
| result.BlockTTL = int(autoBlockDuration.Seconds()) | |
| } |
No description provided.