Skip to content

Conversation

@smallseacreature
Copy link
Contributor

@smallseacreature smallseacreature commented Dec 22, 2025

Problem

httpx can allocate excessive memory and eventually OOM when scanning services that return unbounded response bodies (e.g., CHARGEN / infinite streams).

Root cause

The CLI defaults for -rstr / -rsts are math.MaxInt32 (~2GB). This effectively allows multi-GB in-memory reads per response, and raw response handling can amplify memory usage due to additional buffering and copies.

Fix

  • Set safer default maximum response sizes for -rstr / -rsts to 10MB.
  • Ensure raw response construction uses the bounded body read (via the configured max) rather than paths that can consume large or unbounded bodies.

Docs

README default values for -rstr / -rsts updated accordingly.

Repro / Verification

Using a local infinite-stream server at http://127.0.0.1:8095, httpx now exits normally without OOM using default settings.

Fixes #1964.

Summary by CodeRabbit

  • New Features

    • Response body reading is now capped at a configurable size limit (default: 512MB) to prevent excessive memory consumption.
  • Refactor

    • Internal code improvements for URL scheme handling.
  • Chores

    • Added dependency for byte size parsing utilities.

✏️ Tip: You can customize this high-level summary in your review settings.

@auto-assign auto-assign bot requested a review from Mzack9999 December 22, 2025 19:50
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
common/httpx/httpx.go (1)

258-271: Bounded body reading is correct, but gzip retry logic needs review.

The bounded body read using io.LimitReader correctly prevents OOM by capping memory allocation. However, there are concerns with the gzip retry logic:

  1. Inconsistency with PR summary: The AI-generated summary states "Removes the prior edge-case handling for a gzip header mismatch (which retried the request with Accept-Encoding set to identity and used a goto control flow)", but this logic is still present in lines 261-267.

  2. Code smell - goto usage: The goto statement makes control flow harder to follow and is generally discouraged in Go. Consider refactoring to use a loop or helper function.

  3. Resource cleanup: When retrying via goto get_response, ensure the original httpresp.Body from the failed attempt is properly closed to avoid resource leaks.

Please verify:

  1. Is the gzip retry logic still needed with the new bounded-body approach?
  2. Is the original response body properly closed before the retry?
Optional refactor to eliminate goto

Consider restructuring the code to use a retry loop instead:

const maxAttempts = 2
for attempt := 0; attempt < maxAttempts; attempt++ {
    if attempt == 1 {
        // Second attempt: force identity encoding
        req.Header.Set("Accept-Encoding", "identity")
    }
    
    httpresp, err = h.getResponse(req, unsafeOptions)
    if httpresp == nil && err != nil {
        return nil, err
    }
    
    // ... rest of logic ...
    
    respbody, err = io.ReadAll(io.LimitReader(httpresp.Body, h.Options.MaxResponseBodySizeToRead))
    if err != nil {
        if attempt == 0 && strings.Contains(err.Error(), "gzip: invalid header") {
            httpresp.Body.Close() // Clean up before retry
            continue
        }
        if !shouldIgnoreBodyErrors {
            return nil, err
        }
    }
    break // Success
}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 77679db and 8569576.

📒 Files selected for processing (3)
  • README.md
  • common/httpx/httpx.go
  • runner/options.go
🧰 Additional context used
🧬 Code graph analysis (1)
common/httpx/httpx.go (2)
runner/options.go (1)
  • Options (173-358)
common/httpx/option.go (1)
  • Options (13-57)
🔇 Additional comments (5)
README.md (1)

257-258: LGTM! Documentation updated to reflect new safe defaults.

The default maximum response size values are correctly updated to 10MB (10485760 bytes), which aligns with the code changes in runner/options.go. This documentation accurately reflects the new defaults that prevent OOM issues when scanning services with unbounded response bodies.

common/httpx/httpx.go (3)

10-10: Import addition is appropriate.

The net/http/httputil import is correctly added to support the header-only response dump using httputil.DumpResponse.


239-240: Header-only dump prevents unbounded memory allocation.

Switching to httputil.DumpResponse(httpresp, false) correctly dumps only the response headers without consuming the body. This is a crucial change that prevents OOM when the response body is unbounded, as the body is now read separately with explicit size limits.


274-280: Excellent approach to building bounded raw response.

The manual construction of resp.Raw from headers and bounded body is correct and well-implemented:

  1. Proper capacity pre-allocation: make([]byte, 0, len(headers)+len(respbody)) avoids unnecessary reallocations
  2. Clear comment: Explains the critical requirement to use capped body data
  3. Correct concatenation: Headers followed by body matches HTTP response format

This ensures that resp.Raw never contains unbounded data, which is the core fix for the OOM issue.

runner/options.go (1)

543-545: Safer defaults implemented correctly.

The 10MB default for both MaxResponseBodySizeToSave and MaxResponseBodySizeToRead is a significant safety improvement to prevent OOM when scanning services with unbounded streams. The MaxResponseBodySizeToRead default aligns with common/httpx; however, MaxResponseBodySizeToSave has no explicit default in common/httpx and is explicitly set here for consistency.

Note: This is a breaking change for users who relied on previous defaults allowing larger responses, but it's essential for stability and resource-constrained environments.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
common/stringz/stringz.go (1)

88-93: LGTM! Clean refactor to switch statement.

The conversion from if-else to switch improves readability without changing functionality. The use of urlutil.HTTP and urlutil.HTTPS constants is appropriate.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8569576 and 78bf9c3.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
  • README.md
  • cmd/httpx/resume.cfg
  • common/httpx/httpx.go
  • common/httpx/option.go
  • common/stringz/stringz.go
  • go.mod
  • runner/options.go
✅ Files skipped from review due to trivial changes (2)
  • cmd/httpx/resume.cfg
  • README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • common/httpx/httpx.go
🧰 Additional context used
🧬 Code graph analysis (2)
runner/options.go (2)
common/httpx/option.go (1)
  • DefaultMaxResponseBodySize (14-14)
common/httpx/proto.go (2)
  • UNKNOWN (6-6)
  • HTTP11 (7-7)
common/stringz/stringz.go (1)
common/httpx/http2.go (2)
  • HTTP (15-15)
  • HTTPS (17-17)
🔇 Additional comments (3)
common/httpx/option.go (1)

78-78: Usage of DefaultMaxResponseBodySize is correct.

The assignment properly references the new default variable, making the configuration more maintainable. However, ensure the concerns raised about the initialization (lines 16-19) are addressed.

runner/options.go (1)

777-777: LGTM! Protocol validation correctly updated.

The protocol validation now properly uses httpxcommon.UNKNOWN and httpxcommon.HTTP11 constants, maintaining consistency with the aliased import.

go.mod (1)

54-54: LGTM! Dependency addition to support human-readable byte size parsing.

The addition of github.com/dustin/go-humanize v1.0.1 enables parsing of human-friendly byte sizes (e.g., "512Mb") used in common/httpx/option.go to set DefaultMaxResponseBodySize.

@Mzack9999 Mzack9999 added the Type: Bug Inconsistencies or issues which will cause an issue or problem for users or implementors. label Dec 24, 2025
@projectdiscovery projectdiscovery deleted a comment from coderabbitai bot Dec 24, 2025
@Mzack9999 Mzack9999 merged commit 5a793ca into projectdiscovery:dev Dec 24, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Type: Bug Inconsistencies or issues which will cause an issue or problem for users or implementors.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

httpx crashes on services generating endless stream of data

2 participants