@github-actions
Contributor

This is an automated pull request to release the candidate branch into production, which will trigger a deployment.
It was created by the [Production PR] action.

…1833)

Co-authored-by: Tofik Hasanov <annexcies@gmail.com>
@comp-ai-code-review

comp-ai-code-review bot commented Nov 25, 2025

🔒 Comp AI - Security Review

🔴 Risk Level: HIGH

OSV scan: 3 advisories — xlsx@0.18.5 (2 HIGH: GHSA-4r6h-8v6p-xvw6, GHSA-5pgg-2g8v-p4x9); ai@5.0.0 (LOW: GHSA-rwvc-j5jr-mgvh, fixed in 5.0.52).


📦 Dependency Vulnerabilities

🟠 NPM Packages (HIGH)

Risk Score: 8/10 | Summary: 2 high, 1 low CVEs found

| Package | Version | CVE | Severity | CVSS | Summary | Fixed In |
|---|---|---|---|---|---|---|
| xlsx | 0.18.5 | GHSA-4r6h-8v6p-xvw6 | HIGH | N/A | Prototype Pollution in sheetJS | No fix yet |
| xlsx | 0.18.5 | GHSA-5pgg-2g8v-p4x9 | HIGH | N/A | SheetJS Regular Expression Denial of Service (ReDoS) | No fix yet |
| ai | 5.0.0 | GHSA-rwvc-j5jr-mgvh | LOW | N/A | Vercel’s AI SDK's filetype whitelists can be bypassed when uploading files | 5.0.52 |

🛡️ Code Security Analysis

View 1 file(s) with issues

🔴 apps/app/src/jobs/tasks/vendors/parse-questionnaire.ts (HIGH Risk)

| # | Issue | Risk Level |
|---|---|---|
| 1 | Unvalidated URL sent to Firecrawl (SSRF risk) | HIGH |
| 2 | No file size or MIME type checks before parsing (DoS risk) | HIGH |
| 3 | Unbounded base64 image data sent to LLM (resource abuse) | HIGH |
| 4 | XLSX.read used without safeguards (zip bomb / memory exhaustion) | HIGH |

Recommendations:

  1. Validate and allowlist URLs before sending them to Firecrawl. Canonicalize each URL, then reject it if it uses a non-HTTP(S) scheme, points to localhost, or resolves to a loopback, private, or link-local IP range (127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 169.254.0.0/16, etc.); enforce a domain allowlist for third-party crawl requests.
  2. Enforce strict file size limits early (both client-side and server-side). Reject uploads and S3/attachment reads above a configured maximum (e.g., 10–50 MB depending on use case). Log and return clear errors to callers.
  3. Validate MIME/type by inspecting magic bytes (content sniffing) in addition to client-provided fileType. Reject mismatches and unsupported types before parsing.
  4. Do not pass arbitrarily large base64 image/PDF payloads directly to the LLM. Enforce size limits and either: (a) reject oversized images, (b) downsample/resize images, or (c) run OCR on server with limits and send only extracted text (or chunked representations) to the LLM.
  5. Apply timeouts, rate limits and concurrency limits when calling external services (Firecrawl, LLM). Add retries with backoff and a maximum total wait time to prevent resource exhaustion.
  6. Harden Excel parsing: enforce a maximum buffer/file size before calling XLSX.read, and consider using streaming/limited parsers or scanning ZIP entry sizes to detect zip-bomb patterns. Limit the number of rows/cells processed and abort parsing if thresholds are exceeded.
  7. Avoid logging raw payloads or full base64 content. Redact sensitive content and limit previews (e.g., only first N characters) in logs. Ensure environment variables and secrets are never logged.
  8. Add server-side input validation and sanitization for all task inputs (payload.url, fileData, s3Key, attachmentId). Consider adding a validation layer that asserts allowed size, types, and formats before any costly operations.
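
As an illustration of recommendation 1, here is a minimal sketch of a URL pre-check that could run before a crawl request is made. The names `isSafeCrawlUrl`, `isInternalIPv4`, and `ALLOWED_HOSTS`, and the allowlisted host `example.com`, are hypothetical and do not come from the reviewed file. The sketch only inspects the URL string; it does not resolve DNS, so a hostname that resolves to a private address would still need a resolution-time check.

```typescript
// Hypothetical allowlist; in practice this would come from configuration.
const ALLOWED_HOSTS: ReadonlySet<string> = new Set(["example.com"]);

// True for loopback, private, link-local, and "this network" IPv4 literals.
function isInternalIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Returns true only for http(s) URLs whose host is explicitly allowlisted.
function isSafeCrawlUrl(
  raw: string,
  allowedHosts: ReadonlySet<string> = ALLOWED_HOSTS,
): boolean {
  let url: URL;
  try {
    // The WHATWG URL parser also normalizes numeric host forms
    // such as "2130706433" to dotted IPv4 notation.
    url = new URL(raw);
  } catch {
    return false;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (host.startsWith("[")) return false; // IPv6 literals are rejected outright in this sketch
  if (isInternalIPv4(host)) return false;

  return allowedHosts.has(host);
}
```

The internal-range check is redundant when a strict allowlist is in place, but keeping both means a misconfigured allowlist does not immediately expose internal addresses.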

💡 Recommendations

View 3 recommendation(s)
  1. Upgrade the ai package to >= 5.0.52 to remediate GHSA-rwvc-j5jr-mgvh (filetype whitelist bypass). Update package.json to require the patched version and rebuild dependencies.
  2. Remediate xlsx@0.18.5: upgrade to a release that addresses GHSA-4r6h-8v6p-xvw6 (Prototype Pollution) and GHSA-5pgg-2g8v-p4x9 (ReDoS). If a patched xlsx is not available immediately, remove or avoid parsing untrusted input with xlsx and replace with a safe parser or an isolated processing path.
  3. Audit code sites that call XLSX APIs (e.g., XLSX.read) and enforce input constraints before invoking the library: reject oversized files, verify file magic bytes/MIME, and sanitize or limit parsed content to reduce exploitation surface while dependency updates are applied.
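
To illustrate the input constraints described in recommendation 3, the following is a rough sketch of a guard that could run before any spreadsheet parsing. The function name `checkSpreadsheetBuffer` and the 10 MB limit are assumptions chosen for illustration, not values taken from the reviewed code. The check covers size and leading magic bytes only; a ZIP signature shows that the file is a ZIP container, not that it is a well-formed workbook, and the check does not detect zip bombs.

```typescript
// Assumed limit for illustration; tune to the actual use case.
const MAX_SPREADSHEET_BYTES = 10 * 1024 * 1024;

// Leading signature of a ZIP container (used by .xlsx).
const ZIP_MAGIC: readonly number[] = [0x50, 0x4b, 0x03, 0x04];
// Leading signature of a Compound File Binary container (used by legacy .xls).
const CFB_MAGIC: readonly number[] = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1];

type SpreadsheetKind = "xlsx" | "xls";

function hasPrefix(buf: Uint8Array, magic: readonly number[]): boolean {
  return buf.length >= magic.length && magic.every((byte, i) => buf[i] === byte);
}

// Throws if the buffer is empty, too large, or does not start with a known
// spreadsheet container signature; otherwise returns the detected kind.
function checkSpreadsheetBuffer(
  buf: Uint8Array,
  maxBytes: number = MAX_SPREADSHEET_BYTES,
): SpreadsheetKind {
  if (buf.length === 0) {
    throw new Error("Empty file");
  }
  if (buf.length > maxBytes) {
    throw new Error(`File too large: ${buf.length} bytes (limit ${maxBytes})`);
  }
  if (hasPrefix(buf, ZIP_MAGIC)) return "xlsx";
  if (hasPrefix(buf, CFB_MAGIC)) return "xls";
  throw new Error("Unrecognized spreadsheet signature");
}

// Intended call order (parsing itself is left out of this sketch):
//   checkSpreadsheetBuffer(buffer);
//   const workbook = XLSX.read(buffer, { type: "buffer" });
```

If the input arrives as a base64 string, its length can be checked before decoding, since the decoded size is roughly three quarters of the encoded length.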

Powered by Comp AI - AI that handles compliance for you. Reviewed Nov 25, 2025

@vercel

vercel bot commented Nov 25, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
app (staging) Building Building Nov 25, 2025 10:29pm
1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
portal (staging) Skipped Skipped Nov 25, 2025 10:29pm

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@Marfuen Marfuen merged commit c3eb0f8 into release Nov 25, 2025
8 of 13 checks passed
@claudfuen
Contributor

🎉 This PR is included in version 1.64.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

4 participants