@calthejuggler (Collaborator) commented Jan 26, 2026

Previously, our DocSearch scraper ran as soon as a PR was merged. This created a race condition in which the scrape ran against the previous version of the docs site rather than the new one.

This PR adds a deployment plugin that runs the search scraper from its `onSuccess` hook.
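
As a rough illustration of this approach (a sketch, not the PR's actual file), a Netlify build plugin exports an object of event handlers, and `onSuccess` is one of them. The `createPlugin` factory and the injected `triggerScraper` callback are hypothetical names, introduced here only so the gating logic can be exercised without network access. The `CONTEXT` and `GITHUB_PAT` checks mirror the ones discussed in the review below; the final status message is an assumption.

```javascript
// Hypothetical factory: `triggerScraper` is injected so the hook can be tested offline.
function createPlugin({ env = process.env, triggerScraper }) {
  return {
    // onSuccess is one of the event handlers a Netlify build plugin may define.
    async onSuccess({ utils }) {
      // Only production deploys should refresh the search index.
      if (env.CONTEXT !== "production") {
        console.log("Not a production deploy; skipping Algolia scraper trigger.");
        return;
      }

      const token = env.GITHUB_PAT;
      if (!token) {
        utils.status.show({
          title: "Algolia scraper trigger skipped",
          summary: "GITHUB_PAT environment variable is not set.",
        });
        return;
      }

      await triggerScraper(token);
      // Assumed wording; the real plugin's success message is not shown in this thread.
      utils.status.show({
        title: "Algolia scraper triggered",
        summary: "Workflow dispatch request sent.",
      });
    },
  };
}
```

In a real plugin file the returned object would be exported (for example via `module.exports`), with `triggerScraper` bound to whatever function performs the GitHub dispatch request.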

Summary by CodeRabbit

  • New Features
    • Manual trigger for search index updates from deployments, enabled via a new Netlify plugin and CI manual dispatch.
  • Chores
    • Updated Node.js runtime to 24.13.0 and adjusted CI to read the version from the project version file.

coderabbitai bot commented Jan 26, 2026

Walkthrough

The PR adds a Netlify plugin (plugins/trigger-algolia-scraper) that, on production builds, POSTs a GitHub workflow dispatch to trigger an Algolia scraper; it requires GITHUB_PAT and reports status via utils.status. CI/workflow changes: algolia workflow trigger changed from push to workflow_dispatch and its step now writes outputs via GITHUB_OUTPUT. The build workflow now reads Node.js version from .nvmrc, and .nvmrc was updated to 24.13.0. netlify.toml registers the new plugin.
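
For context on the `GITHUB_OUTPUT` change: GitHub Actions sets `GITHUB_OUTPUT` to the path of a file, and a step publishes an output by appending a `name=value` line to it. The workflow file is not reproduced in this thread and most likely does this from a shell step; the Node.js sketch below only illustrates the same mechanism. `setOutput` is a hypothetical helper, and it deliberately rejects multi-line values, which need GitHub's delimiter syntax instead.

```javascript
const fs = require("node:fs");

// Hypothetical helper: append a single-line step output to the file named by GITHUB_OUTPUT.
function setOutput(name, value, env = process.env) {
  const file = env.GITHUB_OUTPUT;
  if (!file) {
    throw new Error("GITHUB_OUTPUT is not set (not running inside GitHub Actions?)");
  }
  const text = String(value);
  if (/[\r\n]/.test(text)) {
    throw new Error("Multi-line values are not handled by this sketch");
  }
  // Each output is one `name=value` line appended to the file.
  fs.appendFileSync(file, `${name}=${text}\n`);
}
```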

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐇 I hopped through code at break of dawn,
A plugin carried a tiny GitHub yawn,
On production fields it sends a cheer,
“Run the scraper, bring results near!”
Node bounds to twenty‑four — hooray, go on! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main objective: adding a DocSearch plugin that runs on deployment to fix the race condition mentioned in the PR description. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

netlify bot commented Jan 26, 2026

Deploy Preview for absmartly-docs ready!

Name Link
🔨 Latest commit d56fb4f
🔍 Latest deploy log https://app.netlify.com/projects/absmartly-docs/deploys/697744b07f806800081cd837
😎 Deploy Preview https://deploy-preview-244--absmartly-docs.netlify.app

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@plugins/trigger-algolia-scraper/index.js`:
- Around line 18-26: Wrap the GitHub dispatch fetch call in a try/catch and add
an AbortController-based timeout: create an AbortController, set a timeout (e.g.
setTimeout to call controller.abort after a configurable ms), pass
controller.signal into fetch, and in the catch block handle AbortError and other
network/DNS errors by logging the error and failing gracefully; also check
response.ok after fetch and log/throw with response status/body on non-2xx.
While editing plugins/trigger-algolia-scraper/index.js, update the payload ref
(currently "master") to use a configurable branch variable or confirm/replace
with your production branch name.
- Around line 16-27: The fetch call in plugins/trigger-algolia-scraper/index.js
hard-codes the workflow ref to "master"; replace the static ref in the body
passed to the dispatch fetch (the JSON.stringify({ ref: "master" }) argument)
with Netlify's BRANCH environment variable (e.g., use process.env.BRANCH,
falling back to "master" if unset) so the dispatched workflow uses the actual
deployed branch. Locate the fetch call and update the ref value in the request
body accordingly.

Comment on lines +16 to +27
console.log("Triggering Algolia scraper workflow...");

const response = await fetch(
"https://api.github.com/repos/absmartly/docs/actions/workflows/algolia.yml/dispatches",
{
method: "POST",
headers: {
Accept: "application/vnd.github+json",
Authorization: `Bearer ${token}`,
"X-GitHub-Api-Version": "2022-11-28",
},
body: JSON.stringify({ ref: "master" }),

@coderabbitai coderabbitai bot Jan 26, 2026

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's read the file to see the current state
cat -n plugins/trigger-algolia-scraper/index.js

Repository: absmartly/docs

Length of output: 1830


🏁 Script executed:

# Check for Netlify configuration files
find . -name "netlify.toml" -o -name "netlify.json" -o -name ".netlify" -type f | head -20

Repository: absmartly/docs

Length of output: 72


🏁 Script executed:

# Look for any deployment or CI/CD configuration
find . -name ".github" -o -name ".gitlab-ci.yml" -o -name "azure-pipelines.yml" -o -name ".circleci" | head -20

Repository: absmartly/docs

Length of output: 67


🏁 Script executed:

# Search for any references to BRANCH, HEAD, or other environment variables in the codebase
rg -i "process\.env\.(BRANCH|HEAD|DEPLOY)" --max-count 20

Repository: absmartly/docs

Length of output: 40


🏁 Script executed:

# Read the netlify.toml to check if multiple branches are deployed to production
cat netlify.toml

Repository: absmartly/docs

Length of output: 117


🏁 Script executed:

# Check the GitHub workflows
ls -la .github/workflows/

Repository: absmartly/docs

Length of output: 354


🏁 Script executed:

# Search for context or branch references in the entire codebase
rg -i "context.*production|branch" netlify.toml .github/ | head -30

Repository: absmartly/docs

Length of output: 2786


🏁 Script executed:

# Read the algolia.yml workflow to see what it expects
cat .github/workflows/algolia.yml

Repository: absmartly/docs

Length of output: 656


🏁 Script executed:

# Read the build.yml to check production deployment configuration
cat .github/workflows/build.yml

Repository: absmartly/docs

Length of output: 645


🏁 Script executed:

# Search for Netlify configuration related to production deployments
rg "production|branch" netlify.toml

Repository: absmartly/docs

Length of output: 40


🌐 Web query:

Netlify environment variables BRANCH production deploy

💡 Result:

On Netlify, “production vs branch deploy” is determined by the deploy context, not by whether BRANCH is set.

  • Detect production deploys

    • Use the built-in read-only env var CONTEXT:
      • CONTEXT=production for deploys from your site’s configured production branch
      • CONTEXT=branch-deploy for deploys from non-production branches
      • (also deploy-preview for PR previews, dev for Netlify Dev) [1][2]
  • What BRANCH means

    • BRANCH is a built-in read-only variable that contains the Git branch name being built (including on production deploys, where it will be your production branch like main) [2].
  • Setting different env var values for production vs branch deploys

    • In the Netlify UI you can set contextual values per deploy context (Production, Branch deploys, Deploy Previews, etc.) and optionally override for a specific branch (or a wildcard like release/*). The branch-specific value wins for that branch’s deploys. [3][4]
  • Doing it in netlify.toml

    • You can set context-specific env vars like:
      • [context.production.environment] for production
      • [context.branch-deploy.environment] for branch deploys
      • [context.<branchname>.environment] for a specific branch (most specific wins) [5]

Sources:
[1] Netlify Docs — Build environment variables (read-only CONTEXT, BRANCH)
[2] Netlify Docs — File-based configuration (context names and what “production” means)
[3] Netlify Docs — Environment variables overview (contextual values, branch overrides, wildcards)
[4] Netlify Blog — Introducing Deploy Contexts
[5] Netlify Docs — Get started with environment variables (netlify.toml context env examples)
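
To make the distinction concrete, here is a small sketch of how a plugin might read these two variables. `describeDeploy` is a hypothetical helper, not code from the PR; it assumes only the `CONTEXT` and `BRANCH` variables described in the result above.

```javascript
// Hypothetical helper: summarise a Netlify build environment.
// CONTEXT decides whether this is a production deploy; BRANCH only names the branch being built.
function describeDeploy(env = process.env) {
  const context = env.CONTEXT ?? "unknown";
  return {
    context,
    branch: env.BRANCH ?? null,
    isProduction: context === "production",
  };
}
```

With this shape, a build of a non-production branch reports `isProduction: false` even though `BRANCH` is set, which is why `BRANCH` alone cannot be used to detect production deploys.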


Use the deployed branch name instead of hard-coding "master".

If production deploys from both the master and development branches (as shown in build.yml), dispatching the workflow to a static ref will fail on deployments from non-master branches. Use Netlify's BRANCH environment variable, which contains the actual Git branch being deployed:

♻️ Suggested ref handling
   const token = process.env.GITHUB_PAT;
   if (!token) {
     utils.status.show({
       title: "Algolia scraper trigger skipped",
       summary: "GITHUB_PAT environment variable is not set.",
     });
     return;
   }
 
+  const ref = process.env.BRANCH || "master";
   console.log("Triggering Algolia scraper workflow...");
@@
-      body: JSON.stringify({ ref: "master" }),
+      body: JSON.stringify({ ref }),

Collaborator Author

Deployments from `development` are not production deploys; only `master` should trigger the DocSearch scrape.

@calthejuggler, thank you for the clarification! You're absolutely right—if only master is deployed to the production context, then hard-coding ref: "master" is the correct approach. The CONTEXT === "production" check already ensures the workflow only triggers for production deploys from master, making the static ref appropriate for your deployment model.


✏️ Learnings added
Learnt from: calthejuggler
Repo: absmartly/docs PR: 244
File: plugins/trigger-algolia-scraper/index.js:16-27
Timestamp: 2026-01-26T10:45:26.591Z
Learning: In the absmartly/docs repository, only the `master` branch is deployed to Netlify's production context. Deployments from the `development` branch are not production deploys and should not trigger the Algolia DocSearch scraper.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +18 to +26
const response = await fetch(
"https://api.github.com/repos/absmartly/docs/actions/workflows/algolia.yml/dispatches",
{
method: "POST",
headers: {
Accept: "application/vnd.github+json",
Authorization: `Bearer ${token}`,
"X-GitHub-Api-Version": "2022-11-28",
},

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n plugins/trigger-algolia-scraper/index.js

Repository: absmartly/docs

Length of output: 1830


Guard the GitHub dispatch call against network failures and timeouts.

The fetch call (lines 18–29) lacks error handling for network failures, timeouts, and DNS errors. If the request fails, the exception propagates uncaught and will fail the entire deploy. Additionally, there is no timeout mechanism, risking indefinite hangs on slow or unresponsive networks. Add a try–catch block with AbortController-based timeout handling.

The ref is also hard-coded to "master" (line 27); confirm this matches your production deployment branch.

♻️ Suggested hardening
   console.log("Triggering Algolia scraper workflow...");

-  const response = await fetch(
+  const controller = new AbortController();
+  const timeout = setTimeout(() => controller.abort(), 15_000);
+  let response;
+  try {
+    response = await fetch(
       "https://api.github.com/repos/absmartly/docs/actions/workflows/algolia.yml/dispatches",
       {
         method: "POST",
         headers: {
           Accept: "application/vnd.github+json",
           Authorization: `Bearer ${token}`,
           "X-GitHub-Api-Version": "2022-11-28",
+          "Content-Type": "application/json",
         },
         body: JSON.stringify({ ref: "master" }),
+        signal: controller.signal,
       },
-    );
+    );
+  } catch (err) {
+    console.error("Failed to trigger Algolia scraper:", err);
+    utils.status.show({
+      title: "Algolia scraper trigger failed",
+      summary: `Request failed: ${err?.message ?? err}`,
+    });
+    return;
+  } finally {
+    clearTimeout(timeout);
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-  const response = await fetch(
-    "https://api.github.com/repos/absmartly/docs/actions/workflows/algolia.yml/dispatches",
-    {
-      method: "POST",
-      headers: {
-        Accept: "application/vnd.github+json",
-        Authorization: `Bearer ${token}`,
-        "X-GitHub-Api-Version": "2022-11-28",
-      },
+  console.log("Triggering Algolia scraper workflow...");
+  const controller = new AbortController();
+  const timeout = setTimeout(() => controller.abort(), 15_000);
+  let response;
+  try {
+    response = await fetch(
+      "https://api.github.com/repos/absmartly/docs/actions/workflows/algolia.yml/dispatches",
+      {
+        method: "POST",
+        headers: {
+          Accept: "application/vnd.github+json",
+          Authorization: `Bearer ${token}`,
+          "X-GitHub-Api-Version": "2022-11-28",
+          "Content-Type": "application/json",
+        },
+        body: JSON.stringify({ ref: "master" }),
+        signal: controller.signal,
+      },
+    );
+  } catch (err) {
+    console.error("Failed to trigger Algolia scraper:", err);
+    utils.status.show({
+      title: "Algolia scraper trigger failed",
+      summary: `Request failed: ${err?.message ?? err}`,
+    });
+    return;
+  } finally {
+    clearTimeout(timeout);
+  }
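
The review also asks for a `response.ok` check, which the suggested change does not include. The sketch below shows one way the timeout and the status check could live together in a single helper. `dispatchWorkflow`, its `{ ok, reason }` return shape, and the injectable `fetchImpl` parameter are all hypothetical and not part of the PR; the URL, headers, and 15-second timeout are taken from the suggestion above.

```javascript
// Hypothetical helper; `fetchImpl` is injectable for testing and defaults to the global fetch (Node 18+).
async function dispatchWorkflow({ token, ref = "master", timeoutMs = 15_000, fetchImpl = fetch }) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(
      "https://api.github.com/repos/absmartly/docs/actions/workflows/algolia.yml/dispatches",
      {
        method: "POST",
        headers: {
          Accept: "application/vnd.github+json",
          Authorization: `Bearer ${token}`,
          "X-GitHub-Api-Version": "2022-11-28",
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ ref }),
        signal: controller.signal,
      },
    );
    // A non-2xx status does not reject the promise, so it has to be checked explicitly.
    if (!response.ok) {
      const body = await response.text();
      return { ok: false, reason: `GitHub responded ${response.status}: ${body}` };
    }
    return { ok: true };
  } catch (err) {
    // Covers aborts (timeouts) as well as network and DNS failures.
    const reason =
      err?.name === "AbortError" ? `timed out after ${timeoutMs} ms` : String(err?.message ?? err);
    return { ok: false, reason };
  } finally {
    clearTimeout(timer);
  }
}
```

Returning a result object rather than throwing lets the caller decide whether a failed trigger should fail the deploy or only be reported through `utils.status.show`.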
