-
Notifications
You must be signed in to change notification settings - Fork 121
Grida - Multi-Tenant BYOD - Bring Your Own Domain #515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Introduced a new index file for the Platform working group, outlining infrastructure topics. - Added a detailed document on Multi-tenant Custom Domains on Vercel, covering architectural decisions, domain ownership models, and routing semantics.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
WalkthroughImplements a comprehensive custom domain management system for multi-tenant sites on Vercel, including database schema for domain records, API routes for add/verify/delete/refresh operations, RLS policies, hostname validation utilities, and UI components for domain management with DNS instructions and canonical domain handling. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client (Browser)
participant Editor as Editor API
participant DB as Supabase DB
participant Vercel as Vercel API
participant Cache as Next.js Cache
rect rgba(100, 150, 255, 0.5)
Note over Client,Cache: Domain Add Flow
Client->>Editor: POST /domains (hostname)
Editor->>Editor: Validate hostname
Editor->>Vercel: projectsAddProjectDomain(hostname)
Vercel-->>Editor: domain object
Editor->>Vercel: projectsGetProjectDomain(hostname)
Vercel-->>Editor: domain config
Editor->>DB: INSERT domain row (pending)
DB-->>Editor: domain record
Editor->>Cache: revalidateTag(domain-registry)
Editor-->>Client: 200 {www, domain, vercel}
end
rect rgba(100, 200, 150, 0.5)
Note over Client,Cache: Domain Refresh/Verify Flow
Client->>Editor: POST /domains/[hostname]/refresh
Editor->>DB: GET domain row
Editor->>Vercel: projectsGetProjectDomain(hostname)
Vercel-->>Editor: current domain state
Editor->>Vercel: projectsVerifyProjectDomain(hostname)
Vercel-->>Editor: verification status
Editor->>DB: UPDATE domain (status, vercel_data)
Editor->>Cache: revalidateTag(domain-registry)
Editor-->>Client: 200 {domain with updated status}
end
rect rgba(200, 150, 100, 0.5)
Note over Client,Cache: Canonical Domain Set Flow
Client->>Editor: POST /domains/[hostname]/canonical
Editor->>DB: Verify domain active
Editor->>DB: UPDATE all domains canonical=false (www_id)
DB-->>Editor: cleared
Editor->>DB: UPDATE target domain canonical=true
Editor->>Cache: revalidateTag(domain-registry)
Editor-->>Client: 200 {www, domain, primary}
end
sequenceDiagram
participant Request as HTTP Request
participant Middleware as Middleware (proxy.ts)
participant Router as TenantMiddleware.routeProxyRequest
participant Resolver as resolveHostCached
participant Vercel as Vercel / Custom DNS
participant DB as Supabase DB
participant App as Tenant App
rect rgba(150, 100, 200, 0.5)
Note over Request,App: Host-Based Routing Flow
Request->>Middleware: req.headers.host
Middleware->>Router: routeProxyRequest(req, res)
alt Localhost Development
Router->>Router: Extract tenant from localhost:port
Router->>App: Rewrite to /~/{tenant}
else Platform Host
Router->>Resolver: resolveHostCached(host)
Resolver->>DB: RPC www_get_canonical_hostname
DB-->>Resolver: canonical hostname
alt Not canonical
Router-->>Request: 301 Redirect to canonical
else Canonical
Router->>App: Rewrite to /~/{tenant}
end
else Custom Domain
Router->>Resolver: resolveHostCached(host)
Resolver->>DB: RPC www_resolve_hostname
DB-->>Resolver: www_id, www_name, canonical
alt Cached/Available
Router->>App: Rewrite to /~/{tenant}
else Not Found
Router->>App: Continue (404)
end
end
Router-->>Middleware: NextResponse | null
Middleware->>Request: Response
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…d layout and add error handling for clipboard copy operation
…ew DomainsPage and CustomDomainsSection. Remove deprecated SiteDomainsSection and related domain handling logic from ProjectWWWSettingsPage.
… for domain mapping, new SQL functions for hostname resolution, and comprehensive tests for role-based access control. Update type definitions to reflect new domain structure and enhance documentation with edge cases and operational guidelines.
… for domain verification, refresh, and canonicalization. Implement domain handling logic in the Vercel client, including domain addition and removal. Update environment configuration with new required variables and improve type definitions for better clarity. Add tests for domain-related utilities to ensure robustness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a53838dd35
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| -- derived: apex vs subdomain | ||
| kind TEXT GENERATED ALWAYS AS ( | ||
| CASE | ||
| WHEN array_length(string_to_array(hostname, '.'), 1) = 2 THEN 'apex' | ||
| ELSE 'subdomain' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle multi-part TLDs when deriving domain kind
The computed kind column treats a hostname as “apex” only when it has exactly one dot. That misclassifies legitimate apex domains under multi‑part public suffixes (e.g. example.co.uk, example.com.au) as subdomain. The UI uses kind to decide whether to show A‑record vs CNAME instructions, so these apex domains will get invalid CNAME guidance and can’t be configured correctly. Consider deriving kind using a public‑suffix list (or storing kind based on validated user input) instead of dot‑counting.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
supabase/schemas/grida_www.sql (1)
208-217:⚠️ Potential issue | 🟠 MajorSECURITY DEFINER function missing safe
search_pathand privilege lockdown.
grida_www.rls_wwwis markedSECURITY DEFINERbut doesn't:
- Set a safe
search_path(e.g.,SET search_path = pg_catalog, public)- Have
REVOKE ALL ... FROM PUBLICfollowed by explicit grantsThis creates potential search_path injection risks and overly permissive access.
Proposed fix
CREATE OR REPLACE FUNCTION grida_www.rls_www(p_www_id UUID) RETURNS BOOLEAN AS $$ BEGIN RETURN EXISTS ( SELECT 1 FROM grida_www.www w WHERE w.id = p_www_id AND public.rls_project(w.project_id) ); END; -$$ LANGUAGE plpgsql STABLE SECURITY DEFINER; +$$ LANGUAGE plpgsql STABLE SECURITY DEFINER +SET search_path = pg_catalog, public, grida_www; + +REVOKE ALL ON FUNCTION grida_www.rls_www(UUID) FROM PUBLIC; +GRANT EXECUTE ON FUNCTION grida_www.rls_www(UUID) TO authenticated, service_role;As per coding guidelines: "For SECURITY DEFINER functions: set a safe search_path to
pg_catalog, public, fully qualify table/function names, validate inputs including tenant boundary, lock down privileges" and "SECURITY DEFINER functions must have REVOKE ALL on the function FROM PUBLIC, then grant EXECUTE only to required roles".database/database.types.ts (1)
92-99:⚠️ Potential issue | 🟡 MinorFix
www_publicoverride shape (Row nested incorrectly).
Right nowwww_publicintersects the Row type with{ Row: ... }, which adds Row fields at the view object level instead of overridingRow. It should mirror thepublic_routepattern.Proposed fix
- Views: { - www_public: DatabaseGenerated["grida_www"]["Views"]["www_public"]["Row"] & { - Row: { - id: string; - name: string; - favicon: SystemSchema_Favicon | null; - }; - }; + Views: { + www_public: { + Row: DatabaseGenerated["grida_www"]["Views"]["www_public"]["Row"] & { + id: string; + name: string; + favicon: SystemSchema_Favicon | null; + }; + };editor/proxy.ts (1)
15-15:⚠️ Potential issue | 🟡 MinorLine 24 has multiple typos:
[CONTRIBUTER MODE]→[CONTRIBUTOR MODE]andBackedn→Backend.The console.warn message contains spelling errors that should be corrected.
🤖 Fix all issues with AI agents
In @.github/workflows/database-tests.yml:
- Around line 20-22: Update the Supabase CLI action reference to use the latest
stable release by changing the version value used with supabase/setup-cli@v1
from "2.72.7" to "2.75.0"; locate the workflow step that invokes
supabase/setup-cli@v1 and update the with.version field to "2.75.0" and then
re-run the database tests to confirm compatibility.
In `@docs/wg/platform/multi-tenant-custom-domain-vercel.md`:
- Around line 80-84: Update the "**Vercel requirement**" section that currently
hardcodes the anycast IP `76.76.21.21` to instead instruct users to obtain the
exact A-record IP from their Vercel project's Domain Settings (project-specific
IP selected from Vercel's optimized anycast pool); note that `76.76.21.21` can
be mentioned as a possible legacy fallback but mark the recommended best
practice as using the project-specific IP provided in Domain Settings and adjust
the explanatory sentence accordingly.
In
`@editor/app/`(api)/private/~/[org]/[proj]/www/domains/[hostname]/canonical/route.ts:
- Around line 68-91: The two sequential updates using
wwwClient.from("domain").update({ canonical: false }).eq("www_id", www.id) and
wwwClient.from("domain").update({ canonical: true }).eq("id", target.id) are not
atomic and can leave no canonical domain if the second update fails; change this
to an atomic operation—either run both updates inside a database transaction (so
you rollback on failure) or replace them with a single conditional update/RPC
(e.g., a single UPDATE that sets canonical = (id = target_id) WHERE www_id =
www_id) invoked via wwwClient, and ensure you check and return any
transaction/RPC errors the same way you currently handle clear_err/set_err.
In `@editor/app/`(api)/private/~/[org]/[proj]/www/domains/[hostname]/route.ts:
- Around line 81-106: The current sequence deletes the Vercel domain via
projectsRemoveProjectDomain before removing the DB row
(wwwClient.from("domain").delete().eq("id", domain_row.id)), which can leave an
orphaned DB record on DB failure; change the flow to delete the DB row first,
then call projectsRemoveProjectDomain, and if the Vercel removal fails attempt a
compensating rollback by re-inserting the saved domain_row via
wwwClient.from("domain").insert(domain_row) (or wrap both operations in a
transaction/atomic workflow), and return appropriate NextResponse.json errors on
each failure path.
In `@editor/app/`(workbench)/[org]/[proj]/(console)/(resources)/domains/page.tsx:
- Around line 24-38: The SWR cache key "site-domains" is global and must be made
project-scoped to avoid cross-project collisions: change the key used in useSWR
(the variable key) to include the project.id (e.g. `${project.id}:site-domains`)
and update any mutate calls inside changeDomainName that reference the old
"site-domains" key to use the new scoped key; also review other related keys
such as "site" to ensure they are similarly namespaced to project.id so tab
synchronization stays correct.
In
`@editor/app/`(workbench)/[org]/[proj]/(console)/(resources)/domains/section-domains.tsx:
- Around line 633-712: The dialog's local state (name, error, busy) is
initialized once with useState(defaultName) and not reset when the dialog
reopens or defaultName changes; update RenamePlatformDomainDialog by adding an
effect (useEffect) that runs when open or defaultName changes and resets
setName(defaultName), setError(null) and setBusy(false) (or appropriate initial
values) so the input and error message are fresh on each open/defaultName
update; keep existing handlers (onSubmitHandler, Input onChange) unchanged.
In `@editor/lib/domains/index.ts`:
- Around line 44-49: The current getDomainKind(hostname: string) uses a
dot-count heuristic that misclassifies multi‑part TLDs; replace it with a Public
Suffix List–based check (e.g., using a PSL parser like the "psl" or "tldts"
package) to compute the eTLD+1 for the hostname and then return "apex" when the
eTLD+1 equals the original hostname, otherwise "subdomain"; update references to
DomainKind as needed and add/adjust unit tests to cover examples like
example.co.uk and app.example.co.uk.
In `@editor/lib/tenant/middleware.ts`:
- Around line 137-171: The internal resolver fetch (triggered when internalToken
&& deployHost, building new URL("/internal/resolve-host", deployHost)) needs a
timeout: create an AbortController, start a timeout (e.g. 2s) that calls
controller.abort(), pass controller.signal into the fetch options (headers +
signal), and clear the timeout after the fetch completes or fails; ensure the
catch block handles aborts gracefully (same fallback to DB resolution). This
change should be applied around the existing fetch call so the fetch is
cancelled on timeout and resources are cleaned up.
🧹 Nitpick comments (21)
.devcontainer/universal/devcontainer.json (1)
3-7: Pin the base image and feature versions for reproducibility.
mcr.microsoft.com/devcontainers/base:ubuntuand:1feature tags are floating; upstream changes can silently alter or break dev environments. Consider pinning to a specific Ubuntu LTS tag (or digest) and concrete feature versions/digests, then upgrade intentionally.editor/app/(workbench)/[org]/[proj]/(console)/(new)/new/referral/welcome-dialog.tsx (1)
18-22: Consider addressing duplicate visible content and improving the title text.Two observations:
The
AlertDialogTitletext "WelcomeDialog" is a component name rather than a meaningful title. For screen reader users, a more descriptive title like "Welcome to Grida WEST" would be clearer.The
AlertDialogDescriptioncontent (lines 19-22) is identical to the paragraph at lines 44-47, creating duplicate visible text. Consider either:
- Adding
sr-onlyto theAlertDialogDescriptionif it's only meant for ARIA association- Removing the duplicate paragraph from the article section
♻️ Proposed fix
- <AlertDialogTitle className="sr-only">WelcomeDialog</AlertDialogTitle> - <AlertDialogDescription> + <AlertDialogTitle className="sr-only">Welcome to Grida WEST</AlertDialogTitle> + <AlertDialogDescription className="sr-only"> You're about to launch a referral campaign powered by{" "} <strong>Grida WEST</strong> — the easiest way to grow through sharing. </AlertDialogDescription>editor/components/copy-to-clipboard-input/index.tsx (1)
18-27: Consider clearing the previous timeout on repeated clicks.If the user clicks the copy button multiple times in quick succession, multiple timeouts accumulate. The icon will reset 2 seconds after the first click rather than the last, which may feel slightly inconsistent.
♻️ Optional: Clear previous timeout for consistent UX
export function CopyToClipboardInput({ value }: { value: string }) { const inputId = React.useId(); const [isCopied, setIsCopied] = React.useState(false); + const timeoutRef = React.useRef<ReturnType<typeof setTimeout> | null>(null); const onCopyClick = React.useCallback(async () => { try { await navigator.clipboard.writeText(value); setIsCopied(true); toast.success("Copied to clipboard"); + if (timeoutRef.current) clearTimeout(timeoutRef.current); - setTimeout(() => setIsCopied(false), 2000); + timeoutRef.current = setTimeout(() => setIsCopied(false), 2000); } catch { toast.error("Failed to copy"); } }, [value]);supabase/schemas/grida_www.sql (1)
148-149: Consider usingFORCE ROW LEVEL SECURITYand adding an index onwww_id.Two concerns with the RLS setup:
Only
ENABLE ROW LEVEL SECURITYis used. Per coding guidelines, prefer also usingFORCE ROW LEVEL SECURITYto ensure RLS applies even to table owners.The RLS policy uses
grida_www.rls_www(www_id), but there's no explicit index onwww_id. Whilewww_idhas a FK constraint, adding an index improves RLS scan performance when the FK index isn't optimal for the policy predicate.Proposed fix
ALTER TABLE grida_www.domain ENABLE ROW LEVEL SECURITY; +ALTER TABLE grida_www.domain FORCE ROW LEVEL SECURITY; + +CREATE INDEX IF NOT EXISTS grida_www_domain_www_id_idx ON grida_www.domain (www_id); + CREATE POLICY "access_based_on_www_editor" ON grida_www.domain USING (grida_www.rls_www(www_id)) WITH CHECK (grida_www.rls_www(www_id));As per coding guidelines: "Enable Row Level Security (RLS) explicitly and prefer forcing it with
ALTER TABLE ... ENABLE ROW LEVEL SECURITY;andALTER TABLE ... FORCE ROW LEVEL SECURITY;" and "Index columns that RLS policies depend on (project_id, org_id, owner_id, membership join keys, etc.) to avoid slow RLS scans".supabase/migrations/20260202120000_grida_www_custom_domains.sql (1)
49-56: AddFORCE ROW LEVEL SECURITYand index onwww_id.Same recommendations as the schema file apply here:
- Add
FORCE ROW LEVEL SECURITYafter enabling RLS- Add an index on
www_idto optimize RLS policy evaluationProposed fix
ALTER TABLE grida_www.domain ENABLE ROW LEVEL SECURITY; +ALTER TABLE grida_www.domain FORCE ROW LEVEL SECURITY; + +-- Index for RLS policy performance +CREATE INDEX IF NOT EXISTS grida_www_domain_www_id_idx ON grida_www.domain (www_id); -- Editors: manage domains for a www they can access. DROP POLICY IF EXISTS access_based_on_www_editor ON grida_www.domain;As per coding guidelines: "Enable Row Level Security (RLS) explicitly and prefer forcing it with
ALTER TABLE ... ENABLE ROW LEVEL SECURITY;andALTER TABLE ... FORCE ROW LEVEL SECURITY;".supabase/tests/test_grida_www_domain_rls_test.sql (2)
177-249: Consider adding a DELETE isolation test.The test suite covers read, insert, and update isolation but is missing a DELETE isolation test. Per coding guidelines, write isolation tests should include insert/update/delete.
Example DELETE test to add
-- 14) Insider can delete a local domain row they inserted SELECT test_set_auth('insider@grida.co'); DO $$ DECLARE rc integer; BEGIN BEGIN DELETE FROM grida_www.domain WHERE www_id = current_setting('test.www_id_local')::uuid AND hostname = 'app.example.com'; GET DIAGNOSTICS rc = ROW_COUNT; PERFORM set_config('test.delete_insider_rowcount', rc::text, true); EXCEPTION WHEN others THEN PERFORM set_config('test.delete_insider_rowcount', '-1', true); END; END $$; SELECT is(current_setting('test.delete_insider_rowcount'), '1', 'insider can delete local domain row'); SELECT test_reset_auth(); -- 15) Random cannot delete local domain row SELECT test_set_auth('random@example.com'); DO $$ DECLARE rc integer; BEGIN BEGIN DELETE FROM grida_www.domain WHERE www_id = current_setting('test.www_id_local')::uuid AND hostname = 'example.com'; GET DIAGNOSTICS rc = ROW_COUNT; PERFORM set_config('test.delete_random_rowcount', rc::text, true); EXCEPTION WHEN others THEN PERFORM set_config('test.delete_random_rowcount', '-1', true); END; END $$; SELECT is(current_setting('test.delete_random_rowcount'), '0', 'random cannot delete local domain row'); SELECT test_reset_auth();Update the plan to
SELECT plan(15);if adding these tests.As per coding guidelines: "Any change that alters RLS, permissions, or tenant boundaries must ship with pgTAP tests including read isolation tests and write isolation tests (insert/update/delete)".
251-277: Minor:test_reset_auth()called afteranonrole may be redundant.In tests 12 and 13, after
SET ROLE anon, you calltest_reset_auth()which internally doesRESET ROLE. This works, but the function also setsrequest.jwt.claim.subto empty which is unnecessary when coming fromanon. Consider usingRESET ROLEdirectly for clarity, or this is fine as-is since it's harmless.editor/proxy.ts (1)
23-26: Fix typos in contributor mode warning.
- Line 24: "CONTRIBUTER" → "CONTRIBUTOR" and "Backedn" → "Backend"
📝 Proposed fix
if (!process.env.NEXT_PUBLIC_SUPABASE_URL) { console.warn( - "[CONTRIBUTER MODE]: Supabase Backedn is not configured - some feature may restricted" + "[CONTRIBUTOR MODE]: Supabase Backend is not configured - some features may be restricted" ); }editor/app/(api)/private/~/[org]/[proj]/www/domains/[hostname]/_refresh.ts (2)
61-63: Unusedreqparameter in function signature.The
req: NextRequestparameter is passed but never used in the function body. Consider removing it if it's not needed for future use.📝 Proposed fix
export async function refreshDomain( - req: NextRequest, { params }: { params: Promise<Params> } ) {Note: This would require updating the callers in
refresh/route.tsandverify/route.tsas well.
121-135: Consider adding error handling for database update operations.The error handling DB update (Lines 121-135) and canonical cleanup update (Lines 208-212) don't check for errors. While these are best-effort operations, silent failures could leave the system in an inconsistent state.
🛡️ Optional: Add error logging for failed updates
await wwwClient .from("domain") .update({ status: "error", // ... }) - .eq("id", domain_row.id); + .eq("id", domain_row.id) + .then(({ error }) => { + if (error) console.error("Failed to update domain error state:", error); + });Similarly for the canonical cleanup:
- await wwwClient + const { error: canonicalErr } = await wwwClient .from("domain") .update({ canonical: false }) .eq("www_id", www.id) .neq("id", domain_row.id); + if (canonicalErr) { + console.error("Failed to clear other canonical domains:", canonicalErr); + }Also applies to: 208-212
editor/app/(api)/private/~/[org]/[proj]/www/domains/platform/canonical/route.ts (2)
15-17: Unusedreqparameter.The
req: NextRequestparameter is declared but never used in this handler. If authentication is handled elsewhere (e.g., middleware), consider removing it to clarify intent. If auth checks should be here, they're missing.♻️ If no request body/auth needed, simplify signature
export async function POST( - req: NextRequest, + _req: NextRequest, { params }: { params: Promise<Params> } ) {
36-46: Consider handling "no rows updated" scenario.The update operation succeeds even if no domains exist for this
www_id. While functionally benign (clearing zero rows is a no-op), you might want to log or return a different response if the tenant has no custom domains to clear.editor/lib/domains/index.test.ts (2)
107-112: Consider additional edge cases forgetDomainKind.The current test covers basic two-label (apex) and three-label (subdomain) cases. However, the heuristic may misclassify:
- Single-label hostnames (e.g.,
localhost) → returns"subdomain"(1 part ≠ 2)- Country-code TLDs like
example.co.uk→ returns"subdomain"(3 parts)If these are intentional limitations, consider documenting them. Otherwise, consider adding tests to capture expected behavior.
🧪 Suggested additional test cases
describe("getDomainKind", () => { test("treats two labels as apex, more as subdomain", () => { expect(getDomainKind("example.com")).toBe("apex"); expect(getDomainKind("app.example.com")).toBe("subdomain"); }); + + test("single label is classified as subdomain (edge case)", () => { + expect(getDomainKind("localhost")).toBe("subdomain"); + }); + + test("country-code TLDs are classified as subdomain (known limitation)", () => { + // example.co.uk has 3 parts, so heuristic treats it as subdomain + expect(getDomainKind("example.co.uk")).toBe("subdomain"); + }); });
14-16: Consider adding a test for simple lowercase normalization.The test "lowercases and strips trailing dot" validates both behaviors simultaneously. A separate test for just lowercasing a valid hostname (e.g.,
"EXAMPLE.COM"→"example.com") would improve clarity.editor/app/(api)/private/~/[org]/[proj]/www/domains/[hostname]/route.ts (1)
11-24: Helper functions are duplicated across route files.
isPlainObject,errorMessage, anderrorBodyappear to be utility functions that could be shared. Consider extracting them to a shared module (e.g.,@/lib/errorsor@/lib/utils) to reduce duplication.editor/app/(api)/internal/resolve-host/route.ts (2)
98-107: Consider using timing-safe comparison for token validation.The token comparison
token !== expectedis vulnerable to timing attacks. While the risk is low for internal tokens, usingcrypto.timingSafeEqualis a security best practice.🔒 Timing-safe comparison
+import { timingSafeEqual } from "crypto"; + +function safeCompare(a: string, b: string): boolean { + if (a.length !== b.length) return false; + return timingSafeEqual(Buffer.from(a), Buffer.from(b)); +} + export async function GET(req: NextRequest) { const token = req.headers.get("x-grida-internal-token"); const expected = process.env.GRIDA_INTERNAL_PROXY_TOKEN; - if (!expected || token !== expected) { + if (!expected || !token || !safeCompare(token, expected)) { return NextResponse.json( { error: "unauthorized" }, { status: 401, headers: { "cache-control": "no-store" } } ); }
13-17: Inconsistent hostname normalization logic.This
normalizeHostfunction only strips ports, whilenormalizeHostnamein@/lib/domainsperforms comprehensive validation (rejecting schemes, paths, queries, etc.). Consider reusingnormalizeHostnamefor consistency, or document why a lighter normalization is acceptable here.editor/.env.example (1)
60-62: Consider documenting token requirements.For
GRIDA_INTERNAL_PROXY_TOKEN, it would be helpful to document the expected format or minimum entropy requirements (e.g., "use a cryptographically random string of at least 32 characters").📝 Documentation suggestion
# internal (proxy -> internal api auth) # used to authenticate requests from `proxy.ts` to `/internal/resolve-host` +# generate with: openssl rand -hex 32 GRIDA_INTERNAL_PROXY_TOKEN=""editor/lib/tenant/middleware.ts (2)
173-263: Duplicated resolution logic between internal call and DB fallback.The DB resolution logic (lines 173-263) largely duplicates what's in
resolveHostCachedin the internal resolver. If both must exist for reliability, consider extracting the shared logic to a common utility to ensure consistency.
71-287: Function is complex; consider decomposition.
routeProxyRequesthandles many distinct cases (localhost, app hosts, platform hosts, custom domains, canonicalization) in a single 200+ line function. Consider breaking it into smaller, focused helper functions for each routing scenario to improve readability and testability.editor/app/(workbench)/[org]/[proj]/(console)/(resources)/domains/section-domains.tsx (1)
40-42: Reuse shared domain types to avoid drift.
DomainStatus/DomainKindalready exist in the shared domain module; duplicating them here risks divergence between UI and backend logic. Consider importing the shared types instead.♻️ Suggested change
-import React, { useMemo, useState } from "react"; +import React, { useMemo, useState } from "react"; +import type { DomainKind, DomainStatus } from "@/lib/domains"; @@ -type DomainStatus = "pending" | "active" | "error"; -type DomainKind = "apex" | "subdomain";
| **Vercel requirement** | ||
|
|
||
| - A record must point to Vercel’s anycast IP: `76.76.21.21` | ||
|
|
||
| This value is considered stable infrastructure knowledge (as long as Vercel remains the edge provider). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
Vercel custom domain apex A record anycast IP 76.76.21.21 current 2025
💡 Result:
- Yes —
76.76.21.21is (and throughout 2025 was) a Vercel “universal”/Anycast A record IP commonly used for apex/root domains (example.com). Vercel support and community guidance in March 2025 still explicitly recommended setting the apex A record to76.76.21.21. [2] - However, by mid/late 2025 Vercel’s guidance changed: for apex domains using A records, you should use the “recommended IP address provided in your project’s Domain Settings”, which is chosen from a pool of optimized Anycast IPs (so it may not always be
76.76.21.21). [3] 76.76.21.21continued to be referenced in late 2025 as the “universal apex IP” in Vercel community troubleshooting contexts. [1]
Practical takeaway (for “current” config): if you’re configuring an apex domain on Vercel, prefer the exact A-record IP shown in your Vercel project’s Domain Settings; 76.76.21.21 is the legacy/universal value and may still work in some cases, but it’s no longer the one-size-fits-all recommendation. [3]
Sources: [1] [2] [3]
Update to reflect current Vercel domain configuration guidance.
The hardcoded anycast IP 76.76.21.21 is outdated. As of mid-2025, Vercel changed guidance for apex A records: instead of using the universal anycast IP, users should use the exact A-record IP address provided in their Vercel project's Domain Settings, which is selected from an optimized pool of anycast IPs. While 76.76.21.21 may still work as a legacy fallback, document the current best practice of retrieving the project-specific IP from Domain Settings.
🤖 Prompt for AI Agents
In `@docs/wg/platform/multi-tenant-custom-domain-vercel.md` around lines 80 - 84,
Update the "**Vercel requirement**" section that currently hardcodes the anycast
IP `76.76.21.21` to instead instruct users to obtain the exact A-record IP from
their Vercel project's Domain Settings (project-specific IP selected from
Vercel's optimized anycast pool); note that `76.76.21.21` can be mentioned as a
possible legacy fallback but mark the recommended best practice as using the
project-specific IP provided in Domain Settings and adjust the explanatory
sentence accordingly.
| // Set requested domain canonical, clear others. | ||
| const { error: clear_err } = await wwwClient | ||
| .from("domain") | ||
| .update({ canonical: false }) | ||
| .eq("www_id", www.id); | ||
| if (clear_err) { | ||
| return NextResponse.json( | ||
| { error: { code: "DB_ERROR", message: clear_err.message } }, | ||
| { status: 500 } | ||
| ); | ||
| } | ||
|
|
||
| const { data: updated, error: set_err } = await wwwClient | ||
| .from("domain") | ||
| .update({ canonical: true }) | ||
| .eq("id", target.id) | ||
| .select() | ||
| .single(); | ||
| if (set_err) { | ||
| return NextResponse.json( | ||
| { error: { code: "DB_ERROR", message: set_err.message } }, | ||
| { status: 500 } | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-atomic canonical flag update may leave inconsistent state.
The clear and set operations are performed sequentially without a transaction. If the update({ canonical: true }) fails after the update({ canonical: false }) succeeds, all domains will have canonical: false, leaving no canonical domain.
Consider wrapping these operations in a transaction or using a single conditional update to ensure atomicity.
🔧 Alternative: single atomic update using conditional logic
If Supabase/Postgres supports it, you could use a single statement or RPC:
-- Example RPC that atomically sets canonical
UPDATE domain
SET canonical = (id = $target_id)
WHERE www_id = $www_id;Or handle rollback in application code if the second update fails.
🤖 Prompt for AI Agents
In
`@editor/app/`(api)/private/~/[org]/[proj]/www/domains/[hostname]/canonical/route.ts
around lines 68 - 91, The two sequential updates using
wwwClient.from("domain").update({ canonical: false }).eq("www_id", www.id) and
wwwClient.from("domain").update({ canonical: true }).eq("id", target.id) are not
atomic and can leave no canonical domain if the second update fails; change this
to an atomic operation—either run both updates inside a database transaction (so
you rollback on failure) or replace them with a single conditional update/RPC
(e.g., a single UPDATE that sets canonical = (id = target_id) WHERE www_id =
www_id) invoked via wwwClient, and ensure you check and return any
transaction/RPC errors the same way you currently handle clear_err/set_err.
| // Remove from Vercel project | ||
| try { | ||
| await projectsRemoveProjectDomain(hostname); | ||
| } catch (e: unknown) { | ||
| return NextResponse.json( | ||
| { | ||
| error: { | ||
| code: "VERCEL_ERROR", | ||
| message: errorMessage(e) ?? "Vercel domain removal failed.", | ||
| provider: { name: "vercel", detail: errorBody(e) }, | ||
| }, | ||
| }, | ||
| { status: 502 } | ||
| ); | ||
| } | ||
|
|
||
| const { error: del_err } = await wwwClient | ||
| .from("domain") | ||
| .delete() | ||
| .eq("id", domain_row.id); | ||
| if (del_err) { | ||
| return NextResponse.json( | ||
| { error: { code: "DB_ERROR", message: del_err.message } }, | ||
| { status: 500 } | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Operation order may leave orphaned DB records on partial failure.
The domain is removed from Vercel (lines 82-95) before being deleted from the database (lines 97-106). If the DB deletion fails, the domain record remains in the database but is no longer attached to Vercel, creating an inconsistent state that may confuse users or cause issues on retry.
Consider reversing the order (delete from DB first, then Vercel) or implementing a compensating action (re-add to Vercel on DB failure).
🤖 Prompt for AI Agents
In `@editor/app/`(api)/private/~/[org]/[proj]/www/domains/[hostname]/route.ts
around lines 81 - 106, The current sequence deletes the Vercel domain via
projectsRemoveProjectDomain before removing the DB row
(wwwClient.from("domain").delete().eq("id", domain_row.id)), which can leave an
orphaned DB record on DB failure; change the flow to delete the DB row first,
then call projectsRemoveProjectDomain, and if the Vercel removal fails attempt a
compensating rollback by re-inserting the saved domain_row via
wwwClient.from("domain").insert(domain_row) (or wrap both operations in a
transaction/atomic workflow), and return appropriate NextResponse.json errors on
each failure path.
| const key = "site-domains"; | ||
|
|
||
| const { data, isLoading, error } = useSWR<ProjectWWWMinimal>( | ||
| key, | ||
| async () => { | ||
| const { data } = await client | ||
| .from("www") | ||
| .select("id, name, project_id") | ||
| .eq("project_id", project.id) | ||
| .single() | ||
| .throwOnError(); | ||
|
|
||
| return data satisfies ProjectWWWMinimal; | ||
| } | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SWR cache key should be project-scoped to prevent cross-project data collision.
The key "site-domains" is static and doesn't include the project identifier. If a user opens domain pages for multiple projects in different tabs, they'll share the same SWR cache, potentially displaying incorrect data.
🐛 Proposed fix
- const key = "site-domains";
+ const key = `site-domains:${project.id}`;
const { data, isLoading, error } = useSWR<ProjectWWWMinimal>(
key,Also update the mutate calls in changeDomainName to use the scoped key:
- mutate(key);
- mutate("site");
+ mutate(`site-domains:${data.id}`);
+ mutate(`site:${project.id}`);Note: Verify that the "site" key used elsewhere is also project-scoped, otherwise synchronization between tabs may not work as intended.
🤖 Prompt for AI Agents
In `@editor/app/`(workbench)/[org]/[proj]/(console)/(resources)/domains/page.tsx
around lines 24 - 38, The SWR cache key "site-domains" is global and must be
made project-scoped to avoid cross-project collisions: change the key used in
useSWR (the variable key) to include the project.id (e.g.
`${project.id}:site-domains`) and update any mutate calls inside
changeDomainName that reference the old "site-domains" key to use the new scoped
key; also review other related keys such as "site" to ensure they are similarly
namespaced to project.id so tab synchronization stays correct.
| function RenamePlatformDomainDialog({ | ||
| open, | ||
| onOpenChange, | ||
| defaultName, | ||
| onSubmit, | ||
| }: { | ||
| open: boolean; | ||
| onOpenChange: (open: boolean) => void; | ||
| defaultName: string; | ||
| onSubmit: (name: string) => Promise<boolean>; | ||
| }) { | ||
| const [name, setName] = useState(defaultName); | ||
| const [busy, setBusy] = useState(false); | ||
| const [error, setError] = useState<string | null>(null); | ||
|
|
||
| const dirty = useMemo(() => name !== defaultName, [name, defaultName]); | ||
|
|
||
| const onSubmitHandler = async () => { | ||
| setBusy(true); | ||
| const ok = await onSubmit(name); | ||
| setBusy(false); | ||
| if (ok) { | ||
| toast.success("Domain updated"); | ||
| onOpenChange(false); | ||
| } else { | ||
| setError( | ||
| "This domain is either already taken or not allowed. Please try a different name using only letters, numbers, or dashes." | ||
| ); | ||
| } | ||
| }; | ||
|
|
||
| return ( | ||
| <Dialog open={open} onOpenChange={onOpenChange}> | ||
| <DialogContent> | ||
| <DialogHeader> | ||
| <DialogTitle>Edit platform domain</DialogTitle> | ||
| <DialogDescription> | ||
| This changes your {defaultName}.grida.site hostname. | ||
| </DialogDescription> | ||
| </DialogHeader> | ||
| <Field className="py-2"> | ||
| <FieldLabel className="sr-only">Domain name</FieldLabel> | ||
| <div className="flex h-9 items-center border rounded-md px-3 py-1 focus-within:ring-2 focus-within:ring-ring focus-within:ring-offset-2 bg-muted"> | ||
| <Input | ||
| className="border-none bg-transparent focus-visible:ring-0 focus-visible:ring-offset-0 px-0 shadow-none" | ||
| placeholder="your-domain" | ||
| disabled={busy} | ||
| value={name} | ||
| onChange={(e) => { | ||
| setName(e.target.value); | ||
| setError(null); | ||
| }} | ||
| /> | ||
| <span className="ml-2 text-muted-foreground text-sm"> | ||
| .grida.site | ||
| </span> | ||
| </div> | ||
| <FieldDescription | ||
| data-error={!!error} | ||
| className="text-xs text-muted-foreground data-[error=true]:text-destructive" | ||
| > | ||
| {error | ||
| ? error | ||
| : "lowercase letters, numbers, and dashes are allowed"} | ||
| </FieldDescription> | ||
| </Field> | ||
| <DialogFooter> | ||
| <DialogClose asChild> | ||
| <Button variant="ghost" size="sm"> | ||
| Cancel | ||
| </Button> | ||
| </DialogClose> | ||
| <Button onClick={onSubmitHandler} disabled={!dirty || busy} size="sm"> | ||
| Save changes | ||
| </Button> | ||
| </DialogFooter> | ||
| </DialogContent> | ||
| </Dialog> | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reset dialog state on reopen/defaultName changes.
useState(defaultName) only runs once, so reopening the dialog after a rename can show stale input and lingering error text. Reset the state when open or defaultName changes.
🛠️ Suggested change
-import React, { useMemo, useState } from "react";
+import React, { useEffect, useMemo, useState } from "react";
@@
const [name, setName] = useState(defaultName);
const [busy, setBusy] = useState(false);
const [error, setError] = useState<string | null>(null);
+ useEffect(() => {
+ if (open) {
+ setName(defaultName);
+ setError(null);
+ }
+ }, [open, defaultName]);🤖 Prompt for AI Agents
In
`@editor/app/`(workbench)/[org]/[proj]/(console)/(resources)/domains/section-domains.tsx
around lines 633 - 712, The dialog's local state (name, error, busy) is
initialized once with useState(defaultName) and not reset when the dialog
reopens or defaultName changes; update RenamePlatformDomainDialog by adding an
effect (useEffect) that runs when open or defaultName changes and resets
setName(defaultName), setError(null) and setBusy(false) (or appropriate initial
values) so the input and error message are fresh on each open/defaultName
update; keep existing handlers (onSubmitHandler, Input onChange) unchanged.
| export function getDomainKind(hostname: string): DomainKind { | ||
| // Heuristic: if it has exactly one dot, it is an apex domain (example.com). | ||
| // Any additional label makes it a subdomain (app.example.com). | ||
| const parts = hostname.split(".").filter(Boolean); | ||
| return parts.length === 2 ? "apex" : "subdomain"; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle multi‑part TLDs in domain kind detection.
The current dot-count heuristic misclassifies apex domains like example.co.uk as subdomains, which can break DNS instructions and verification flow for common ccTLDs. Consider using a Public Suffix List–based parser to determine eTLD+1 accurately.
🧭 Example using a PSL parser (requires dependency)
+import { parse } from "tldts";
@@
export function getDomainKind(hostname: string): DomainKind {
- const parts = hostname.split(".").filter(Boolean);
- return parts.length === 2 ? "apex" : "subdomain";
+ const parsed = parse(hostname);
+ if (!parsed.domain) return "subdomain";
+ return parsed.subdomain ? "subdomain" : "apex";
}🤖 Prompt for AI Agents
In `@editor/lib/domains/index.ts` around lines 44 - 49, The current
getDomainKind(hostname: string) uses a dot-count heuristic that misclassifies
multi‑part TLDs; replace it with a Public Suffix List–based check (e.g., using a
PSL parser like the "psl" or "tldts" package) to compute the eTLD+1 for the
hostname and then return "apex" when the eTLD+1 equals the original hostname,
otherwise "subdomain"; update references to DomainKind as needed and add/adjust
unit tests to cover examples like example.co.uk and app.example.co.uk.
| if (internalToken && deployHost) { | ||
| try { | ||
| const u = new URL("/internal/resolve-host", deployHost); | ||
| u.searchParams.set("host", hostname); | ||
| const r = await fetch(u.toString(), { | ||
| method: "GET", | ||
| headers: { "x-grida-internal-token": internalToken }, | ||
| }); | ||
|
|
||
| if (r.ok) { | ||
| const json = (await r.json().catch(() => null)) as { | ||
| data: { www_name: string; canonical_host: string } | null; | ||
| } | null; | ||
| const data = json?.data ?? null; | ||
| if (data?.www_name) { | ||
| if (data.canonical_host && data.canonical_host !== hostname) { | ||
| const redirectTo = new URL( | ||
| req.nextUrl.pathname + req.nextUrl.search, | ||
| `https://${data.canonical_host}` | ||
| ); | ||
| return NextResponse.redirect(redirectTo, { status: 301 }); | ||
| } | ||
|
|
||
| const rewritten = req.nextUrl.clone(); | ||
| rewritten.pathname = `/~/${data.www_name}${req.nextUrl.pathname}`; | ||
| return NextResponse.rewrite(rewritten, { | ||
| request: { headers: req.headers }, | ||
| status: res.status, | ||
| }); | ||
| } | ||
| } | ||
| } catch { | ||
| // fall back to DB resolution below | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add timeout to internal resolver fetch.
The fetch call to the internal resolver has no timeout. If the internal endpoint is slow or unresponsive, this could block the middleware and degrade user experience. Consider using AbortController with a timeout.
⏱️ Add fetch timeout
+ const controller = new AbortController();
+ const timeoutId = setTimeout(() => controller.abort(), 3000);
+
try {
const u = new URL("/internal/resolve-host", deployHost);
u.searchParams.set("host", hostname);
const r = await fetch(u.toString(), {
method: "GET",
headers: { "x-grida-internal-token": internalToken },
+ signal: controller.signal,
});
+ clearTimeout(timeoutId);
if (r.ok) {🤖 Prompt for AI Agents
In `@editor/lib/tenant/middleware.ts` around lines 137 - 171, The internal
resolver fetch (triggered when internalToken && deployHost, building new
URL("/internal/resolve-host", deployHost)) needs a timeout: create an
AbortController, start a timeout (e.g. 2s) that calls controller.abort(), pass
controller.signal into the fetch options (headers + signal), and clear the
timeout after the fetch completes or fails; ensure the catch block handles
aborts gracefully (same fallback to DB resolution). This change should be
applied around the existing fetch call so the fetch is cancelled on timeout and
resources are cleaned up.
day-316-grida-sites-byod.mp4
Release Notes
New Features
Documentation