-
Notifications
You must be signed in to change notification settings - Fork 1
feat(api.iam): auto tenant root workspace creation #210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…lembic - Add SQLAlchemy 2.0 with asyncpg for async database operations - Add Alembic for schema migrations - Add python-ulid for ULID support instead of UUID - Create read/write engine separation with connection pooling - Create FastAPI dependency injection for database sessions - Create SQLAlchemy declarative base with timestamp mixin - Initialize Alembic with async migration support - Create initial migration for teams table (ULID primary key) - Add comprehensive unit tests for engines and dependencies - Configure Alembic to use settings module for database URL - Enable ruff post-write hook for migration formatting Refs: AIHCM-121
- Add authzed library for SpiceDB integration - Add python-ulid for ULID support - Create ResourceType, RelationType, Permission enums (using Group not Team) - Create AuthorizationProvider protocol for swappable implementations - Implement SpiceDBClient with async methods for relationships and permissions - Create SpiceDB schema (.zed) with Tenant→Workspace→Group hierarchy - Create AuthorizationProbe for domain-oriented observability - Move ObservationContext to shared_kernel (fix architectural boundary) - Add 35 unit tests for types and probes - All 410 tests passing Refs: AIHCM-122
Resolved conflicts in authorization files by accepting remote changes: - shared_kernel/authorization/types.py (docstring fix) - shared_kernel/authorization/spicedb/client.py (_parse_reference helper)
Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
…dd route documentation Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
…th TDD Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Fix workspace definition in schema.zed and ConfigMap to match the relationships actually created by the IAM outbox translator. Schema changes (workspace definition): - Add `relation tenant: tenant` for organizational ownership - Change `relation parent: tenant` to `relation parent: workspace` for hierarchy - Rename `owner` to `admin` for consistency with tenant/group definitions - Rename `permission delete` to `permission manage` for consistency - Add Phase 3 comments for member/permission usage ConfigMap changes (full sync with schema.zed): - Apply all workspace definition fixes above - Add missing `relation member: user` to tenant definition - Fix tenant `permission view = admin` to `permission view = admin + member` - Add missing `permission administrate = admin` to tenant definition - Add missing `api_key` definition (was in schema.zed but not ConfigMap) - Add future resource type comments Inconsistencies found and documented: 1. Schema had `relation parent: tenant` but translator writes `workspace#tenant@tenant` (relation name 'tenant') and `workspace#parent@workspace` (parent type 'workspace') 2. ConfigMap was missing tenant `member` relation, `administrate` permission, and entire `api_key` definition 3. RelationType.WORKSPACE enum exists but is unused by any translator 4. Permission.DELETE enum value corresponds to removed `permission delete` in workspace; may need cleanup in Phase 3 5. Schema `owner` relation on workspace renamed to `admin` to align with tenant and group naming conventions All 970 unit tests pass (3 pre-existing SSL failures unrelated). Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Remove RelationType.WORKSPACE and Permission.DELETE which have no usage in the codebase. Neither value corresponds to any relation or permission in the current SpiceDB schema. They can be re-added when future resource types (knowledge_graph, data_source) are implemented. Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
…tion Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
…nshift-hyperfleet/kartograph into jsell/feat/AIHCM-146-workspace-authz
Integrates workspace lifecycle with tenant lifecycle (AIHCM-147): - TenantService constructor now accepts workspace_repository - create_tenant auto-provisions a root workspace using settings default_workspace_name, falling back to tenant name - delete_tenant cascades to workspaces before groups/API keys, ensuring WorkspaceDeleted events are emitted for SpiceDB cleanup - TenantServiceProbe updated with workspaces_count parameter - Dependency injection wired up with workspace repository - 8 new unit tests covering create and delete workspace integration Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
The clean_iam_data fixture was missing workspace deletion, causing FK constraint violations when tests created workspaces. Adds workspace cleanup in correct order (children then roots) before tenant deletion, and introduces a TestCleanupProbe following Domain-Oriented Observability to replace silent exception swallowing with structured domain events. Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
WalkthroughThis PR transitions the authorization model from owner-based to admin-based permissions for workspaces, integrates workspace management into the tenant lifecycle, and adds bootstrap logic to ensure a default workspace on tenant creation. Changes include: updating SpiceDB schema definitions to replace owner relations with admin relations and adding tenant/parent workspace relations; adding workspace cascade deletion to TenantService prior to groups and API keys; introducing TenantBootstrapService for idempotent default tenant and workspace provisioning; modifying observability probes to track workspace counts and bootstrap events; removing WORKSPACE and DELETE enum members; and enhancing integration and unit test coverage with new fixtures and test scenarios for workspace API endpoints and cleanup procedures. Sequence Diagram(s)sequenceDiagram
participant App as Application Startup
participant BS as TenantBootstrapService
participant TR as TenantRepository
participant WR as WorkspaceRepository
participant DB as Database
participant Probe as StartupProbe
App->>BS: ensure_default_tenant_with_workspace(tenant_name, workspace_name)
BS->>DB: begin transaction
alt Tenant does not exist
BS->>BS: _create_tenant_with_race_handling()
BS->>TR: save(tenant)
TR->>DB: INSERT INTO tenants
DB-->>TR: tenant_id
TR-->>BS: Tenant
Probe->>Probe: default_tenant_bootstrapped()
else Tenant exists (race condition)
BS->>Probe: default_tenant_already_exists()
end
alt Root workspace does not exist
BS->>BS: _ensure_root_workspace(tenant, workspace_name)
BS->>WR: save(workspace)
WR->>DB: INSERT INTO workspaces
DB-->>WR: workspace_id
WR-->>BS: Workspace
Probe->>Probe: default_workspace_bootstrapped()
else Root workspace exists
BS->>Probe: default_workspace_already_exists()
end
BS->>DB: commit transaction
DB-->>BS: success
BS-->>App: Tenant with root workspace
sequenceDiagram
participant Service as TenantService
participant SessionMgr as Session Manager
participant TR as TenantRepository
participant WR as WorkspaceRepository
participant GR as GroupRepository
participant KR as ApiKeyRepository
participant DB as Database
participant Probe as TenantServiceProbe
Service->>Service: delete_tenant(tenant_id)
Service->>SessionMgr: begin transaction
Service->>WR: get_by_tenant(tenant_id)
WR->>DB: SELECT FROM workspaces
DB-->>WR: workspaces[]
WR-->>Service: workspaces[]
Probe->>Probe: tenant_cascade_deletion_started(tenant_id, workspaces_count)
loop For each workspace (depth-first by hierarchy)
Service->>WR: delete(workspace_id)
WR->>DB: DELETE FROM workspaces
end
Service->>GR: get_by_tenant(tenant_id)
GR->>DB: SELECT FROM groups
loop For each group
Service->>GR: delete(group_id)
GR->>DB: DELETE FROM groups
end
Service->>KR: get_by_tenant(tenant_id)
KR->>DB: SELECT FROM api_keys
loop For each API key
Service->>KR: delete(api_key_id)
KR->>DB: DELETE FROM api_keys
end
Service->>TR: delete(tenant_id)
TR->>DB: DELETE FROM tenants
Service->>SessionMgr: commit
DB-->>Service: success
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 🧹 Recent nitpick comments
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Comment |
…workspace creation The walking skeleton bootstrap in main.py was creating the default tenant directly via domain aggregates, bypassing TenantService which now auto-creates root workspaces. This resulted in the default tenant having no root workspace. Introduces TenantBootstrapService with minimal dependencies (no authz, no group/API key repos) specifically for startup bootstrap operations. Updates main.py to use this service, ensuring the default tenant gets its root workspace just like tenants created via the API. Also extends StartupProbe with workspace bootstrap observability methods. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/api/iam/application/services/tenant_service.py`:
- Around line 364-367: The loop that deletes workspaces in TenantService
currently iterates arbitrary order and can delete parents before children
causing FK errors; before calling workspace.mark_for_deletion() and
self._workspace_repository.delete(workspace) on items returned from
list_by_tenant(), topologically sort or compute depth for each workspace using
parent_workspace_id (e.g., build a parent->children map and perform a post-order
traversal or sort by descending depth so leaves come first) and then iterate the
sorted list to mark and delete; alternatively, if acceptable, change the DB FK
to ondelete="CASCADE" but prefer implementing the depth/topological ordering
inside the method that handles workspaces prior to deletion.
In `@src/api/tests/integration/iam/conftest.py`:
- Around line 136-158: The mypy error is due to using the generic Result return
type while accessing result.rowcount; fix by annotating the variable returned
from async_session.execute as sqlalchemy.engine.CursorResult so mypy knows
rowcount exists (import CursorResult and write e.g. result: CursorResult = await
async_session.execute(...)) for the three DELETE calls (the assignments that
feed probe.table_cleaned) or alternatively use typing.cast to cast the execute
return to CursorResult before accessing rowcount; reference
async_session.execute, Result/CursorResult, result.rowcount, and
probe.table_cleaned when making the change.
SQLAlchemy 2.0 autobegins a transaction on the first query. In create_workspace(), queries were executed before session.begin() was called, causing "A transaction is already begun on this Session" error. Restructured create_workspace() to wrap all operations (validation queries + save) inside a single session.begin() block, matching the pattern used in TenantService.create_tenant(). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The parent_workspace_id foreign key should prevent deleting a parent workspace while children exist. If the database has a different constraint (e.g., SET NULL from an earlier state), this migration fixes it. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ldren SQLAlchemy by default sets child foreign keys to NULL before deleting a parent, which bypasses database RESTRICT constraints. Adding passive_deletes=True to the child_workspaces relationship lets the database handle FK constraint enforcement, correctly blocking deletion of workspaces that have children. Also adds integration tests for workspace deletion behavior. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@src/api/iam/application/services/tenant_bootstrap_service.py`:
- Around line 146-170: Wrap the call to
self._workspace_repository.save(workspace) in an IntegrityError handler: after
Workspace.create_root(...) and before emitting default_workspace_bootstrapped,
catch the DB integrity exception raised by
_workspace_repository.save(workspace), then re-query via
self._workspace_repository.get_root_workspace(tenant.id); if an
existing_workspace is returned call
self._probe.default_workspace_already_exists(...) with the existing workspace
details and return, otherwise re-raise the exception; keep existing successful
path that calls self._probe.default_workspace_bootstrapped(...) when save
succeeds.
- Around line 115-133: The save of the new Tenant must be isolated in a nested
transaction/savepoint so a DuplicateTenantNameError (from IntegrityError)
doesn't mark the outer transaction for rollback and cause PendingRollbackError
on the subsequent get_by_name(); modify the tenant creation block in
TenantBootstrapService (where Tenant.create and _tenant_repository.save are
called) to begin a nested transaction/savepoint (e.g., session.begin_nested() or
repository-provided savepoint context) around the save, commit or rollback only
that savepoint on success/error, catch DuplicateTenantNameError to rollback the
savepoint and then safely call _tenant_repository.get_by_name to re-query and
emit the same _probe.default_tenant_already_exists events (use the same
exception handling and probe calls as before).
In `@src/api/tests/integration/iam/test_workspace_api.py`:
- Around line 142-156: The test_cannot_delete_root_workspace reads resp.json()
from the GET to "/iam/workspaces" without checking the response code; add an
explicit assertion that the GET response status is 200 (e.g., assert
resp.status_code == 200) immediately after the async_client.get call before
calling resp.json(), so the test fails fast if the list call fails.
- Around line 117-140: In test_can_delete_workspace_without_children, the
initial GET to async_client.get("/iam/workspaces") doesn't assert the response
status before accessing resp.json(), so failures surface as JSON-shape errors;
add an explicit assertion that resp.status_code == 200 (or use
resp.is_successful equivalent) immediately after the GET call to validate the
request succeeded before using resp.json(), and keep this check near the
async_client.get call and before locating the root workspace variable.
🧹 Nitpick comments (1)
src/api/infrastructure/migrations/versions/fdd9505854c5_fix_workspace_parent_fk_to_use_restrict.py (1)
49-52: Consider making the no-op downgrade explicit or implementing proper reversal.A silent no-op creates a mismatch between Alembic's state (migration reverted) and the actual database (RESTRICT still in place). If someone later re-runs the upgrade,
drop_constraintwill fail because the constraint still exists with the same name.Consider either:
- Implementing the actual reversal to the previous behavior (even if undesirable), or
- Raising an error to make it explicit that downgrade is unsupported.
Option 2: Make unsupported downgrade explicit
def downgrade() -> None: - """Revert to previous state (no change needed, RESTRICT is correct).""" - # No-op: we don't want to revert to a broken state - pass + """Downgrade is not supported for this migration. + + RESTRICT is the correct behavior and reverting would restore a broken state. + """ + raise NotImplementedError( + "Downgrade not supported: reverting to non-RESTRICT behavior is not allowed" + )
- TenantService: Sort workspaces by depth before deletion (children first) to avoid FK constraint violations - TenantBootstrapService: Use savepoints (begin_nested) for tenant and workspace creation to avoid PendingRollbackError on race conditions - conftest.py: Use cast(CursorResult, ...) for DELETE execute results to satisfy mypy rowcount access - test_workspace_api.py: Add status code assertions before accessing resp.json() in deletion tests - Migration downgrade: Raise NotImplementedError instead of no-op pass Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The migration fdd9505854c5 was unnecessary - the original workspaces table migration (205809969bf4) already specified ON DELETE RESTRICT for the parent_workspace_id FK. The actual fix for parent deletion was adding passive_deletes=True to the SQLAlchemy relationship. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary by CodeRabbit
Release Notes