Skip to content

Conversation

@jsell-rh
Copy link
Collaborator

@jsell-rh jsell-rh commented Feb 10, 2026

Summary by CodeRabbit

Release Notes

  • New Features
    • Introduced hierarchical workspace structures with parent-child relationships for better organizational management.
    • Added API key management with tenant-scoped access controls for secure authentication.
    • Implemented automatic default tenant and root workspace provisioning on application startup.
    • Updated permission model: admin-based access control replacing previous ownership model.
    • Added workspace deletion constraints to protect parent workspaces with children.

…lembic

- Add SQLAlchemy 2.0 with asyncpg for async database operations
- Add Alembic for schema migrations
- Add python-ulid for ULID support instead of UUID
- Create read/write engine separation with connection pooling
- Create FastAPI dependency injection for database sessions
- Create SQLAlchemy declarative base with timestamp mixin
- Initialize Alembic with async migration support
- Create initial migration for teams table (ULID primary key)
- Add comprehensive unit tests for engines and dependencies
- Configure Alembic to use settings module for database URL
- Enable ruff post-write hook for migration formatting

Refs: AIHCM-121
- Add authzed library for SpiceDB integration
- Add python-ulid for ULID support
- Create ResourceType, RelationType, Permission enums (using Group not Team)
- Create AuthorizationProvider protocol for swappable implementations
- Implement SpiceDBClient with async methods for relationships and permissions
- Create SpiceDB schema (.zed) with Tenant→Workspace→Group hierarchy
- Create AuthorizationProbe for domain-oriented observability
- Move ObservationContext to shared_kernel (fix architectural boundary)
- Add 35 unit tests for types and probes
- All 410 tests passing

Refs: AIHCM-122
Resolved conflicts in authorization files by accepting remote changes:
- shared_kernel/authorization/types.py (docstring fix)
- shared_kernel/authorization/spicedb/client.py (_parse_reference helper)
jsell-rh and others added 15 commits February 9, 2026 15:20
Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
…dd route documentation

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
…th TDD

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Fix workspace definition in schema.zed and ConfigMap to match the
relationships actually created by the IAM outbox translator.

Schema changes (workspace definition):
- Add `relation tenant: tenant` for organizational ownership
- Change `relation parent: tenant` to `relation parent: workspace` for hierarchy
- Rename `owner` to `admin` for consistency with tenant/group definitions
- Rename `permission delete` to `permission manage` for consistency
- Add Phase 3 comments for member/permission usage

ConfigMap changes (full sync with schema.zed):
- Apply all workspace definition fixes above
- Add missing `relation member: user` to tenant definition
- Fix tenant `permission view = admin` to `permission view = admin + member`
- Add missing `permission administrate = admin` to tenant definition
- Add missing `api_key` definition (was in schema.zed but not ConfigMap)
- Add future resource type comments

Inconsistencies found and documented:
1. Schema had `relation parent: tenant` but translator writes
   `workspace#tenant@tenant` (relation name 'tenant') and
   `workspace#parent@workspace` (parent type 'workspace')
2. ConfigMap was missing tenant `member` relation, `administrate`
   permission, and entire `api_key` definition
3. RelationType.WORKSPACE enum exists but is unused by any translator
4. Permission.DELETE enum value corresponds to removed `permission delete`
   in workspace; may need cleanup in Phase 3
5. Schema `owner` relation on workspace renamed to `admin` to align with
   tenant and group naming conventions

All 970 unit tests pass (3 pre-existing SSL failures unrelated).

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Remove RelationType.WORKSPACE and Permission.DELETE which have no usage
in the codebase. Neither value corresponds to any relation or permission
in the current SpiceDB schema. They can be re-added when future resource
types (knowledge_graph, data_source) are implemented.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
…tion

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
…nshift-hyperfleet/kartograph into jsell/feat/AIHCM-146-workspace-authz
Integrates workspace lifecycle with tenant lifecycle (AIHCM-147):

- TenantService constructor now accepts workspace_repository
- create_tenant auto-provisions a root workspace using settings
  default_workspace_name, falling back to tenant name
- delete_tenant cascades to workspaces before groups/API keys,
  ensuring WorkspaceDeleted events are emitted for SpiceDB cleanup
- TenantServiceProbe updated with workspaces_count parameter
- Dependency injection wired up with workspace repository
- 8 new unit tests covering create and delete workspace integration

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
The clean_iam_data fixture was missing workspace deletion, causing FK
constraint violations when tests created workspaces. Adds workspace
cleanup in correct order (children then roots) before tenant deletion,
and introduces a TestCleanupProbe following Domain-Oriented Observability
to replace silent exception swallowing with structured domain events.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
@jsell-rh jsell-rh self-assigned this Feb 10, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 10, 2026

Walkthrough

This PR transitions the authorization model from owner-based to admin-based permissions for workspaces, integrates workspace management into the tenant lifecycle, and adds bootstrap logic to ensure a default workspace on tenant creation. Changes include: updating SpiceDB schema definitions to replace owner relations with admin relations and adding tenant/parent workspace relations; adding workspace cascade deletion to TenantService prior to groups and API keys; introducing TenantBootstrapService for idempotent default tenant and workspace provisioning; modifying observability probes to track workspace counts and bootstrap events; removing WORKSPACE and DELETE enum members; and enhancing integration and unit test coverage with new fixtures and test scenarios for workspace API endpoints and cleanup procedures.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application Startup
    participant BS as TenantBootstrapService
    participant TR as TenantRepository
    participant WR as WorkspaceRepository
    participant DB as Database
    participant Probe as StartupProbe
    
    App->>BS: ensure_default_tenant_with_workspace(tenant_name, workspace_name)
    BS->>DB: begin transaction
    
    alt Tenant does not exist
        BS->>BS: _create_tenant_with_race_handling()
        BS->>TR: save(tenant)
        TR->>DB: INSERT INTO tenants
        DB-->>TR: tenant_id
        TR-->>BS: Tenant
        Probe->>Probe: default_tenant_bootstrapped()
    else Tenant exists (race condition)
        BS->>Probe: default_tenant_already_exists()
    end
    
    alt Root workspace does not exist
        BS->>BS: _ensure_root_workspace(tenant, workspace_name)
        BS->>WR: save(workspace)
        WR->>DB: INSERT INTO workspaces
        DB-->>WR: workspace_id
        WR-->>BS: Workspace
        Probe->>Probe: default_workspace_bootstrapped()
    else Root workspace exists
        BS->>Probe: default_workspace_already_exists()
    end
    
    BS->>DB: commit transaction
    DB-->>BS: success
    BS-->>App: Tenant with root workspace
Loading
sequenceDiagram
    participant Service as TenantService
    participant SessionMgr as Session Manager
    participant TR as TenantRepository
    participant WR as WorkspaceRepository
    participant GR as GroupRepository
    participant KR as ApiKeyRepository
    participant DB as Database
    participant Probe as TenantServiceProbe
    
    Service->>Service: delete_tenant(tenant_id)
    Service->>SessionMgr: begin transaction
    
    Service->>WR: get_by_tenant(tenant_id)
    WR->>DB: SELECT FROM workspaces
    DB-->>WR: workspaces[]
    WR-->>Service: workspaces[]
    
    Probe->>Probe: tenant_cascade_deletion_started(tenant_id, workspaces_count)
    
    loop For each workspace (depth-first by hierarchy)
        Service->>WR: delete(workspace_id)
        WR->>DB: DELETE FROM workspaces
    end
    
    Service->>GR: get_by_tenant(tenant_id)
    GR->>DB: SELECT FROM groups
    loop For each group
        Service->>GR: delete(group_id)
        GR->>DB: DELETE FROM groups
    end
    
    Service->>KR: get_by_tenant(tenant_id)
    KR->>DB: SELECT FROM api_keys
    loop For each API key
        Service->>KR: delete(api_key_id)
        KR->>DB: DELETE FROM api_keys
    end
    
    Service->>TR: delete(tenant_id)
    TR->>DB: DELETE FROM tenants
    
    Service->>SessionMgr: commit
    DB-->>Service: success
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

tracer-bullet

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(api.iam): auto tenant root workspace creation' is clear, concise, and directly describes the main feature added—automatic root workspace creation for tenants.
Docstring Coverage ✅ Passed Docstring coverage is 97.37% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch jsell/feat/AIHCM-147-auto-workspace

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/api/tests/integration/iam/conftest.py (2)

116-175: Hardcoded tables_cleaned count may be inaccurate.

The cleanup() function deletes from 6 tables (outbox, api_keys, groups, users, workspaces children, workspaces roots, tenants), but the count distinguishes "workspaces (children)" and "workspaces (roots)" as separate entries despite being the same table. Depending on how you want to count this, either adjust tables_cleaned=7 to match the 7 table_cleaned calls, or clarify that this represents unique tables (6).

Currently, you have 7 probe.table_cleaned(...) calls but tables_cleaned=7 happens to match this. The logic is fine as-is if counting probe calls, but consider adding a brief comment to clarify the counting rationale for future maintainers.


172-175: Consider tracking which table failed for better diagnostics.

The exception handler reports table_name="unknown", which loses diagnostic information. A minor improvement would track the current operation.

💡 Optional: Track current table for error reporting
     async def cleanup() -> None:
         """Perform cleanup with proper FK constraint ordering."""
         probe.cleanup_started(default_tenant_name=default_tenant_name)
+        current_table = "unknown"

         try:
             # Delete in FK-respecting order
+            current_table = "outbox"
             await async_session.execute(text("DELETE FROM outbox"))
             probe.table_cleaned("outbox")
             # ... similar for other tables
         except Exception as e:
-            probe.cleanup_failed(table_name="unknown", error=str(e))
+            probe.cleanup_failed(table_name=current_table, error=str(e))
             await async_session.rollback()
             raise
src/api/tests/unit/iam/application/test_tenant_bootstrap_service.py (1)

174-214: Consider adding a test for workspace creation race condition.

The tests cover tenant creation race conditions well, but there's no test for the IntegrityError handling in _ensure_root_workspace when two processes concurrently create the same root workspace. This would ensure the workspace race condition path (lines 185-198 in the service) is covered.

💡 Suggested test for workspace race condition
`@pytest.mark.asyncio`
async def test_handles_race_condition_on_workspace_creation(
    self, bootstrap_service, mock_tenant_repo, mock_workspace_repo, mock_probe
):
    """Test that race conditions during concurrent workspace creation are handled."""
    from datetime import UTC, datetime
    from sqlalchemy.exc import IntegrityError

    existing_tenant = Tenant(id=TenantId.generate(), name="default")
    concurrent_workspace = Workspace(
        id=WorkspaceId.generate(),
        tenant_id=existing_tenant.id,
        name="Root",
        parent_workspace_id=None,
        is_root=True,
        created_at=datetime.now(UTC),
        updated_at=datetime.now(UTC),
    )

    mock_tenant_repo.get_by_name = AsyncMock(return_value=existing_tenant)
    # First call returns None, second call (after IntegrityError) returns the concurrent workspace
    mock_workspace_repo.get_root_workspace = AsyncMock(
        side_effect=[None, concurrent_workspace]
    )
    mock_workspace_repo.save = AsyncMock(side_effect=IntegrityError("", "", Exception()))

    tenant = await bootstrap_service.ensure_default_tenant_with_workspace(
        tenant_name="default",
        workspace_name="Root",
    )

    assert tenant.id == existing_tenant.id
    assert mock_workspace_repo.get_root_workspace.call_count == 2
    mock_probe.default_workspace_already_exists.assert_called_once()

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Comment @coderabbitai help to get the list of available commands and usage tips.

…workspace creation

The walking skeleton bootstrap in main.py was creating the default tenant
directly via domain aggregates, bypassing TenantService which now auto-creates
root workspaces. This resulted in the default tenant having no root workspace.

Introduces TenantBootstrapService with minimal dependencies (no authz, no
group/API key repos) specifically for startup bootstrap operations. Updates
main.py to use this service, ensuring the default tenant gets its root
workspace just like tenants created via the API.

Also extends StartupProbe with workspace bootstrap observability methods.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/api/iam/application/services/tenant_service.py`:
- Around line 364-367: The loop that deletes workspaces in TenantService
currently iterates arbitrary order and can delete parents before children
causing FK errors; before calling workspace.mark_for_deletion() and
self._workspace_repository.delete(workspace) on items returned from
list_by_tenant(), topologically sort or compute depth for each workspace using
parent_workspace_id (e.g., build a parent->children map and perform a post-order
traversal or sort by descending depth so leaves come first) and then iterate the
sorted list to mark and delete; alternatively, if acceptable, change the DB FK
to ondelete="CASCADE" but prefer implementing the depth/topological ordering
inside the method that handles workspaces prior to deletion.

In `@src/api/tests/integration/iam/conftest.py`:
- Around line 136-158: The mypy error is due to using the generic Result return
type while accessing result.rowcount; fix by annotating the variable returned
from async_session.execute as sqlalchemy.engine.CursorResult so mypy knows
rowcount exists (import CursorResult and write e.g. result: CursorResult = await
async_session.execute(...)) for the three DELETE calls (the assignments that
feed probe.table_cleaned) or alternatively use typing.cast to cast the execute
return to CursorResult before accessing rowcount; reference
async_session.execute, Result/CursorResult, result.rowcount, and
probe.table_cleaned when making the change.

jsell-rh and others added 3 commits February 10, 2026 10:51
SQLAlchemy 2.0 autobegins a transaction on the first query. In
create_workspace(), queries were executed before session.begin() was
called, causing "A transaction is already begun on this Session" error.

Restructured create_workspace() to wrap all operations (validation
queries + save) inside a single session.begin() block, matching the
pattern used in TenantService.create_tenant().

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The parent_workspace_id foreign key should prevent deleting a parent
workspace while children exist. If the database has a different constraint
(e.g., SET NULL from an earlier state), this migration fixes it.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ldren

SQLAlchemy by default sets child foreign keys to NULL before deleting
a parent, which bypasses database RESTRICT constraints. Adding
passive_deletes=True to the child_workspaces relationship lets the
database handle FK constraint enforcement, correctly blocking deletion
of workspaces that have children.

Also adds integration tests for workspace deletion behavior.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@src/api/iam/application/services/tenant_bootstrap_service.py`:
- Around line 146-170: Wrap the call to
self._workspace_repository.save(workspace) in an IntegrityError handler: after
Workspace.create_root(...) and before emitting default_workspace_bootstrapped,
catch the DB integrity exception raised by
_workspace_repository.save(workspace), then re-query via
self._workspace_repository.get_root_workspace(tenant.id); if an
existing_workspace is returned call
self._probe.default_workspace_already_exists(...) with the existing workspace
details and return, otherwise re-raise the exception; keep existing successful
path that calls self._probe.default_workspace_bootstrapped(...) when save
succeeds.
- Around line 115-133: The save of the new Tenant must be isolated in a nested
transaction/savepoint so a DuplicateTenantNameError (from IntegrityError)
doesn't mark the outer transaction for rollback and cause PendingRollbackError
on the subsequent get_by_name(); modify the tenant creation block in
TenantBootstrapService (where Tenant.create and _tenant_repository.save are
called) to begin a nested transaction/savepoint (e.g., session.begin_nested() or
repository-provided savepoint context) around the save, commit or rollback only
that savepoint on success/error, catch DuplicateTenantNameError to rollback the
savepoint and then safely call _tenant_repository.get_by_name to re-query and
emit the same _probe.default_tenant_already_exists events (use the same
exception handling and probe calls as before).

In `@src/api/tests/integration/iam/test_workspace_api.py`:
- Around line 142-156: The test_cannot_delete_root_workspace reads resp.json()
from the GET to "/iam/workspaces" without checking the response code; add an
explicit assertion that the GET response status is 200 (e.g., assert
resp.status_code == 200) immediately after the async_client.get call before
calling resp.json(), so the test fails fast if the list call fails.
- Around line 117-140: In test_can_delete_workspace_without_children, the
initial GET to async_client.get("/iam/workspaces") doesn't assert the response
status before accessing resp.json(), so failures surface as JSON-shape errors;
add an explicit assertion that resp.status_code == 200 (or use
resp.is_successful equivalent) immediately after the GET call to validate the
request succeeded before using resp.json(), and keep this check near the
async_client.get call and before locating the root workspace variable.
🧹 Nitpick comments (1)
src/api/infrastructure/migrations/versions/fdd9505854c5_fix_workspace_parent_fk_to_use_restrict.py (1)

49-52: Consider making the no-op downgrade explicit or implementing proper reversal.

A silent no-op creates a mismatch between Alembic's state (migration reverted) and the actual database (RESTRICT still in place). If someone later re-runs the upgrade, drop_constraint will fail because the constraint still exists with the same name.

Consider either:

  1. Implementing the actual reversal to the previous behavior (even if undesirable), or
  2. Raising an error to make it explicit that downgrade is unsupported.
Option 2: Make unsupported downgrade explicit
 def downgrade() -> None:
-    """Revert to previous state (no change needed, RESTRICT is correct)."""
-    # No-op: we don't want to revert to a broken state
-    pass
+    """Downgrade is not supported for this migration.
+
+    RESTRICT is the correct behavior and reverting would restore a broken state.
+    """
+    raise NotImplementedError(
+        "Downgrade not supported: reverting to non-RESTRICT behavior is not allowed"
+    )

jsell-rh and others added 3 commits February 10, 2026 12:14
- TenantService: Sort workspaces by depth before deletion (children first)
  to avoid FK constraint violations
- TenantBootstrapService: Use savepoints (begin_nested) for tenant and
  workspace creation to avoid PendingRollbackError on race conditions
- conftest.py: Use cast(CursorResult, ...) for DELETE execute results
  to satisfy mypy rowcount access
- test_workspace_api.py: Add status code assertions before accessing
  resp.json() in deletion tests
- Migration downgrade: Raise NotImplementedError instead of no-op pass

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The migration fdd9505854c5 was unnecessary - the original workspaces
table migration (205809969bf4) already specified ON DELETE RESTRICT
for the parent_workspace_id FK. The actual fix for parent deletion
was adding passive_deletes=True to the SQLAlchemy relationship.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@jsell-rh jsell-rh enabled auto-merge (squash) February 10, 2026 18:56
@jsell-rh jsell-rh merged commit 11db83f into main Feb 10, 2026
10 checks passed
@jsell-rh jsell-rh deleted the jsell/feat/AIHCM-147-auto-workspace branch February 10, 2026 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant