diff --git a/hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md b/hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md
new file mode 100644
index 0000000..04d85f7
--- /dev/null
+++ b/hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md
@@ -0,0 +1,933 @@
+# Spike Report: HyperFleet E2E Test Automation Run Strategy
+
+**JIRA Story:** HYPERFLEET-532
+**Status:** Draft
+**Focus:** Deployment lifecycle management, resource isolation, and parallel Test Run execution safety
+
+---
+
+## 1. Problem Statement
+
+HyperFleet E2E testing validates system-level behavior across multiple cooperating components, including:
+
+- HyperFleet API
+- Sentinel
+- Adapter framework (multiple adapter types)
+- Messaging broker (Topics / Subscriptions)
+
+As E2E coverage expands and test pipelines begin executing in parallel, the current approach lacks a clearly defined **E2E test run strategy** to govern:
+
+- Deployment lifecycle ownership
+- Resource isolation boundaries
+- Race condition prevention in concurrent executions
+- Reliable cleanup and observability
+
+This results in:
+
+- Flaky test failures caused by shared resources
+- Unclear ownership of deployed components
+- Orphaned Kubernetes and broker resources
+- Limited scalability of parallel pipelines
+
+This spike defines a **comprehensive E2E test automation run strategy**, focusing on how tests are **deployed, isolated, coordinated, and cleaned up**, rather than on individual test case logic.
+
+---
+
+## 2. Goals and Non-Goals
+
+### 2.1 Goals
+
+This spike aims to define a strategy that:
+
+- Enables **safe parallel execution** of multiple Test Runs
+- Ensures **strong resource isolation** between test runs
+- Clearly defines **deployment lifecycle ownership**
+- Prevents race conditions by design
+- Supports **dynamic adapter deployment and removal** (hot-plugging)
+- Improves **debuggability and maintainability**
+- Establishes reusable patterns for future E2E expansion
+
+---
+
+### 2.2 Non-Goals
+
+This spike explicitly does **not** cover:
+
+- Individual test case implementation
+- CI/CD pipeline configuration
+- Performance or load testing considerations
+- External environments not related to HyperFleet (such as cloud resources)
+
+---
+
+## 3. Core Design Principles
+
+### 3.1 Test Run as the Primary Isolation Unit
+
+All test infrastructure, configuration, and resources are scoped to a **single Test Run**.
+
+A Test Run is the smallest unit of:
+
+- Isolation
+- Resource ownership
+
+---
+
+### 3.2 Explicit Lifecycle Ownership
+
+Every component participating in E2E testing must have clearly defined ownership for:
+
+- Creation
+- Runtime management
+- Teardown
+
+Implicit or shared ownership is considered a design flaw.
+
+---
+
+### 3.3 Isolation Over Optimization
+
+When trade-offs exist, this strategy prioritizes:
+
+> Reliability, isolation, and debuggability over startup speed or resource reuse.
+
+---
+
+## 4. E2E Test Run Model
+
+### 4.1 Test Run Definition
+
+A **Test Run** represents one or more E2E test cases executed sequentially as a single unit.
+
+Each Test Run has:
+
+- A globally unique **Test Run ID**
+- A well-defined lifecycle: `setup → execute → teardown`
+- Exclusive ownership of all resources it creates
+
+---
+
+### 4.2 Test Run Identification
+
+Each Test Run generates a unique identifier (Test Run ID) derived from:
+
+- A CI-provided environment variable (when available)
+- A high-resolution Unix timestamp
+- A random suffix for additional collision resistance
+
+**Example**: `time.Now().UnixNano()` yields a 19-digit number such as `1738152345678901234`, producing the namespace `e2e-1738152345678901234`.
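+
+The derivation above can be sketched in Go (the `E2E_TEST_RUN_ID` variable name and the suffix format are illustrative assumptions, not a confirmed contract):
+
+```go
+package main
+
+import (
+	"fmt"
+	"math/rand"
+	"os"
+	"time"
+)
+
+// newTestRunID derives the Test Run ID: prefer a CI-provided identifier
+// when available, otherwise combine a nanosecond timestamp with a short
+// random suffix for extra collision resistance.
+func newTestRunID() string {
+	if id := os.Getenv("E2E_TEST_RUN_ID"); id != "" { // hypothetical CI variable
+		return id
+	}
+	return fmt.Sprintf("%d-%04d", time.Now().UnixNano(), rand.Intn(10000))
+}
+
+func main() {
+	fmt.Println("namespace: e2e-" + newTestRunID())
+}
+```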
+
+The Test Run ID is consistently applied to:
+
+- Kubernetes Namespaces
+- Resource names
+- Broker Topics and Subscriptions
+- Labels and annotations
+
+Namespaces are additionally labeled to indicate execution context:
+
+- Label `ci` distinguishes CI pipeline runs (`yes`) from local developer runs (`no`)
+- Enables context-appropriate retention policies
+- Does not affect test execution behavior
+
+This ensures **traceability** and **collision avoidance**.
+
+---
+
+### 4.3 Test Run Lifecycle
+
+Each Test Run follows a well-defined lifecycle:
+
+```
+Create Namespace
+ ↓
+Deploy Infrastructure
+ ↓
+Infrastructure Ready
+ ↓
+Execute Test Suites
+ ↓
+Cleanup
+```
+
+**Infrastructure Deployment** includes:
+- Database (PostgreSQL, deployed with API)
+- API and Sentinel
+- Broker connectivity
+- Custom Resource Definitions (CRDs)
+- Fixture Adapter
+
+**Infrastructure Ready** means:
+- All infrastructure components are healthy
+- Fixture Adapter is operational
+- Test suites can execute independently
+- No functional adapters are deployed yet
+
+**Test Suite Execution**:
+- Suites execute sequentially
+- Each suite may deploy/remove functional adapters as needed
+- Environment state persists across suites within the same Test Run
+
+**Cleanup**:
+- Delete cloud messaging resources (topics/subscriptions) tagged with test_run_id via cloud CLI
+- Uninstall infrastructure components via helm
+- Delete namespace
+- See Section 9 for detailed cleanup and retention policy
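+
+This lifecycle can be sketched as an orchestration function that guarantees cleanup runs even when setup or suite execution fails (the stage wiring below is illustrative; real stages would deploy helm charts and run suites):
+
+```go
+package main
+
+import (
+	"errors"
+	"fmt"
+)
+
+// runTestRun drives the Test Run lifecycle. Teardown is deferred so
+// cleanup runs even if deployment or suite execution returns an error.
+func runTestRun(setup, execute, teardown func() error) (err error) {
+	defer func() {
+		if terr := teardown(); terr != nil && err == nil {
+			err = terr
+		}
+	}()
+	if err = setup(); err != nil { // create namespace + deploy infrastructure
+		return err
+	}
+	return execute() // run test suites sequentially
+}
+
+func main() {
+	var steps []string
+	record := func(name string) func() error {
+		return func() error { steps = append(steps, name); return nil }
+	}
+	fail := func() error { steps = append(steps, "execute"); return errors.New("suite failed") }
+	_ = runTestRun(record("setup"), fail, record("teardown"))
+	fmt.Println(steps) // teardown runs despite the suite failure
+}
+```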
+
+---
+
+### 4.4 Fixture Adapter
+
+**Problem**: Core Suite needs to validate HyperFleet framework behavior (event flow, status aggregation, error handling). Functional adapters have external dependencies (cloud APIs, GCP projects) and cannot provide the controlled, repeatable scenarios needed for framework testing. What type of adapter should Core Suite use?
+
+**Decision**: Fixture Adapter in dedicated repository
+
+Build a test-specific adapter (`adapter-fixture` repository) based on hyperfleet-adapter framework that enables controlled testing of framework behaviors without external dependencies.
+
+**Rationale**:
+- **No cloud resources needed**: Core Suite tests framework data flow, doesn't need GCP projects, AWS accounts, or cloud credentials
+- **No auth configuration needed**: Eliminates setup complexity and credential management
+- **Error injection support**: Can simulate adapter failures, delays, and error conditions for framework testing
+- **Fast execution**: No external dependencies enables fast, stable, reproducible tests
+- **Real adapter framework**: Uses actual hyperfleet-adapter framework mechanisms (preconditions, resource management, status reporting)
+
+---
+
+#### 4.4.1 Framework Behaviors to Test
+
+Core Suite validates framework-level behaviors:
+
+| Framework Behavior | What Core Suite Validates | Fixture Adapter Role |
+|-------------------|---------------------------|---------------------|
+| **Event flow** | API → Sentinel → Broker → Adapter → Status update | Subscribe to events, report status back |
+| **Status aggregation** | Framework merges adapter conditions into resource status | Report different condition combinations (Applied, Available, Health) |
+| **Async processing** | Framework waits for adapter status reporting | Introduce controllable delays before status reporting |
+| **Error handling** | Framework handles adapter failures | Report failure conditions (Applied=False, Available=False) |
+| **Concurrent processing** | Framework handles multiple resources | Process multiple events in parallel |
+
+---
+
+#### 4.4.2 Design Approach
+
+**Repository**: `openshift-hyperfleet/adapter-fixture`
+
+**Based on hyperfleet-adapter framework**:
+- Uses real adapter mechanisms (preconditions, resources, post-processing, status reporting)
+- Subscribes to broker events
+- Reports standard status conditions (Applied, Available, Health)
+- Deployed once per Test Run as infrastructure component
+
+**Events consumed**:
+- `cluster.created` - New cluster created via API
+- `nodepool.created` - New nodepool created
+
+**Status conditions reported**:
+- **Applied**: Resources created successfully? (True/False)
+- **Available**: Workload completed successfully? (True/False)
+- **Health**: Adapter operating normally? (True/False)
+
+**Control modes provided**:
+- **Immediate success**: Report success immediately (test basic event flow)
+- **Delayed success**: Delay N seconds, then report success (test async status aggregation)
+- **Failure**: Report failure immediately (test error handling)
+- **Transient failure**: Fail N times, then succeed (test retry logic)
+
+**Test control mechanism**:
+- **adapter-fixture provides**: Controllable behavior mechanism per resource (e.g., delay before status report, report failure conditions)
+ - Implementation approach: resource labels (HyperFleet API supports labels field for metadata)
+ - Constraint: Must not require adapter reconfiguration or restart between tests
+- **e2e tests use**: Create resources with control labels to trigger specific Fixture Adapter behaviors for testing framework responses
+
+---
+
+#### 4.4.3 Complementary Testing Strategy
+
+**Fixture Adapter + Core Suite**:
+- **What**: Framework data flow validation (API → Sentinel → Broker → Adapter → Status reporting)
+- **Why Fixture**: No cloud resources needed, no auth configuration needed, supports error injection
+- **Focus**: Event flow, status aggregation, error handling
+
+**Functional Adapters + Adapter Suite**:
+- **What**: Adapter implementation validation (configuration loading, K8s resource creation, status reporting)
+- **Why Functional**: Tests real adapter logic with actual hyperfleet-adapter framework
+- **Focus**: Adapter creates correct K8s resources (Job, Configmap, Namespace, Manifest, etc.)
+
+---
+
+#### 4.4.4 Implementation Priority
+
+**Fixture Adapter is a new component** that requires design and implementation. To avoid blocking e2e testing progress, a phased approach is recommended:
+
+**Phase 1 (Immediate): Use adapter-landing-zone for happy-path Core Suite**
+- **What**: Use existing adapter-landing-zone as temporary substitute for basic framework testing
+- **Coverage**: Happy-path event flow (API → Sentinel → Broker → Adapter → Status reporting)
+- **Benefits**:
+ - No new component development required
+ - Tests real adapter framework mechanisms
+ - Creates observable K8s resources (Namespace, ServiceAccount)
+ - Can use RabbitMQ instead of GCP Pub/Sub (eliminates cloud resource dependency)
+- **Limitations**:
+ - Still requires K8s cluster + kubeconfig (auth configuration)
+ - Cannot test error injection scenarios (failure handling, retry logic)
+ - Creates real resources in test cluster
+
+**Phase 2 (Future): Implement Fixture Adapter for comprehensive Core Suite**
+- **What**: Build dedicated adapter-fixture repository
+- **Coverage**: Complete framework behavior testing (success, error handling, retry logic, async processing)
+- **Benefits**:
+ - No cloud resources needed
+ - No auth configuration needed
+ - Full error injection support
+ - Comprehensive framework validation
+
+**Rationale for phased approach**:
+- adapter-landing-zone provides immediate value for basic framework testing
+- Fixture Adapter implementation can proceed in parallel without blocking e2e progress
+- Phase 1 establishes Core Suite structure and CI integration
+- Phase 2 expands coverage to error scenarios and removes remaining dependencies
+
+---
+
+## 5. Deployment Lifecycle Strategy
+
+### 5.1 One Namespace per Test Run
+
+Each Test Run is assigned a **dedicated Kubernetes Namespace**.
+
+This Namespace serves as the hard isolation boundary for:
+
+- API
+- Sentinel
+- Adapters
+- Supporting services (databases, brokers, etc.)
+
+**Rationale:**
+
+- Eliminates cross-test interference
+- Simplifies cleanup semantics
+- Improves debugging clarity
+- Avoids complex naming or locking schemes
+
+---
+
+### 5.2 Namespace Naming Convention
+
+Namespace names follow a consistent pattern for operational clarity:
+
+```
+e2e-{TEST_RUN_ID}
+```
+
+**Components**:
+- `e2e-`: Prefix indicating E2E test resources
+- `{TEST_RUN_ID}`: Unique test run identifier
+
+**Rationale**:
+- Test Run ID enables correlation of resources across test runs
+- Operational teams can identify E2E namespaces without inspecting labels
+
+---
+
+### 5.3 Component Lifecycle Ownership
+
+| Component | Lifecycle Owner | Scope | Notes |
+|--------------------|------------------|--------------|-------|
+| Namespace | Test Framework | Per Test Run | |
+| API | Test Framework | Per Test Run | |
+| Sentinel | Test Framework | Per Test Run | |
+| Fixture Adapter | Test Framework | Per Test Run | Infrastructure component |
+| Functional Adapter | Test Suite | Suite-scoped | Dynamically managed |
+| Broker Resources | Adapter/Sentinel | Per Test Run | |
+
+**Rule:**
+No component may create resources outside its Test Run Namespace unless those resources carry Test Run-level isolation (e.g., the Test Run ID in their names or labels).
+
+---
+
+### 5.4 Resource Labeling Strategy
+
+#### 5.4.1 Required Labels
+
+All E2E test namespaces must carry exactly three labels:
+
+1. **`ci`**: Execution context (`yes` | `no`)
+2. **`test-run-id`**: Test Run identifier
+3. **`managed-by`**: Ownership marker (`e2e-test-framework`)
+
+**Rationale**:
+- `ci`: Enables context-appropriate retention policies
+- `test-run-id`: Enables resource correlation and traceability
+- `managed-by`: Standard Kubernetes ownership marker
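+
+A minimal sketch of the naming and labeling helpers implied above (function names are illustrative):
+
+```go
+package main
+
+import "fmt"
+
+// e2eNamespaceLabels builds the three required labels for a Test Run
+// namespace: ci, test-run-id, and managed-by.
+func e2eNamespaceLabels(testRunID string, ci bool) map[string]string {
+	ciVal := "no"
+	if ci {
+		ciVal = "yes"
+	}
+	return map[string]string{
+		"ci":          ciVal,
+		"test-run-id": testRunID,
+		"managed-by":  "e2e-test-framework",
+	}
+}
+
+// e2eNamespaceName applies the e2e-{TEST_RUN_ID} naming convention.
+func e2eNamespaceName(testRunID string) string {
+	return "e2e-" + testRunID
+}
+
+func main() {
+	fmt.Println(e2eNamespaceName("1738152345678901234"))
+	fmt.Println(e2eNamespaceLabels("1738152345678901234", true))
+}
+```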
+
+---
+
+## 6. Resource Isolation Strategy
+
+### 6.1 Kubernetes Resource Isolation
+
+Isolation is achieved via:
+
+- Namespace-per-Test-Run
+- Consistent `test-run-id` labeling
+- Optional but recommended `ResourceQuota` and `LimitRange`
+
+This prevents:
+
+- Pod name collisions
+- Service discovery conflicts
+- Cross-test communication
+
+---
+
+### 6.2 Messaging and Broker Isolation
+
+Messaging resources (Topics / Subscriptions) are isolated using the Test Run ID.
+
+Common patterns include:
+
+- Run-scoped Topics
+- Run-scoped Subscriptions
+- Adapter-owned Subscription lifecycles
+
+This avoids:
+
+- Cross-test event delivery
+- Subscription reuse race conditions
+- Message leakage between runs
+
+**Cloud Resource Cleanup**:
+
+For cloud messaging resources (e.g., GCP Pub/Sub Topics and Subscriptions), the teardown phase explicitly deletes resources tagged with `test_run_id`:
+
+- Resources are tagged/labeled with Test Run ID during creation
+- Teardown script calls cloud CLI to delete tagged resources (e.g., `gcloud pubsub topics/subscriptions delete`)
+- Namespace deletion alone does not clean up cloud resources
+- This ensures no orphaned cloud resources remain after test completion
+
+---
+
+## 7. Race Condition Prevention
+
+Race conditions are prevented through **architectural isolation**, not runtime locking.
+
+### 7.1 Unique Resource Identification
+
+All externally visible resources include the Test Run ID in:
+
+- Names
+- Labels
+- Broker identifiers
+
+This guarantees uniqueness even under maximum concurrency.
+
+---
+
+### 7.2 No Shared Mutable State
+
+The strategy explicitly avoids:
+
+- Shared Namespaces
+- Shared Topics or Subscriptions
+- Shared databases
+- Shared API instances
+
+Shared mutable state is the primary source of E2E race conditions.
+
+---
+
+### 7.3 Parallel Test Run Execution Model
+
+Parallel pipelines are safe because:
+
+- Each Test Run executes within a sealed resource boundary
+- No global locks are required
+- Failures are contained within a single Namespace
+
+---
+
+## 8. Test Scenario Organization
+
+### 8.1 Lifecycle Management Model
+
+Test infrastructure is managed at the **Test Run level**, not per test case.
+
+- Infrastructure is deployed once per Test Run
+- All test suites share the same environment
+- Test cases focus on validation, not deployment
+
+This ensures:
+- Stable environment for workflow validation
+- Reduced setup overhead
+- Clear separation between infrastructure and behavior testing
+
+---
+
+### 8.2 Test Suite Types
+
+Test suites represent **validation focus**, not environment configurations.
+
+#### 8.2.1 Core Suite
+
+Validates HyperFleet framework behavior using Fixture Adapter.
+
+**Purpose**: Fast, stable testing of framework logic without external dependencies.
+
+**Environment**:
+- Core components (API, Sentinel, Broker)
+- Fixture Adapter with label-driven behavior (see Section 4.4 for control modes)
+- No functional adapters or real cloud services
+
+**Validates** (framework behavior):
+- **Event flow**: API → Sentinel → Broker → Adapter → API
+- **Async status aggregation**: Framework waits for adapter responses (test with Fixture delay modes)
+- **Error handling**: Framework handles adapter failures (test with Fixture failure modes)
+- **Retry logic**: Framework retries failed operations (test with Fixture transient-failure modes)
+- **Status reconciliation**: Framework merges adapter conditions into resource status
+- **Concurrent processing**: Framework handles multiple resources in parallel
+- **Resource lifecycle**: Cluster and NodePool create/update/delete workflows
+
+**Test Approach**:
+
+Tests control Fixture Adapter behavior via resource labels to validate framework responses:
+
+| Framework Behavior | Resource Labels | Validation |
+|-------------------|-----------------|------------|
+| Basic data flow | `behavior: immediate-success` | Cluster reaches Ready state |
+| Async aggregation | `behavior: delayed-success`<br>`delay-seconds: 30` | Framework waits 30s, then aggregates status |
+| Error handling | `behavior: failure`<br>`failure-reason: ValidationFailed` | Cluster enters Failed state with correct reason |
+| Retry logic | `behavior: transient-failure`<br>`failure-count: 3` | Framework retries 3 times, then succeeds |
+| Timeout handling | `behavior: timeout` | Framework times out after configured duration |
+
+**Example Flow**:
+
+```
+1. Test creates Cluster via API with labels: fixture.control/mode=delayed-success, fixture.control/delay-seconds=30
+2. API persists Cluster (including labels)
+3. Sentinel polls, detects new Cluster, publishes event
+4. Fixture Adapter consumes event, reads labels, waits 30s
+5. Fixture Adapter reports success status to API
+6. API updates Cluster status
+7. Test validates: Cluster phase = Ready (after ~30s)
+```
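+
+A test helper for the control labels in the flow above might look like this (a sketch; the `fixture.control/*` key names follow the example flow and are not a finalized contract):
+
+```go
+package main
+
+import (
+	"fmt"
+	"strconv"
+)
+
+// fixtureControlLabels builds the labels a test attaches to a resource
+// to drive Fixture Adapter behavior. A delay label is only emitted when
+// a positive delay is requested.
+func fixtureControlLabels(mode string, delaySeconds int) map[string]string {
+	labels := map[string]string{"fixture.control/mode": mode}
+	if delaySeconds > 0 {
+		labels["fixture.control/delay-seconds"] = strconv.Itoa(delaySeconds)
+	}
+	return labels
+}
+
+func main() {
+	fmt.Println(fixtureControlLabels("delayed-success", 30))
+}
+```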
+
+**Characteristics**:
+- ✅ Fast execution: No external dependencies
+- ✅ Stable: Infrastructure never reconfigured, 100% reproducible
+- ✅ Comprehensive: Tests all framework behaviors via label-driven Fixture Adapter
+
+---
+
+#### 8.2.2 Adapter Suite
+
+Validates functional adapter deployment, implementation, and lifecycle management.
+
+**Purpose**: Test real adapter functionality with complete deployment → function → cleanup validation.
+
+**Environment**:
+- Core components deployed
+- Functional adapters hot-plugged with **flexible deployment granularity** (managed by test groups or individual test cases)
+
+**Validates**:
+- Adapter deployment and configuration loading
+- Adapter logic (K8s resource creation: Job, ConfigMap, etc.)
+- Error handling and retry logic
+- Status reporting (conditions, reasons, messages)
+- Adapter removal and cleanup completeness
+
+**Adapter Management Decision**:
+
+We evaluated two deployment granularities for managing adapter lifecycle:
+
+| Approach | Adapter Scope | When Deployed | When Removed | Trade-offs |
+|----------|---------------|---------------|--------------|------------|
+| **Test Group-level** (Ordered + BeforeAll/AfterAll) | Shared within a test group | Once per test group | After all tests in group | ✅ Faster (deploy once)<br>✅ Good for read-only tests<br>⚠️ Tests in group share adapter state |
+| **Test Case-level** (BeforeEach/AfterEach) | Isolated per individual test | Before each test case | After each test case | ✅ Complete isolation<br>✅ No state pollution<br>❌ Slower (deploy per test) |
+
+**Decision: Support both granularities within Adapter Suite**
+
+**Rationale**:
+- Different test types have different isolation needs
+- Read-only validations (e.g., check DNS records) benefit from Test Group-level sharing
+- State-changing tests (e.g., error injection, config changes) require Test Case-level isolation
+- Mixed approach optimizes for both speed and test quality
+
+**Test Organization**:
+- Test groups use scoped `Describe` blocks to control adapter lifecycle
+- Test Group-level: `Describe` + `Ordered` + `BeforeAll`/`AfterAll`
+- Test Case-level: `Describe` + `BeforeEach`/`AfterEach`
+- Multiple test groups can coexist with different strategies
+
+---
+
+### 8.3 Suite Execution Order
+
+**Recommended Order**:
+
+Within a Test Run, suites typically execute in this order:
+
+1. **Core Suite** - Validates framework data flow
+2. **Adapter Suite** - Validates functional adapter implementation
+
+**Rationale**:
+- Core Suite runs first to validate infrastructure readiness
+- Core Suite provides fast feedback (framework-level issues)
+- Adapter Suite tests functional adapter implementation after the infrastructure is validated
+
+**Flexibility**:
+- Suites can run independently if infrastructure is ready
+- Multiple Test Runs can execute in parallel, each isolated in separate namespace (e2e-{TEST_RUN_ID})
+
+---
+
+### 8.4 Test Organization Guidelines
+
+**Problem**: When should tests use Test Group-level vs Test Case-level adapter deployment?
+
+**Decision Matrix**:
+
+| Test Characteristics | Recommended Strategy | Rationale |
+|---------------------|---------------------|-----------|
+| Read-only validations (check records, metrics, status) | Test Group-level | Tests don't interfere, share setup cost |
+| Independent functional checks (no state changes) | Test Group-level | Can reuse adapter safely |
+| Error injection scenarios | Test Case-level | State contamination risk, need fresh adapter |
+| Configuration variations | Test Case-level | Different adapter configs required |
+| State-changing operations (update, delete) | Test Case-level | Side effects prevent reuse |
+
+**Conceptual Structure**:
+
+```go
+// Sketch using Ginkgo v2. Adapter, deployAdapter, removeAdapter,
+// createTestData, and cleanupTestData are placeholder helpers.
+var _ = Describe("DNS Adapter Suite", func() {
+
+	// Test Group 1: shared adapter (Test Group-level)
+	Describe("Deployment Validation", Ordered, func() {
+		var adapter *Adapter
+
+		BeforeAll(func() { adapter = deployAdapter() }) // deploy once for the group
+		It("deploys the adapter correctly", func() { /* ... */ })
+		It("loads the adapter configuration", func() { /* ... */ })
+		It("registers the broker subscription", func() { /* ... */ })
+		AfterAll(func() { removeAdapter(adapter) })
+	})
+
+	// Test Group 2: shared adapter (Test Group-level)
+	Describe("Functional Tests", Ordered, func() {
+		var adapter *Adapter
+
+		BeforeAll(func() {
+			adapter = deployAdapter() // deploy adapter and create shared test data
+			createTestData()
+		})
+		It("creates the DNS record", func() { /* ... */ })
+		It("reports status conditions", func() { /* ... */ })
+		It("exposes metrics", func() { /* ... */ })
+		AfterAll(func() {
+			cleanupTestData() // clean shared data, then remove adapter
+			removeAdapter(adapter)
+		})
+	})
+
+	// Test Group 3: isolated adapters (Test Case-level)
+	Describe("Error Scenarios", func() {
+		var adapter *Adapter
+
+		BeforeEach(func() { adapter = deployAdapter() }) // fresh adapter per test
+		It("handles an invalid domain", func() { /* ... */ })
+		It("retries transient failures", func() { /* ... */ })
+		AfterEach(func() { removeAdapter(adapter) })
+	})
+})
+```
+
+**Key Principles**:
+- **Test Group-level** (Describe + Ordered + BeforeAll/AfterAll): Group related read-only tests within a focused Describe block
+- **Test Case-level** (Describe + BeforeEach/AfterEach): Isolate state-changing or error tests
+- **Scoped test groups**: Each Describe block defines a focused scope for adapter lifecycle management
+
+---
+
+### 8.5 State Management and Suite Independence
+
+**State Ownership Model**:
+
+Test Run state is categorized by lifetime and ownership:
+
+| State Type | Lifetime | Owner | Examples |
+|------------|----------|-------|----------|
+| Infrastructure State | Test Run | Test Framework | Namespace, API, Sentinel, Fixture Adapter |
+| Adapter State | Test Group or Test Case | Test Group (Describe block) | Functional adapter pods, subscriptions |
+| Test Data | Test Case | Test Case | Clusters, NodePools, test-specific resources |
+
+**Isolation Principles**:
+
+1. **Infrastructure persists** - Core components remain active throughout the Test Run
+2. **Adapters are ephemeral** - Created and removed by test groups (Describe blocks) or individual test cases
+3. **Test data is scoped** - Each test case manages its own test resources
+4. **Unique naming prevents collision** - Resources use unique identifiers to avoid cross-test interference
+
+**Suite Independence**:
+
+- Suites can run independently if infrastructure is ready
+- Suite execution strategy:
+ - **Fail-fast**: Core suite failures (API, Sentinel, Broker) block dependent suites
+  - **Fail-tolerant**: Failures in independent suites are collected without blocking the remaining suites
+ - Ensures early termination on infrastructure failures while maximizing test coverage
+- Each suite validates its prerequisites at startup
+
+**Cleanup Responsibility**:
+
+- Test cases and suites clean their own state (adapters, test data)
+- Infrastructure cleanup handled by Test Framework (see Section 9 for retention policy)
+
+---
+
+## 9. Resource Management and Cleanup
+
+### 9.1 Cleanup Ownership Model
+
+Cleanup is a shared responsibility between two actors:
+
+1. **E2E Test Flow**: Responsible for setting the retention policy and promptly deleting namespaces of passed Test Runs
+2. **Reconciler Job**: Responsible for enforcing TTL and handling edge cases
+
+No single component owns all cleanup. This separation prevents single points of failure.
+
+---
+
+### 9.2 Retention Policy
+
+#### 9.2.1 Default Retention (Safe Fallback)
+
+All namespaces are annotated with a default retention policy at creation:
+
+- **Default TTL**: 2 hours from creation
+- **Purpose**: Safety net if E2E flow is interrupted or fails before updating retention
+- Ensures orphaned namespaces are automatically cleaned up
+
+#### 9.2.2 Test Result-Based Retention
+
+E2E flow updates namespace retention annotations based on test outcome:
+
+| Test Result | CI Context | Local Context | Retention |
+|-------------|------------|---------------|-----------|
+| **Passed** | Any | Any | 10 minutes |
+| **Failed** | `ci=yes` | - | 24 hours |
+| **Failed** | - | `ci=no` | 6 hours |
+
+**Rationale**:
+- Passed tests have minimal debugging value → short retention conserves quota
+ - 10-minute window prevents race conditions between E2E flow and reconciler deletion
+- Failed tests need retention for post-mortem
+ - CI (24h): Global team across time zones
+ - Local (6h): Developer actively investigating
+- Default 2h retention: Covers interrupted E2E flows
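+
+The retention table above reduces to a small policy function (a sketch; the durations mirror the table):
+
+```go
+package main
+
+import (
+	"fmt"
+	"time"
+)
+
+// retentionFor maps a test outcome and execution context to the
+// namespace retention duration: passed runs are cleaned quickly, failed
+// CI runs are kept longest for cross-timezone post-mortems.
+func retentionFor(passed, ci bool) time.Duration {
+	switch {
+	case passed:
+		return 10 * time.Minute
+	case ci:
+		return 24 * time.Hour
+	default:
+		return 6 * time.Hour
+	}
+}
+
+func main() {
+	fmt.Println(retentionFor(false, true)) // failed in CI -> 24 hours
+}
+```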
+
+#### 9.2.3 Retention Override
+
+Environment-based configuration allows overriding default retention policy.
+
+**Use Cases**:
+- Extended debugging sessions
+- Demonstration environments
+- Manual investigation
+
+Override values are stored in namespace annotations for reconciler consumption.
+
+---
+
+### 9.3 Cleanup Reconciliation
+
+#### 9.3.1 Reconciler Responsibilities
+
+A scheduled reconciler job enforces TTL-based cleanup:
+
+- Runs periodically (frequency configurable, typically 30 minutes)
+- Scopes to namespaces labeled as E2E test framework managed
+- Deletes namespaces based on retention annotation expiry
+
+**Simplicity Principle**: Reconciler does not distinguish between:
+- Normal vs orphaned namespaces
+- CI vs local runs
+
+All policy decisions are encoded in namespace annotations. Reconciler is stateless.
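+
+The reconciler's expiry decision can be sketched as a pure function over the namespace creation time and its retention annotation (the annotation value format, a Go duration string such as `2h` or `10m`, is an assumption):
+
+```go
+package main
+
+import (
+	"fmt"
+	"time"
+)
+
+// expired reports whether a namespace should be deleted: its age has
+// exceeded the retention duration stored in its annotation.
+func expired(createdAt time.Time, retention string, now time.Time) (bool, error) {
+	ttl, err := time.ParseDuration(retention) // e.g. "2h", "10m"
+	if err != nil {
+		return false, err
+	}
+	return now.Sub(createdAt) > ttl, nil
+}
+
+func main() {
+	created := time.Now().Add(-3 * time.Hour)
+	ok, _ := expired(created, "2h", time.Now())
+	fmt.Println(ok) // a 3h-old namespace exceeds the default 2h TTL
+}
+```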
+
+---
+
+### 9.4 Orphaned Resource Handling
+
+**Definition**: Orphaned resources occur when E2E flow is interrupted before setting final retention.
+
+**Handling**:
+- No special orphan detection needed
+- Default 2-hour retention set at creation covers this case
+- Reconciler treats orphans identically to any expired namespace
+
+**Monitoring**: High orphan rate (inferred from default retention deletions) indicates E2E flow reliability issues.
+
+---
+
+### 9.5 Cloud Resource Cleanup
+
+**Scope**: Cloud messaging resources (GCP Pub/Sub Topics and Subscriptions) require explicit cleanup beyond namespace deletion.
+
+**Cleanup Process**:
+
+1. **Tagging**: All cloud resources created during Test Run are tagged/labeled with `test_run_id`
+2. **Teardown**: E2E cleanup script explicitly deletes cloud resources via cloud CLI:
+   - List resource names filtered by the `test_run_id` label (e.g., `gcloud pubsub topics list --filter="labels.test_run_id=1738152345678901234" --format="value(name)"`)
+   - Delete each matching Topic and Subscription by name (e.g., `gcloud pubsub topics delete <name>`; the `delete` commands accept explicit names, not a `--filter` flag)
+3. **Reconciler**: Periodically scans for orphaned cloud resources (tagged but older than retention TTL) and deletes them
+
+**Why Explicit Cleanup**:
+- Kubernetes namespace deletion does not remove cloud resources
+- Cloud resources incur costs and quota consumption
+- Orphaned cloud resources can accumulate over time
+
+**Implementation Note**: Cloud resource cleanup happens before namespace deletion in teardown sequence to ensure cleanup script has cluster access.
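+
+Because `gcloud pubsub topics delete` accepts explicit names rather than a `--filter` flag, a teardown helper might build the list-then-delete invocations like this (a sketch showing argument construction only; invoking the cloud CLI is omitted):
+
+```go
+package main
+
+import "fmt"
+
+// listTopicsArgs builds the gcloud arguments that list topic names
+// labeled with the given Test Run ID.
+func listTopicsArgs(testRunID string) []string {
+	return []string{"pubsub", "topics", "list",
+		"--filter=labels.test_run_id=" + testRunID,
+		"--format=value(name)"}
+}
+
+// deleteTopicArgs builds the gcloud arguments that delete one topic by name.
+func deleteTopicArgs(name string) []string {
+	return []string{"pubsub", "topics", "delete", name}
+}
+
+func main() {
+	fmt.Println(listTopicsArgs("1738152345678901234"))
+	fmt.Println(deleteTopicArgs("projects/p/topics/t"))
+}
+```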
+
+---
+
+## 10. Testing Infrastructure Considerations
+
+### 10.1 Image Build and Distribution
+
+**Image Architecture**:
+
+Test infrastructure uses two container images with distinct responsibilities:
+
+1. **Cloud Platform Tools** - Target cluster authentication
+ - Contains cloud provider CLIs (gcloud, aws, etc.)
+ - Runs as init container to generate cluster credentials
+ - Low change frequency (rebuilt only when cloud tooling updates)
+
+2. **E2E Test Framework** - Infrastructure deployment and test execution
+ - Contains helm CLI, test code, and deployment charts
+ - Manages entire Test Run lifecycle
+ - High change frequency (rebuilt on test code or chart changes)
+
+**Rationale**:
+- Adapter hot-plugging requires deployment tooling in test execution context
+- Infrastructure deployment is orchestrated by the Test Framework (Section 5.3)
+- Separation by change frequency optimizes CI/CD build efficiency
+
+---
+
+## 11. Observability and Debugging
+
+Debuggability is enabled by:
+
+- One Namespace per Test Run
+- Consistent labeling and naming conventions
+- Clear lifecycle boundaries
+- Namespace retention on failure
+- Component version reporting
+
+**Version Transparency**:
+
+Test framework outputs component versions at Test Run start:
+- Core components (API, Sentinel, Broker)
+- Adapter (adapter-landing-zone in Phase 1, Fixture Adapter in Phase 2)
+- Functional adapters deployed during test execution (Phase 2)
+
+Version information is logged during infrastructure deployment phase, enabling correlation between test results and component versions for failure investigation.
+
+Engineers can:
+
+- Inspect all failed-test resources in a single Namespace
+- Correlate logs, events, and message flows
+- Reproduce failures with exact component versions
+- Identify version-specific issues
+
+---
+
+## 12. Open Questions and Follow-Ups
+
+No open questions at this time. Fixture Adapter design is covered in Section 4.4.
+
+---
+
+## 13. Action Items and Next Steps
+
+Implementation follows a phased approach to avoid blocking e2e testing progress (see Section 4.4.4 for Fixture Adapter phasing strategy).
+
+---
+
+### 13.1 Phase 1: MVP with adapter-landing-zone (Immediate)
+
+**Goal**: Establish e2e testing infrastructure with happy-path Core Suite validation using existing adapter-landing-zone.
+
+**HYPERFLEET-XXX: Container Image Architecture**
+- [ ] Build Cloud Platform Tools image (gcloud, aws cli, kubeconfig generation)
+- [ ] Build E2E Test Framework image (helm cli, test code, deployment charts)
+- [ ] Set up image build pipeline
+
+**HYPERFLEET-XXX: Test Run Lifecycle**
+- [ ] Implement Test Run ID generation
+- [ ] Implement namespace creation with isolation labels (test-run-id, ci, managed-by)
+- [ ] Implement infrastructure deployment via helm (API, Sentinel, Broker, adapter-landing-zone)
+- [ ] Configure adapter-landing-zone with RabbitMQ broker (no GCP dependency)
+- [ ] Add infrastructure readiness checks
+- [ ] Output component versions at Test Run start (API, Sentinel, Broker, adapter-landing-zone)
+- [ ] Implement cleanup: cloud resource deletion (topics/subscriptions tagged with test_run_id) + helm uninstall + namespace deletion
+
+**HYPERFLEET-XXX: Core Suite (Phase 1 - with adapter-landing-zone)**
+- [ ] Implement Core Suite test cases (happy-path cluster/nodepool lifecycle)
+- [ ] Validate framework data flow: API → Sentinel → Broker → Adapter → API
+- [ ] Add infrastructure health validation
+
+**HYPERFLEET-XXX: E2E Test Run Strategy Guide (Phase 1)**
+- [ ] Document Test Run lifecycle for developers
+- [ ] Document Core Suite basics
+- [ ] Document basic cleanup and troubleshooting
+
+---
+
+### 13.2 Phase 2: Fixture Adapter and Adapter Suite (Future)
+
+**Goal**: Implement Fixture Adapter for comprehensive framework testing and Adapter Suite for functional adapter validation.
+
+**Prerequisites**: Requires HyperFleet system to support runtime adapter hot-plugging (dynamic adapter deployment without API/Sentinel restart).
+
+**HYPERFLEET-XXX: Fixture Adapter**
+- [ ] Implement Fixture Adapter in dedicated repository (adapter-fixture)
+- [ ] Implement label-driven control modes (immediate success, delayed success, failure, transient failure)
+- [ ] Implement event consumption (cluster.created, nodepool.created)
+- [ ] Implement status reporting (Applied, Available, Health conditions)
+- [ ] Add Fixture Adapter to infrastructure helm chart
+- [ ] Write Fixture Adapter unit tests
+
+**HYPERFLEET-XXX: Core Suite (Phase 2 - with Fixture Adapter)**
+- [ ] Replace adapter-landing-zone with Fixture Adapter in Core Suite
+- [ ] Implement error injection tests (failure handling, retry logic)
+- [ ] Implement async processing tests (delayed status reporting)
+- [ ] Implement concurrent processing tests
+- [ ] Update Core Suite documentation with Fixture Adapter usage
+
+**HYPERFLEET-XXX: Adapter Suite**
+- [ ] Implement flexible adapter deployment strategies (Ordered + BeforeAll, BeforeEach/AfterEach)
+- [ ] Create adapter configuration testdata directory (different adapter configs for testing)
+- [ ] Write adapter deployment validation tests (config loading, subscription registration)
+- [ ] Write adapter implementation tests (K8s resource creation: Job, ServiceAccount, ConfigMap, etc.)
+- [ ] Add error handling and retry logic validation
+- [ ] Add cleanup completeness verification
+- [ ] Implement mixed strategy examples (read-only vs state-changing tests)
+
+**HYPERFLEET-XXX: E2E Test Run Strategy Guide (Phase 2)**
+- [ ] Write suite organization guide (Core Suite vs Adapter Suite)
+- [ ] Document test organization strategies (Ordered + BeforeAll, BeforeEach/AfterEach, mixed approach)
+- [ ] Create adapter deployment strategy examples
+
+---
+
+### 13.3 Post-MVP Enhancements
+
+The following enhancements are deferred to post-MVP:
+
+**HYPERFLEET-XXX: Retention Policy**
+- [ ] Implement namespace retention annotation logic
+- [ ] Add test result-based retention updates (passed: 10min, failed: 24h/6h)
+- [ ] Configure default 2-hour TTL for orphaned namespaces
+- [ ] Write retention policy unit tests
+
+**HYPERFLEET-XXX: Cleanup Reconciler Job**
+- [ ] Implement TTL-based namespace reconciler
+- [ ] Add orphaned resource detection and cleanup
+- [ ] Add orphaned cloud resource cleanup (topics/subscriptions filtered by test_run_id tag)
+- [ ] Configure reconciler schedule (30-minute default)
+- [ ] Add reconciler monitoring and alerts
+
+---
+
+**Document Status**: Draft for Review
+**Next Steps**: Team review and approval, then create implementation tickets