feat: expand dashboard UX and enrich stem observability metadata #18
kingwill101 merged 14 commits into master
…gic to dedicated files
- Moved event rendering logic to `events.dart` and created the `buildEventsContent` function.
- Created `options.dart` for task and worker page options classes.
- Moved overview content rendering to `overview.dart` with the `buildOverviewContent` function.
- Added shared utility functions to `shared.dart` for metric cards and queue table rows.
- Implemented task rendering in `tasks.dart` with the `buildTasksContent` function.
- Created worker rendering logic in `workers.dart` with the `buildWorkersContent` function.
…ges and add dashboard to workspace
Caution: Review failed. Failed to post review comments.

Walkthrough

Adds basePath mounting and routing to the dashboard, alerting and auditing in dashboard state, expanded dashboard models and data-source APIs (task/workflow queries, replay/revoke), a modular Tailwind-based UI split into page builders, runtime/resiliency improvements in the Stem dashboard service, and many example/config/test updates.
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3f8b0c9ffd
packages/stem/lib/src/core/stem.dart
    // Use explicit wire headers as the only source of producer parent context.
    // This avoids accidental context bleed from unrelated async operations
    // while still supporting distributed parent propagation via traceparent.
    final producerParentContext = tracer.extractTraceContext(headers);

Keep enqueue spans attached to ambient trace context
This uses only wire headers to derive the enqueue span parent. When Stem.enqueue is called inside an active span without a prefilled traceparent header (the common in-process producer path), extractTraceContext falls back to a fresh context and stem.enqueue starts a new root span, breaking parent-child trace continuity from request/workflow spans unless every caller manually injects headers first.
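
One way to read this suggestion is "prefer the wire header, but fall back to whatever span is active in the caller". The following is a minimal sketch of that resolution order only. `TraceContext`, `withAmbient`, `parseTraceparent`, and `resolveParent` are hypothetical names introduced for illustration; they are not Stem's tracer API, and the use of zone values to carry the ambient context is an assumption.

```dart
import 'dart:async';

/// Hypothetical stand-in for a tracer's context type (not Stem's API).
class TraceContext {
  const TraceContext(this.traceId, this.spanId);
  final String traceId;
  final String spanId;
}

/// Zone key under which the active span's context is stored (assumption).
const Symbol _ambientKey = #ambientTraceContext;

/// Context of the span that is active in the current zone, if any.
TraceContext? get ambientContext =>
    Zone.current[_ambientKey] as TraceContext?;

/// Runs [body] with [context] as the ambient trace context.
T withAmbient<T>(TraceContext context, T Function() body) =>
    runZoned(body, zoneValues: {_ambientKey: context});

/// Parses a W3C `traceparent` header: version-traceid-parentid-flags.
TraceContext? parseTraceparent(Map<String, String> headers) {
  final value = headers['traceparent'];
  if (value == null) return null;
  final parts = value.split('-');
  if (parts.length != 4) return null;
  return TraceContext(parts[1], parts[2]);
}

/// Wire headers win; the ambient context is used only as a fallback.
TraceContext? resolveParent(Map<String, String> headers) =>
    parseTraceparent(headers) ?? ambientContext;

void main() {
  const requestSpan = TraceContext('trace-a', 'span-a');

  // In-process producer without a prefilled header: inherits the request span.
  final inherited = withAmbient(
    requestSpan,
    () => resolveParent(const <String, String>{}),
  );
  print(inherited?.traceId);

  // An explicit header still takes precedence over the ambient context.
  final explicit = withAmbient(
    requestSpan,
    () => resolveParent({'traceparent': '00-trace-b-span-b-01'}),
  );
  print(explicit?.traceId);
}
```

With this ordering, distributed propagation via `traceparent` keeps working, and a new root span is started only when neither source provides a parent.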
    final queueFuture = service.fetchQueueSummaries();
    final workerFuture = service.fetchWorkerStatuses();
    final taskFuture = service.fetchTaskStatuses(limit: 120);

The poll loop now starts three futures and awaits them sequentially; if the first await throws, the remaining futures are left unawaited, and any later exception from them is emitted as an uncaught async error in the zone. In periodic polling this can generate background uncaught errors and destabilize refresh behavior; awaiting these together (or explicitly handling all failures) avoids leaked exceptions.
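
A minimal sketch of the "await them together" option, assuming the three fetches are independent. The `fetch*` functions below are placeholder stubs, not the dashboard service's real methods, and `pollOnce` is a hypothetical name. `Future.wait` attaches a handler to every future, so when one fails the returned future completes with that error and any further errors are discarded instead of reaching the zone as uncaught async errors.

```dart
import 'dart:async';

// Placeholder stubs standing in for the real service calls (assumptions).
Future<List<String>> fetchQueueSummaries() async => ['default'];

Future<List<String>> fetchWorkerStatuses() async =>
    throw StateError('broker down');

Future<List<String>> fetchTaskStatuses({int limit = 120}) async =>
    throw StateError('store down');

/// Runs one poll cycle and reports the outcome as a string.
Future<String> pollOnce() async {
  // Start all three requests up front so they run concurrently.
  final queueFuture = fetchQueueSummaries();
  final workerFuture = fetchWorkerStatuses();
  final taskFuture = fetchTaskStatuses(limit: 120);

  try {
    // Awaiting them as a group means every future has an error handler.
    final results = await Future.wait([queueFuture, workerFuture, taskFuture]);
    return 'ok: ${results.length} result sets';
  } on Object catch (error) {
    // Only one error is reported; later failures are dropped by Future.wait.
    return 'failed: $error';
  }
}

Future<void> main() async {
  print(await pollOnce());
}
```

If every failure needs to be logged individually, each future can instead be wrapped with its own `catchError` or `try`/`catch` before the group await.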
    request = await client.postUrl(uri);
    request.headers.contentType = ContentType.json;
    request.add(utf8.encode(jsonEncode(payload)));
    final response = await request.close();

Add timeout to alert webhook HTTP calls
Alert delivery awaits request.close() without any timeout and is in the synchronous _poll() path. If a webhook endpoint hangs (for example, slow connect or stalled response), that poll cycle blocks until transport timeout, and because polls are serialized, dashboard refresh/alert processing is effectively paused for that period. A bounded timeout around the HTTP request/response is needed to keep polling responsive.
Pull request overview
This PR broadens Stem’s operational UX and observability story by expanding the dashboard into a multi-view console, enriching trace/span metadata propagation across enqueue/consume/execute, and updating the microservice example + Grafana provisioning to showcase the new capabilities.
Changes:
- Expanded `stem_dashboard` into a modular, Tailwind-styled operations console (overview, tasks, jobs, workflows, workers, failures, audit, events, namespaces, search) with base-path mounting support.
- Enriched tracing metadata and span attributes across `Stem.enqueue` and `Worker` consume/execute paths; improved trace-context handling when headers are missing.
- Updated worker retry republish flow to preserve optional payload signing; extended workflow/status metadata getters and added/updated unit tests.
Reviewed changes
Copilot reviewed 64 out of 68 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `pubspec.yaml` | Adds the dashboard package to the workspace. |
| `packages/stem_sqlite/pubspec.yaml` | Bumps artisanal dependency version. |
| `packages/stem_postgres/pubspec.yaml` | Bumps artisanal dependency version. |
| `packages/stem_cli/pubspec.yaml` | Bumps artisanal dependency version. |
| `packages/stem/test/unit/workflow/workflow_metadata_views_test.dart` | Adds tests for new workflow watcher/run-state metadata getters. |
| `packages/stem/test/unit/worker/worker_test.dart` | Adds coverage for signed retry republish behavior. |
| `packages/stem/test/unit/tracing/tracing_test.dart` | Expands tracing tests to assert richer metadata + missing-header trace behavior. |
| `packages/stem/test/unit/core/contracts_test.dart` | Adds tests for new TaskStatus typed metadata getters. |
| `packages/stem/lib/src/workflow/core/workflow_watcher.dart` | Adds watcher/resolution metadata getters and JSON parsing helpers. |
| `packages/stem/lib/src/workflow/core/run_state.dart` | Adds suspension metadata getters for workflow runs. |
| `packages/stem/lib/src/worker/worker.dart` | Adds span-phase attributes, richer span metadata, and optional signing on retry republish. |
| `packages/stem/lib/src/observability/tracing.dart` | Avoids ambient context chaining when trace headers are missing. |
| `packages/stem/lib/src/observability/metrics.dart` | Wires OTLP gRPC span exporter into the metrics runtime initialization. |
| `packages/stem/lib/src/core/stem.dart` | Enriches enqueue span attributes, propagates metadata, and tightens parent context extraction. |
| `packages/stem/lib/src/core/contracts.dart` | Adds typed metadata getters on TaskStatus + parsing helpers. |
| `packages/stem/example/microservice/worker/pubspec.yaml` | Adds local override wiring for stem_memory. |
| `packages/stem/example/microservice/worker/bin/worker.dart` | Expands demo worker tasks/queues and improves progress + env configurability. |
| `packages/stem/example/microservice/worker/Dockerfile` | Fixes working directory for monorepo layout and ensures Dart is on PATH. |
| `packages/stem/example/microservice/schedules.example.yaml` | Updates demo schedules and adds richer workload args/meta. |
| `packages/stem/example/microservice/nginx.conf` | Improves reverse-proxy behavior for Grafana/Prometheus/Jaeger subpaths. |
| `packages/stem/example/microservice/grafana/provisioning/dashboards/stem.yml` | Adds Grafana provisioning for Stem dashboards. |
| `packages/stem/example/microservice/grafana/dashboards/stem-workers-queues.json` | New Grafana dashboard for worker/queue metrics. |
| `packages/stem/example/microservice/grafana/dashboards/stem-scheduler.json` | New Grafana dashboard for scheduler metrics. |
| `packages/stem/example/microservice/grafana/dashboards/stem-overview.json` | New Grafana overview dashboard for system-level metrics. |
| `packages/stem/example/microservice/grafana-datasources.yml` | Updates datasource URLs to align with nginx base paths. |
| `packages/stem/example/microservice/enqueuer/pubspec.yaml` | Adds local override wiring for stem_memory. |
| `packages/stem/example/microservice/enqueuer/bin/main.dart` | Adds auto-fill workload generator + multi-task/workflow demo enqueuing. |
| `packages/stem/example/microservice/enqueuer/Dockerfile` | Fixes working directory for monorepo layout and ensures Dart is on PATH. |
| `packages/stem/example/microservice/docker-compose.yml` | Updates paths, adds base-path config for Jaeger/Grafana, and configures workers/enqueuer env. |
| `packages/stem/example/microservice/dashboard/Dockerfile` | Updates dashboard Docker build paths for monorepo layout. |
| `packages/stem/example/microservice/beat/pubspec.yaml` | Adds local override wiring for stem_memory. |
| `packages/stem/example/microservice/beat/bin/beat.dart` | Applies observability exporters/config in the beat demo. |
| `packages/stem/example/microservice/beat/Dockerfile` | Fixes working directory for monorepo layout and ensures Dart is on PATH. |
| `packages/stem/example/microservice/README.md` | Updates docs for new paths, Grafana dashboards, and auto-fill demo traffic. |
| `packages/stem/CHANGELOG.md` | Documents the new tracing metadata, retry signing behavior, and example/dashboard updates. |
| `packages/dashboard/web/tailwind.input.css` | Introduces Tailwind base/components/utilities and dashboard styling primitives. |
| `packages/dashboard/test/services/sqlite_dashboard_service_test.dart` | Adds coverage for task status fetching + filters. |
| `packages/dashboard/test/server_test.dart` | Expands server tests for new routes, basePath mounting, tasks pages, and actions. |
| `packages/dashboard/test/dashboard_state_property_test.dart` | Updates test datasource implementation for new dashboard datasource API surface. |
| `packages/dashboard/test/dashboard_browser_test.dart` | Extends browser tests for new search UX content and datasource API changes. |
| `packages/dashboard/tailwind.config.js` | Adds Tailwind configuration targeting Dart UI templates. |
| `packages/dashboard/pubspec.yaml` | Switches to workspace resolution and updates overrides accordingly. |
| `packages/dashboard/lib/src/ui/workflows.dart` | Adds workflows view rendering. |
| `packages/dashboard/lib/src/ui/workers.dart` | Adds workers view rendering + controls and recovery sections. |
| `packages/dashboard/lib/src/ui/task_detail.dart` | Adds task detail view with workflow timeline rendering and actions. |
| `packages/dashboard/lib/src/ui/shared.dart` | Introduces shared UI helpers (formatting, tables, pills, actions). |
| `packages/dashboard/lib/src/ui/search.dart` | Adds global search view with saved views + scoped search. |
| `packages/dashboard/lib/src/ui/paths.dart` | Adds basePath URL/route helpers for mounting and link rewriting. |
| `packages/dashboard/lib/src/ui/overview.dart` | Adds overview view + derived metrics/latency summaries. |
| `packages/dashboard/lib/src/ui/options.dart` | Adds strongly-typed per-page options models for query/form state. |
| `packages/dashboard/lib/src/ui/namespaces.dart` | Adds namespaces rollup view. |
| `packages/dashboard/lib/src/ui/jobs.dart` | Adds jobs/task-family rollup view. |
| `packages/dashboard/lib/src/ui/failures.dart` | Adds failures diagnostics grouping view + replay controls. |
| `packages/dashboard/lib/src/ui/events.dart` | Adds events feed view. |
| `packages/dashboard/lib/src/ui/event_templates.dart` | Tweaks event item template rendering used by events feed. |
| `packages/dashboard/lib/src/ui/audit.dart` | Adds audit log view. |
| `packages/dashboard/lib/src/state/dashboard_state.dart` | Adds polling + signatures + refresh signaling + audit/alerts/webhook delivery. |
| `packages/dashboard/lib/src/config/config.dart` | Adds config parsing for alert webhooks and thresholds. |
| `packages/dashboard/lib/dashboard.dart` | Expands public exports for embedding/mounting dashboard engine. |
| `packages/dashboard/bin/dashboard.dart` | Adds DASHBOARD_BASE_PATH support to standalone runner. |
| `packages/dashboard/README.md` | Documents dashboard usage and embedding/mounting support. |
| `packages/dashboard/CHANGELOG.md` | Changelog entry describing dashboard expansion and refactors. |
    if (normalizedBasePath.isEmpty) return html;

    return html.replaceAllMapped(
      RegExp(r'''(href|action|value)=("')/(?!/)([^"']*)\2'''),

prefixDashboardUrlAttributes regex looks wrong: (href|action|value)=("')/...\2 won't match either quote type because group 2 is the literal string "'. This likely prevents basePath prefixing from working. Consider changing the quote capture to something like (["']) and reusing that group in the backreference.
    - RegExp(r'''(href|action|value)=("')/(?!/)([^"']*)\2'''),
    + RegExp(r'''(href|action|value)=(["'])/(?!/)([^"']*)\2'''),
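
To illustrate why the character class matters, here is a small self-contained sketch of base-path prefixing built on the corrected pattern. `prefixUrlAttributes` is a hypothetical helper written for this example; the replacement callback is an assumption, since the body of the real `prefixDashboardUrlAttributes` is not shown in the diff.

```dart
/// Prefixes root-relative `href`, `action`, and `value` attributes with
/// [basePath]. Protocol-relative URLs (`//host/...`) are left untouched.
String prefixUrlAttributes(String html, String basePath) {
  if (basePath.isEmpty) return html;

  // Group 2 captures either quote character; `\2` requires the same closer.
  final pattern = RegExp(r'''(href|action|value)=(["'])/(?!/)([^"']*)\2''');

  return html.replaceAllMapped(pattern, (match) {
    final attribute = match[1]!;
    final quote = match[2]!;
    final rest = match[3]!;
    return '$attribute=$quote$basePath/$rest$quote';
  });
}

void main() {
  const html = '<a href="/tasks">Tasks</a>'
      "<form action='/workers/pause'></form>"
      '<script src="//cdn.example/app.js"></script>';
  print(prefixUrlAttributes(html, '/ops'));
}
```

With the original `("')` group, neither the double-quoted nor the single-quoted attribute above would match, so the HTML would come back unchanged.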
    final queues = status.queues.isEmpty
        ? '<span class="muted">—</span>'
        : status.queues
            .map((queue) => '<span class="pill">${queue.name}</span>')

buildWorkerRow injects queue.name into HTML without escaping. Queue names can originate from brokers/config and should be HTML-escaped to avoid XSS in the dashboard UI.
    - .map((queue) => '<span class="pill">${queue.name}</span>')
    + .map((queue) => '<span class="pill">${escapeHtml(queue.name)}</span>')
        .join(' ');
    return '''
    <tr>
      <td class="font-semibold text-slate-100">${status.workerId}</td>

buildWorkerRow injects status.workerId into HTML without escaping. Worker IDs can be environment-controlled and should be passed through escapeHtml before rendering to prevent XSS.
    - <td class="font-semibold text-slate-100">${status.workerId}</td>
    + <td class="font-semibold text-slate-100">${escapeHtml(status.workerId)}</td>
    <tbody>
      ${filteredWorkers.isEmpty ? '''
      <tr>
        <td colspan="7" class="muted">No heartbeats detected for namespace "${options.namespaceFilter ?? 'stem'}".</td>

The empty-state message interpolates options.namespaceFilter directly into HTML (No heartbeats detected for namespace "..."). This should be HTML-escaped to avoid XSS if a crafted namespace is provided via query params.
    - <td colspan="7" class="muted">No heartbeats detected for namespace "${options.namespaceFilter ?? 'stem'}".</td>
    + <td colspan="7" class="muted">No heartbeats detected for namespace "${escapeHtml(options.namespaceFilter ?? 'stem')}".</td>
    return '''
    <tr class="queue-row group" data-queue-row="${summary.queue}">
      <td><span class="pill">${summary.queue}</span></td>
      <td class="font-medium text-slate-100">${formatInt(summary.pending)}</td>
      <td class="font-medium text-slate-100">${formatInt(summary.inflight)}</td>
      <td class="font-medium text-slate-100">${formatInt(summary.deadLetters)}</td>
    </tr>
    <tr class="queue-detail" data-queue-detail="${summary.queue}">

buildQueueTableRow renders summary.queue into both element content and data-* attributes without HTML-escaping. Since queue names can come from external systems, this can enable XSS/injected attributes; use escapeHtml consistently for queue values.
    - return '''
    - <tr class="queue-row group" data-queue-row="${summary.queue}">
    - <td><span class="pill">${summary.queue}</span></td>
    - <td class="font-medium text-slate-100">${formatInt(summary.pending)}</td>
    - <td class="font-medium text-slate-100">${formatInt(summary.inflight)}</td>
    - <td class="font-medium text-slate-100">${formatInt(summary.deadLetters)}</td>
    - </tr>
    - <tr class="queue-detail" data-queue-detail="${summary.queue}">
    + final escapedQueue = escapeHtml(summary.queue);
    + return '''
    + <tr class="queue-row group" data-queue-row="$escapedQueue">
    + <td><span class="pill">$escapedQueue</span></td>
    + <td class="font-medium text-slate-100">${formatInt(summary.pending)}</td>
    + <td class="font-medium text-slate-100">${formatInt(summary.inflight)}</td>
    + <td class="font-medium text-slate-100">${formatInt(summary.deadLetters)}</td>
    + </tr>
    + <tr class="queue-detail" data-queue-detail="$escapedQueue">
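
The escaping suggestions above all follow the same pattern: escape once, then interpolate the escaped value into both element content and attributes. A minimal sketch using `HtmlEscape` from `dart:convert` follows. `escapeHtmlSketch` and `queueRowSketch` are hypothetical names for this example, and the dashboard's own `escapeHtml` helper may be implemented differently.

```dart
import 'dart:convert';

/// Escapes `&`, `<`, `>`, quotes, and `/` so that [value] is safe in both
/// element content and quoted attribute values.
String escapeHtmlSketch(String value) => const HtmlEscape().convert(value);

/// Renders a simplified queue row; the name is escaped exactly once.
String queueRowSketch(String queue, int pending) {
  final escapedQueue = escapeHtmlSketch(queue);
  return '''
<tr class="queue-row" data-queue-row="$escapedQueue">
  <td><span class="pill">$escapedQueue</span></td>
  <td>$pending</td>
</tr>''';
}

void main() {
  // A crafted queue name that tries to break out of the attribute.
  print(queueRowSketch('"><script>alert(1)</script>', 3));
}
```

The default `HtmlEscape` mode is the most conservative one, which is why a single helper can serve both the `data-*` attribute and the cell content here.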
    HttpClientRequest? request;
    try {
      final client = HttpClient();
      request = await client.postUrl(uri);
      request.headers.contentType = ContentType.json;
      request.add(utf8.encode(jsonEncode(payload)));
      final response = await request.close();
      if (response.statusCode >= 200 && response.statusCode < 300) {
        recordAudit(
          kind: 'alert',
          action: key,
          status: 'sent',
          actor: 'system',
          summary: 'Alert delivered to $url.',
        );
      } else {
        recordAudit(
          kind: 'alert',
          action: key,
          status: 'error',
          actor: 'system',
          summary: 'Webhook returned HTTP ${response.statusCode} for $url.',
        );
      }
      client.close(force: true);

_sendAlertWebhooks creates an HttpClient inside the try and only closes it on the success path. If an exception is thrown after the client is created (e.g., DNS/handshake/write), the client is never closed. Consider creating the client outside the try (or using try/finally) and always closing it; also consider draining/consuming the response body before close to avoid socket/resource leaks.
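
A minimal sketch of how the timeout and cleanup suggestions could be combined, assuming a single timeout budget per step. `postAlert` is a hypothetical helper, not the dashboard's `_sendAlertWebhooks`; audit recording is omitted, and the 5-second default is an arbitrary choice for illustration. The `main` function starts a throwaway loopback server so the example runs without external network access.

```dart
import 'dart:async';
import 'dart:convert';
import 'dart:io';

/// Posts [payload] as JSON to [uri]. Returns the HTTP status code, or null
/// when the request times out or fails at the transport level.
Future<int?> postAlert(
  Uri uri,
  Map<String, Object?> payload, {
  Duration timeout = const Duration(seconds: 5),
}) async {
  // Created outside the try block so `finally` can always close it.
  final client = HttpClient()..connectionTimeout = timeout;
  try {
    final request = await client.postUrl(uri).timeout(timeout);
    request.headers.contentType = ContentType.json;
    request.add(utf8.encode(jsonEncode(payload)));
    final response = await request.close().timeout(timeout);
    // Consume the body so the underlying socket can be released.
    await response.drain<void>().timeout(timeout);
    return response.statusCode;
  } on TimeoutException {
    return null;
  } on IOException {
    return null;
  } finally {
    client.close(force: true);
  }
}

Future<void> main() async {
  // Local stand-in for a webhook endpoint.
  final server = await HttpServer.bind(InternetAddress.loopbackIPv4, 0);
  server.listen((request) async {
    await request.drain<void>();
    request.response.statusCode = HttpStatus.noContent;
    await request.response.close();
  });

  final uri = Uri.parse('http://127.0.0.1:${server.port}/hook');
  final status = await postAlert(uri, {'alert': 'queue-backlog'});
  print('status: $status');

  await server.close(force: true);
}
```

Returning null rather than throwing keeps a slow or unreachable endpoint from aborting the caller's poll cycle; the caller can still record an audit entry for the failed delivery.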
Addressed the inline review concerns in follow-up commits:
Validation run:
Summary
- Expanded `stem_dashboard` into a multi-view operations console (tasks, jobs, workflows, workers, failures, audit, events, namespaces, search) with modular UI rendering and a Tailwind-driven layout system.

Validation
- `dart test packages/stem/test/unit/tracing/tracing_test.dart`
- `dart test packages/stem/test/unit/worker/worker_test.dart`