Draft Pull Request: Feature/Container Metrics #29

Lumberj3ck · 2025-08-16T16:03:08Z

This pull request introduces the initial backend implementation for real-time container metrics, as discussed in issue #26. This is a draft PR to showcase the progress and discuss the current implementation before proceeding with the frontend and further backend refinements.

What's Done:

Container Metrics Polling:
Thread-Safe Ring Buffer:
Last Accessed Time Check:

Discuss

Client disconnect context canceling

On client disconnect, I have implemented context cancellation on the Master node, but this can cause a problem. For example: the first client starts metrics polling, then a second client opens the same container's plot and moves on to check another container. If the first client disconnects, polling stops, so no metrics are accumulated for the second client; polling only resumes when the second client accesses that container again. Because of this, we might prefer not to stop polling after a client disconnects.
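One possible way to reason about this trade-off is to count the clients watching a container and cancel the poller only when the last one leaves. The sketch below is an illustration only, not what this PR implements; `sharedPoller`, `Subscribe`, `Unsubscribe`, `Running`, and `idlePoll` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// sharedPoller is a hypothetical type: one poller per container, shared by
// every client that is currently watching that container.
type sharedPoller struct {
	mu      sync.Mutex
	clients int                // number of clients currently subscribed
	cancel  context.CancelFunc // nil while no poller is running
}

// Subscribe registers one client and starts the poller if it is the first.
func (p *sharedPoller) Subscribe(start func(ctx context.Context)) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.clients++
	if p.cancel == nil {
		ctx, cancel := context.WithCancel(context.Background())
		p.cancel = cancel
		go start(ctx)
	}
}

// Unsubscribe removes one client and stops the poller only when nobody is left.
func (p *sharedPoller) Unsubscribe() {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.clients == 0 {
		return
	}
	p.clients--
	if p.clients == 0 && p.cancel != nil {
		p.cancel()
		p.cancel = nil
	}
}

// Running reports whether a poller is currently active.
func (p *sharedPoller) Running() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.cancel != nil
}

// idlePoll stands in for the real polling loop; it just waits for cancellation.
func idlePoll(ctx context.Context) { <-ctx.Done() }

func main() {
	p := &sharedPoller{}
	p.Subscribe(idlePoll) // first client
	p.Subscribe(idlePoll) // second client
	p.Unsubscribe()       // first client disconnects
	fmt.Println("still polling:", p.Running())
	p.Unsubscribe() // second client disconnects
	fmt.Println("still polling:", p.Running())
}
```

With this shape, the first client disconnecting would not interrupt accumulation for the second one. An idle timeout on the poller is an alternative that avoids tracking clients at all.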

I'm looking forward to your feedback on the current implementation.


Lumberj3ck · 2025-09-20T10:40:16Z

Client logic

Initiate metrics polling on stats tab clicking.

if (key == 'Stats') {
  cmds._init_metrics_polling();
}

Handle metrics response

if ('Metrics' in notification.Content) {

If metrics arrive before the stats inspector has been received, wait:

      // container.inspect.stats returned after container.metrics
      if (state.inspector.content.length == 0) {
        state.isLoading = false;
        // to make first request as soon as possible
        cmds._cancel_metrics_polling();
        cmds._init_metrics_polling();
        // do not process if no container.inspector.stats loaded
        break;
      }

After that, metrics are handled as usual and stored inside the inspector. Whenever the user clicks on Stats, the client receives the fully accumulated metrics from the beginning.

I wasn't sure what the correct formatting is, so I formatted with this:

npx prettier --write client/assets/js/isaiah.js --tab-width 2 --single-quote --trailing-comma none --arrow-parens always

However, I think something like this might also be the right configuration:

npx prettier --write client/assets/js/isaiah.js --tab-width 2 --single-quote --trailing-comma es5 --arrow-parens always

I'm sorry for the huge diff 🙏

I have tested the feature with agents, stopping a container, restarting a container, and reloading with <R>.
Plot colors are now updated according to theme changes.

Todo

Should we add buffering on the client, so we don't overwhelm it with an endless flow of metrics?
Add an environment variable to control the frequency of metrics polling on both client and server

Lumberj3ck · 2025-10-09T11:17:13Z

Hey Will!👋

I just wanted to remind you about this PR and also summarise what has been done.

Client-side logic

Metrics polling is initiated when the user clicks on the Stats tab:

if (key == 'Stats') {
    cmds._init_metrics_polling();
}

When the client receives a Metrics notification, it checks if container.inspect.stats has been loaded.
- If not yet loaded (i.e., metrics arrived first), polling is restarted to sync the first data batch as soon as possible.
- Once inspector data is ready, metrics are processed and stored as part of the inspector state.
This ensures the user always gets a fully accumulated metrics history when switching to the Stats tab.
Implemented polling cancellation and restart logic to prevent overlapping requests.
Plot colors now dynamically follow theme changes for a consistent UI experience.

Backend implementation

Architecture

I introduced two new components:

ContainerStatsManager — manages per-container metrics collection.
RingBuffer[T] — a generic, thread-safe circular buffer for efficient metric storage without memory growth.

Each container’s metrics are stored in a bounded ring buffer (size = 3000), overwriting old data automatically to prevent leaks or unbounded memory usage.
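To illustrate the overwrite behaviour, here is a minimal sketch of a generic, mutex-guarded ring buffer. It is not the code from this PR: only the names `RingBuffer[T]` and `NewRingBuffer` appear above, while the `Push` and `Snapshot` methods and the internal layout are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// RingBuffer is a fixed-capacity circular buffer. Once full, each Push
// overwrites the oldest element, so memory use stays bounded.
type RingBuffer[T any] struct {
	mu    sync.RWMutex
	data  []T
	start int // index of the oldest element
	count int // number of stored elements
}

// NewRingBuffer allocates a buffer that holds at most capacity elements.
func NewRingBuffer[T any](capacity int) *RingBuffer[T] {
	return &RingBuffer[T]{data: make([]T, capacity)}
}

// Push appends v, dropping the oldest element when the buffer is full.
func (r *RingBuffer[T]) Push(v T) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(r.data) == 0 {
		return
	}
	if r.count < len(r.data) {
		r.data[(r.start+r.count)%len(r.data)] = v
		r.count++
		return
	}
	r.data[r.start] = v
	r.start = (r.start + 1) % len(r.data)
}

// Snapshot returns a copy of the stored elements, oldest first.
func (r *RingBuffer[T]) Snapshot() []T {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]T, r.count)
	for i := 0; i < r.count; i++ {
		out[i] = r.data[(r.start+i)%len(r.data)]
	}
	return out
}

func main() {
	rb := NewRingBuffer[int](3)
	for i := 1; i <= 5; i++ {
		rb.Push(i)
	}
	fmt.Println(rb.Snapshot()) // prints [3 4 5]
}
```

With a capacity of 3000 and one point every 3 seconds, a buffer like this would hold roughly the last 2.5 hours of samples per container.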

Concurrency and safety

All state-modifying operations in ContainerStatsManager and RingBuffer are guarded with RWMutex locks.
Each container can be polled independently in its own goroutine, linked to a session-wide context.Context, so that when the session ends, all related pollers stop cleanly.
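The relationship between the session context and the per-container goroutines can be sketched roughly as follows. This is a simplified illustration under assumptions: `statsManager`, `StartPolling`, `Active`, and `runSession` are hypothetical names, and the real ContainerStatsManager will differ in detail.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// statsManager is a hypothetical stand-in for ContainerStatsManager.
type statsManager struct {
	mu      sync.RWMutex
	polling map[string]bool // container ID -> poller currently running
	wg      sync.WaitGroup
}

func newStatsManager() *statsManager {
	return &statsManager{polling: make(map[string]bool)}
}

// StartPolling launches one goroutine for the container unless one is already
// running. It reports whether a new poller was started.
func (m *statsManager) StartPolling(ctx context.Context, id string, interval time.Duration, poll func(id string)) bool {
	m.mu.Lock()
	if m.polling[id] {
		m.mu.Unlock()
		return false
	}
	m.polling[id] = true
	m.mu.Unlock()

	m.wg.Add(1)
	go func() {
		defer m.wg.Done()
		defer func() { // unregister before signalling completion
			m.mu.Lock()
			delete(m.polling, id)
			m.mu.Unlock()
		}()
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done(): // session ended: stop this poller
				return
			case <-ticker.C:
				poll(id)
			}
		}
	}()
	return true
}

// Active returns the number of running pollers.
func (m *statsManager) Active() int {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return len(m.polling)
}

// runSession starts one poller per container, cancels the session context,
// and returns the number of active pollers before and after cancellation.
func runSession(ids ...string) (before, after int) {
	m := newStatsManager()
	ctx, cancel := context.WithCancel(context.Background())
	for _, id := range ids {
		m.StartPolling(ctx, id, 10*time.Millisecond, func(string) {})
	}
	before = m.Active()
	cancel()
	m.wg.Wait()
	return before, m.Active()
}

func main() {
	before, after := runSession("web", "db")
	fmt.Println(before, after) // prints 2 0
}
```

Because every poller selects on the same context, a single cancel call is enough to stop all of them for that session.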

Polling workflow

When the client sends the container.metrics command, the server:
1. Validates arguments and checks container state via ContainerInspect.
2. Updates the container’s lastAccessed timestamp.
3. If polling isn’t active, starts a new goroutine via PollMetrics().
4. Returns metrics accumulated since the last From index.
The poller itself:
- Fetches data with client.ContainerStatsOneShot().
- Computes CPU% and memory% using deltas between current and previous stats.
- Appends each new MetricPoint to the container’s ring buffer.
- Runs every 3 seconds and stops automatically if:
  - The container has been idle for >30 minutes, or
  - The session’s context is canceled.
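As a rough illustration of the poller's arithmetic, the sketch below computes a CPU percentage from the delta between two samples, a memory percentage from usage and limit, and the idle check. The `rawStats` struct and helper names are hypothetical, and the formulas follow the commonly used Docker stats calculation; the exact computation in this PR may differ.

```go
package main

import (
	"fmt"
	"time"
)

// rawStats is a hypothetical, trimmed-down view of one stats sample.
type rawStats struct {
	CPUTotal    uint64 // cumulative container CPU time
	SystemTotal uint64 // cumulative host CPU time
	OnlineCPUs  uint64 // number of CPUs available to the container
	MemUsage    uint64 // bytes in use
	MemLimit    uint64 // memory limit in bytes
}

// cpuPercent derives CPU usage from the delta between two samples.
func cpuPercent(prev, cur rawStats) float64 {
	if cur.CPUTotal < prev.CPUTotal || cur.SystemTotal <= prev.SystemTotal {
		return 0 // counters reset or no elapsed host time
	}
	cpuDelta := float64(cur.CPUTotal - prev.CPUTotal)
	sysDelta := float64(cur.SystemTotal - prev.SystemTotal)
	return cpuDelta / sysDelta * float64(cur.OnlineCPUs) * 100
}

// memPercent reports memory usage as a share of the limit.
func memPercent(cur rawStats) float64 {
	if cur.MemLimit == 0 {
		return 0
	}
	return float64(cur.MemUsage) / float64(cur.MemLimit) * 100
}

// idleTimeout mirrors the 30-minute rule described above.
const idleTimeout = 30 * time.Minute

// isIdle reports whether the container has not been accessed for too long.
func isIdle(lastAccessed, now time.Time) bool {
	return now.Sub(lastAccessed) > idleTimeout
}

func main() {
	prev := rawStats{CPUTotal: 1000, SystemTotal: 10000, OnlineCPUs: 4}
	cur := rawStats{CPUTotal: 1500, SystemTotal: 12000, OnlineCPUs: 4, MemUsage: 256, MemLimit: 1024}
	fmt.Printf("cpu=%.1f%% mem=%.1f%%\n", cpuPercent(prev, cur), memPercent(cur)) // prints cpu=100.0% mem=25.0%

	now := time.Now()
	fmt.Println(isIdle(now.Add(-31*time.Minute), now)) // prints true
}
```

The guard in `cpuPercent` matters for the first tick and after a container restart, when there is no usable previous sample.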

Data structure

type MetricPoint struct {
	CpuMetric float64 `json:"cpu"`
	MemMetric float64 `json:"mem"`
	Timestamp int64   `json:"timestamp"`
}

These are stored per container in a bounded buffer:

ringbuf.NewRingBuffer

Server command addition

Added new case to the command handler:

case "container.metrics":

It handles request parsing, container state checking, poller initialization, and sending a notification with:

{
  "Metrics": [...],
  "From": <next index>,
  "IsRunning": true
}

Errors or inactive containers return an empty metrics array and "IsRunning": false.
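A minimal sketch of how such a payload could be assembled is shown below. The field names `Metrics`, `From`, and `IsRunning` and the `MetricPoint` struct come from this description; `metricsResponse` and `buildResponse` are hypothetical, the points are held in a plain slice rather than the ring buffer, and echoing `From` back unchanged for an inactive container is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type MetricPoint struct {
	CpuMetric float64 `json:"cpu"`
	MemMetric float64 `json:"mem"`
	Timestamp int64   `json:"timestamp"`
}

// metricsResponse is a hypothetical name for the notification content.
type metricsResponse struct {
	Metrics   []MetricPoint
	From      int
	IsRunning bool
}

// buildResponse returns the points at index from and later, plus the index
// the client should send on its next request.
func buildResponse(points []MetricPoint, from int, running bool) metricsResponse {
	if !running {
		// Inactive container: empty array, never null, in the JSON output.
		return metricsResponse{Metrics: []MetricPoint{}, From: from, IsRunning: false}
	}
	if from < 0 {
		from = 0
	}
	if from > len(points) {
		from = len(points)
	}
	// Copy so the response does not alias the stored points.
	out := append([]MetricPoint{}, points[from:]...)
	return metricsResponse{Metrics: out, From: len(points), IsRunning: true}
}

func main() {
	points := []MetricPoint{
		{CpuMetric: 1.5, MemMetric: 20, Timestamp: 100},
		{CpuMetric: 2.5, MemMetric: 21, Timestamp: 103},
		{CpuMetric: 3.5, MemMetric: 22, Timestamp: 106},
	}
	// The client already holds the first point, so it asks from index 1.
	b, err := json.Marshal(buildResponse(points, 1, true))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Returning the next index lets the client request only new points on each poll instead of re-downloading the whole history.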

Testing

I’ve tested with:

Multiple agents and hosts
Container stop/restart cycles
Page reloads

Todo / Open questions

Should we add client-side buffering to prevent overflow in very long-running sessions?
Should we add an env var to control metrics polling frequency (client & server)?

Would be great if you could take a look and maybe test it a bit — I’d really appreciate your feedback.

Thanks!
Alan

Lumberj3ck changed the title ~~### Draft Pull Request: Feature/Container Metrics~~ Draft Pull Request: Feature/Container Metrics Aug 16, 2025

Lumberj3ck force-pushed the feature/metrics-charts branch from ffc0e2e to 614d8ea Compare September 20, 2025 10:19

Lumberj3ck added 18 commits September 20, 2025 15:29

fix(server) docker cli availability check with timeout

b4bc702

chore(dependencies) updated dependencies, added build scripts

c213610

feat(containers): container metrics handler and polling implementation

e77d728

feat(container.metrics) polling retries on error

a252542

feat(container.metrics) clean on disconnect and linking metrics polle…

372e704

…r in one session

feat(container.metrics) added explanatory comments

14d6cdb

feat(container.metrics) thread safety implemented

67257bc

feat(container.metrics) implemented thread safe ring buffer

d449253

feat(container.metrics) ring buffer integrated

22362c0

feat(container.metrics) added memory metric; refactored entities name…

b0bc8a6

…s; last accessed time check

feat(container.metrics) refactored stats global variable with local s…

4f39e9e

…tate

feat(container.metrics) plot state moved to inspector

e954c8a

feat(container.metrics) plot style and corrected logic of plot rendering

8bb9547

feat(container.metrics) fire first tick immediately on server

1dcaeaa

feat(container.metrics) added timestamps for metric points

6ee1e7a

feat(container.metrics) refactored getting metrics points on client

a71f02f

feat(container.metrics) deduced formating for prettier

f6d609d

feat(container.metrics) receiveing container status on metrics polling

88408a2

in client

Lumberj3ck force-pushed the feature/metrics-charts branch from 614d8ea to 88408a2 Compare September 20, 2025 10:29
