@blt blt commented Dec 2, 2025

What does this PR do?

This commit demonstrates how one might use lading in a k8s environment to determine the memory bounds for a target container, in this case the Datadog Agent. The example used here is harsh and comes from the Datadog Agent's own Regression Detector experiment, uds_dogstatsd_to_api.

Run like so:

./k8s/experiment.sh --total-limit 1200 --agent-memory 700 --trace-memory 100 --sysprobe-memory 300 --process-memory 100 --duration 600 --tags "purpose:smp-experiment,agent-limit:2048"

This invocation demonstrates a memory allocation that works for the Agent under these conditions. Results:

========================================
RESULT: SUCCESS
========================================
No restarts detected
Test duration: 600 seconds
Tags: purpose:smp-experiment,agent-limit:2048

Container memory usage:
  agent: 640.67 MB / 700 MB (91.5%)
  trace-agent: 31.36 MB / 100 MB (31.4%)
  system-probe: 266.26 MB / 300 MB (88.8%)
  process-agent: 48.00 MB / 100 MB (48.0%)
  TOTAL: 986.29 MB / 1200 MB (82.2%)

Instructions are present in the k8s/README.md for changing lading's configuration and Datadog Agent's own configuration.
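
The arithmetic behind the report above can be re-derived with a short sketch. The numbers are copied from the report; the dictionary and function names are made up for illustration and are not part of `k8s/experiment.sh`.

```python
# Per-container limits and observed usage in MB, copied from the report above.
limits_mb = {"agent": 700, "trace-agent": 100, "system-probe": 300, "process-agent": 100}
usage_mb = {"agent": 640.67, "trace-agent": 31.36, "system-probe": 266.26, "process-agent": 48.00}


def utilization(usage, limits):
    """Return the percentage of each limit in use, rounded to one decimal."""
    return {name: round(100 * usage[name] / limits[name], 1) for name in limits}


def headroom(usage, limits):
    """Return the MB left under each limit, rounded to two decimals."""
    return {name: round(limits[name] - usage[name], 2) for name in limits}


total_limit = sum(limits_mb.values())            # should match --total-limit
total_usage = round(sum(usage_mb.values()), 2)   # should match the TOTAL line
total_pct = round(100 * total_usage / total_limit, 1)

for name, pct in utilization(usage_mb, limits_mb).items():
    print(f"{name}: {pct}% used, {headroom(usage_mb, limits_mb)[name]} MB free")
print(f"TOTAL: {total_usage} MB / {total_limit} MB ({total_pct}%)")
```

The per-container limits sum to the `--total-limit` value, and the agent container is the one with the least relative headroom, so it is the natural place to start when tightening or loosening the budget.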

@blt blt added the no-changelog label Dec 2, 2025 — with Graphite App
@blt blt marked this pull request as ready for review December 2, 2025 23:51
@blt blt requested a review from a team as a code owner December 2, 2025 23:51
blt commented Dec 2, 2025

  name: lading-config
  namespace: default
data:
  lading.yaml: |

JFYI: this could also be passed in via the LADING_CONFIG environment variable. Inlining the lading config in a k8s manifest was my original motivation for supporting it:

if let Ok(env_var_value) = env::var("LADING_CONFIG") {
    debug!("Using config from env var 'LADING_CONFIG'");
    let config = parse_config(&env_var_value)?;
    info!("Configuration file is valid");
    Ok(config)
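
As a rough illustration of that suggestion, the sketch below builds a container spec that carries the config in `LADING_CONFIG` rather than in a ConfigMap. Only the variable name comes from the snippet above; the container name, the image reference, the helper name and the placeholder config text are all assumptions made for illustration.

```python
import json


def lading_container(config_text: str) -> dict:
    """Build a container spec that passes the lading config via an env var."""
    return {
        "name": "lading",                          # hypothetical container name
        "image": "example.invalid/lading:latest",  # hypothetical image reference
        "env": [
            # The whole config document travels as the value of LADING_CONFIG.
            {"name": "LADING_CONFIG", "value": config_text},
        ],
    }


# Placeholder only: the real lading.yaml contents would go here.
config_text = "# contents of lading.yaml would go here\n"
container = lading_container(config_text)

# Kubernetes accepts JSON manifests, so this fragment can be embedded as-is.
print(json.dumps(container, indent=2))
```

Whether this is preferable to a ConfigMap depends on how often the config changes; the env var keeps everything in one manifest, while the ConfigMap keeps the pod spec shorter.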

@blt blt force-pushed the blt/introduce_a_k8s_based_lading_example branch from c0fb66b to 2fd183a Compare December 3, 2025 16:50
@paulvanderende

CC @NielsG-it, @pwillemse, @paulvanderende

Hi Brian, our team is looking at the PR and will come back soon with their first experiences using it.

pwillemse commented Jan 14, 2026

I have tried the "uds_dogstatsd_to_api" example and was able to run it. Now I want to add a regression test, "file_to_blackhole_0ms_latency", but this scenario does not work out of the box: the logrotate_fs generator creates a FUSE filesystem to store log files. Should I switch back to "logrotate" instead of "logrotate_fs"? If so, what does the throttle option do?

blt added 2 commits January 14, 2026 10:36
This commit demonstrates how one might use lading in a k8s environment
to determine the memory bounds for a target container, in this case
Datadog Agent. The example used here is harsh and comes from Datadog
Agent's own Regression Detector experiment `uds_dogstatsd_to_api`.

Run like so:

```
./k8s/experiment.sh --total-limit 1200 --agent-memory 700 --trace-memory 100 --sysprobe-memory 300 --process-memory 100 --duration 600 --tags "purpose:smp-experiment,agent-limit:2048"
```

This invocation demonstrates a memory allocation that works for the Agent
under these conditions. Results:

```
========================================
RESULT: SUCCESS
========================================
No restarts detected
Test duration: 600 seconds
Tags: purpose:smp-experiment,agent-limit:2048

Container memory usage:
  agent: 640.67 MB / 700 MB (91.5%)
  trace-agent: 31.36 MB / 100 MB (31.4%)
  system-probe: 266.26 MB / 300 MB (88.8%)
  process-agent: 48.00 MB / 100 MB (48.0%)
  TOTAL: 986.29 MB / 1200 MB (82.2%)
```

Instructions are present in the `k8s/README.md` for changing lading's
configuration and Datadog Agent's own configuration.

Signed-off-by: Brian L. Troutwine <brian.troutwine@datadoghq.com>
This commit updates the experiments so that there are now two: the
original is preserved as `k8s/uds_dogstatsd_to_api`, and the new one
demonstrates the use of logrotate.

Signed-off-by: Brian L. Troutwine <brian.troutwine@datadoghq.com>
@blt blt force-pushed the blt/introduce_a_k8s_based_lading_example branch from 2fd183a to 6cc28bb Compare January 14, 2026 22:09

blt commented Jan 14, 2026

> I have tried the "uds_dogstatsd_to_api" example and was able to run it. Now I want to add a regression test, "file_to_blackhole_0ms_latency", but this scenario does not work out of the box: the logrotate_fs generator creates a FUSE filesystem to store log files. Should I switch back to "logrotate" instead of "logrotate_fs"? If so, what does the throttle option do?

Ah, you are right. We run the logrotate_fs experiments on Linux machines and I haven't tested that in a Docker container. In the meantime, I added an example in 6cc28bb that uses logrotate, which should work for you. Please do let me know if it does not.

pwillemse commented Jan 19, 2026

The uds_dogstatsd_to_api test makes use of a Unix domain socket. Because we do not use Unix domain sockets, I tried to rework the example into a UDP-based test (there is a UDP generator available). I tried two cases: one with a real Datadog Agent with port 8125 configured, and the other with a UDP blackhole. In both cases no UDP data was received. Should it be possible to send UDP data? How should the socket address be configured?
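
One way to narrow a problem like this down, independent of lading, is to confirm that a UDP datagram can reach a listener at all. The sketch below uses only Python's standard library; the function name `udp_round_trip` and the sample payload are made up for illustration, and it binds to loopback on an OS-chosen port rather than 8125. It says nothing about how lading's UDP generator itself is configured.

```python
import socket


def udp_round_trip(payload: bytes, host: str = "127.0.0.1", timeout: float = 2.0) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind((host, 0))          # port 0: let the OS pick a free port
        receiver.settimeout(timeout)      # raise instead of blocking forever
        addr = receiver.getsockname()     # (host, port) the sender must target
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(payload, addr)
        data, _ = receiver.recvfrom(65535)
        return data


# A DogStatsD-style counter line, used here purely as sample bytes.
received = udp_round_trip(b"example.metric:1|c")
print(received)
```

If a check like this passes on loopback but nothing arrives in the cluster, the bind address is a reasonable next suspect: a receiver bound to 127.0.0.1 is only reachable from containers sharing its pod's network namespace, so a sender in another pod needs the receiver bound to a routable address.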
