Collaborator

@thomas-lebeau thomas-lebeau commented Nov 19, 2025

Motivation

We want to avoid hardcoding the list of datacenters in the release process. To prepare for this, the first step is to get rid of monitorIdsByDatacenter. Instead, we query the logs API directly. Newly created DCs might not have an API or application key set up, so we skip the monitors for those.
Another benefit of this new strategy for monitoring DC deployments is that we can count only the errors related to the version of the SDK currently being deployed. This makes the release more resilient to unrelated monitor alerts.

Changes

  • remove monitorIdsByDatacenter in favor of a logs API query
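
For readers who have not seen the diff, here is a minimal sketch of what a version-scoped logs query could look like. It is not the code from this PR: `buildErrorCountRequest`, `DatacenterCredentials`, the query string and the 30-minute window are all hypothetical. The only parts taken from the public Datadog API are the `/api/v2/logs/analytics/aggregate` endpoint, the `compute` / `filter` / `group_by` body fields and the `DD-API-KEY` / `DD-APPLICATION-KEY` headers. The function returns `undefined` when a DC has no keys, which mirrors the "skip newly created DCs" behaviour described above.

```typescript
// Hypothetical sketch, not the PR's implementation.
interface DatacenterCredentials {
  apiKey?: string
  applicationKey?: string
}

interface AggregateRequest {
  url: string
  headers: Record<string, string>
  body: {
    compute: Array<{ aggregation: string }>
    filter: { query: string; from: string; to: string }
    group_by: Array<{ facet: string }>
  }
}

function buildErrorCountRequest(
  site: string,
  version: string,
  facet: string,
  credentials: DatacenterCredentials
): AggregateRequest | undefined {
  // A DC without keys cannot be queried, so the caller is expected to skip it.
  if (!credentials.apiKey || !credentials.applicationKey) {
    return undefined
  }
  return {
    url: `https://api.${site}/api/v2/logs/analytics/aggregate`,
    headers: {
      'Content-Type': 'application/json',
      'DD-API-KEY': credentials.apiKey,
      'DD-APPLICATION-KEY': credentials.applicationKey,
    },
    body: {
      compute: [{ aggregation: 'count' }],
      // Assumed query: only errors emitted by the version being deployed.
      filter: { query: `status:error version:${version}`, from: 'now-30m', to: 'now' },
      group_by: [{ facet }],
    },
  }
}
```

The request is built as plain data so it can be unit tested without network access; the caller would pass `url`, `headers` and `JSON.stringify(body)` to `fetch` with `method: 'POST'`.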

Test instructions

Checklist

  • Tested locally
  • Tested on staging
  • Added unit tests for this change.
  • Added e2e/integration tests for this change.

Base automatically changed from thomas.lebeau/allow-undefined-api-key to main November 19, 2025 11:58
@thomas-lebeau thomas-lebeau force-pushed the thomas.lebeau/better-release-monitor-check branch from 108e473 to a7184af Compare November 21, 2025 06:34
cit-pr-commenter bot commented Nov 21, 2025

Bundles Sizes Evolution

| 📦 Bundle Name | Base Size | Local Size | 𝚫 | 𝚫% | Status |
|---|---|---|---|---|---|
| Rum | 164.34 KiB | 164.34 KiB | 0 B | 0.00% | |
| Rum Profiler | 4.32 KiB | 4.32 KiB | 0 B | 0.00% | |
| Rum Recorder | 20.03 KiB | 20.03 KiB | 0 B | 0.00% | |
| Logs | 56.14 KiB | 56.14 KiB | 0 B | 0.00% | |
| Flagging | 944 B | 944 B | 0 B | 0.00% | |
| Rum Slim | 121.62 KiB | 121.62 KiB | 0 B | 0.00% | |
| Worker | 23.63 KiB | 23.63 KiB | 0 B | 0.00% | |

🚀 CPU Performance

| Action Name | Base CPU Time (ms) | Local CPU Time (ms) | 𝚫% |
|---|---|---|---|
| RUM - add global context | 0.0049 | 0.0051 | +4.08% |
| RUM - add action | 0.0137 | 0.0136 | -0.73% |
| RUM - add error | 0.0128 | 0.0114 | -10.94% |
| RUM - add timing | 0.0027 | 0.0027 | 0.00% |
| RUM - start view | 0.0041 | 0.0031 | -24.39% |
| RUM - start/stop session replay recording | 0.0011 | 0.0007 | -36.36% |
| Logs - log message | 0.0198 | 0.0172 | -13.13% |

🧠 Memory Performance

| Action Name | Base Memory Consumption | Local Memory Consumption | 𝚫 |
|---|---|---|---|
| RUM - add global context | 25.75 KiB | 25.53 KiB | -227 B |
| RUM - add action | 48.31 KiB | 46.89 KiB | -1.42 KiB |
| RUM - add timing | 24.03 KiB | 23.65 KiB | -389 B |
| RUM - add error | 55.78 KiB | 55.10 KiB | -697 B |
| RUM - start/stop session replay recording | 23.54 KiB | 23.31 KiB | -230 B |
| RUM - start view | 426.42 KiB | 423.11 KiB | -3.31 KiB |
| Logs - log message | 44.58 KiB | 44.56 KiB | -23 B |

🔗 RealWorld

datadog-datadog-prod-us1 bot commented Nov 21, 2025

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage
Patch Coverage: 100.00%
Overall Coverage: 77.24% (+0.00%)

🔗 Commit SHA: a5ad496

@thomas-lebeau thomas-lebeau force-pushed the thomas.lebeau/better-release-monitor-check branch from f97fd57 to d547c0e Compare November 21, 2025 08:46
@thomas-lebeau thomas-lebeau changed the title 👷 replace monitor checks by logs query 👷 rdo not rely on hardcoded list of DCs Nov 21, 2025
@thomas-lebeau thomas-lebeau changed the title 👷 rdo not rely on hardcoded list of DCs 👷 do not rely on hardcoded list of DCs Nov 21, 2025
@thomas-lebeau thomas-lebeau force-pushed the thomas.lebeau/better-release-monitor-check branch from 5dde1dc to 0f13224 Compare November 21, 2025 10:30
@thomas-lebeau thomas-lebeau force-pushed the thomas.lebeau/better-release-monitor-check branch from 0f13224 to 892ca7c Compare December 12, 2025 09:26
@thomas-lebeau
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@thomas-lebeau thomas-lebeau force-pushed the thomas.lebeau/better-release-monitor-check branch 2 times, most recently from 9731eee to fd91473 Compare December 16, 2025 15:37
@thomas-lebeau thomas-lebeau changed the title 👷 do not rely on hardcoded list of DCs 👷 do not rely on monitors for deployment gate Dec 17, 2025
@thomas-lebeau thomas-lebeau force-pushed the thomas.lebeau/better-release-monitor-check branch from 9e7570b to 247005e Compare December 17, 2025 09:50
@thomas-lebeau thomas-lebeau force-pushed the thomas.lebeau/better-release-monitor-check branch from f018b53 to 931ed9e Compare December 19, 2025 09:47
@thomas-lebeau thomas-lebeau marked this pull request as ready for review December 19, 2025 13:53
@thomas-lebeau thomas-lebeau requested a review from a team as a code owner December 19, 2025 13:53
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines 108 to 110
const data = (await response.json()) as QueryResult

return data.data.buckets
Member

suggestion: add a bit of runtime check to ensure the returned JSON is the expected one, and avoid failing silently if it doesn't match the typescript type.

if (!data || !data.data || !Array.isArray(data.data.buckets) || !data.data.buckets.every(...)) {
  throw new Error(`Unexpected response from the API: ${JSON.stringify(data)}`)
}
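
One way to fill in this suggestion is sketched below, assuming each bucket carries a `by` object and a `computes` object. That shape matches what the Datadog aggregate endpoint documents, but the PR's actual `QueryResult` type is not shown in this conversation, so `Bucket`, `isRecord`, `isBucket` and `parseQueryResult` are hypothetical names.

```typescript
// Hypothetical sketch: the bucket shape is an assumption, not the PR's QueryResult type.
interface Bucket {
  by: Record<string, unknown>
  computes: Record<string, unknown>
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

function isBucket(value: unknown): value is Bucket {
  return isRecord(value) && isRecord(value.by) && isRecord(value.computes)
}

// Validate the parsed JSON at runtime instead of trusting a type assertion.
function parseQueryResult(json: unknown): Bucket[] {
  if (
    !isRecord(json) ||
    !isRecord(json.data) ||
    !Array.isArray(json.data.buckets) ||
    !json.data.buckets.every(isBucket)
  ) {
    throw new Error(`Unexpected response from the API: ${JSON.stringify(json)}`)
  }
  return json.data.buckets
}
```

With this in place, the lines under review could become `return parseQueryResult(await response.json())`, so a malformed response fails loudly rather than producing an empty or undefined bucket list.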

runMain(async () => {
  if (checkMonitors) {
    command`node ./scripts/deploy/check-monitors.ts ${uploadPath}`.withLogs().run()
  }
Member

thought: checking monitors before deploy is pointless since check-monitors only cares about the new version being deployed. Maybe we could remove this, or make the query broader.

Member

suggestion: check-monitors does not check monitors anymore. Maybe rename the script, or since it's only used in deploy-prod-dc, you could just change it to a lib function, not a script.

{
  name: 'Telemetry errors on specific org',
  query: BASE_QUERY,
  facet: '@org_id',
Collaborator

💬 suggestion: using something like groupBy in the config instead of facet could clarify the purpose.
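
To illustrate the rename, a rough sketch: the config exposes `groupBy`, and the API-specific `facet` key only appears where the request is built. `TelemetryCheck`, `toGroupByClause` and the `BASE_QUERY` value are placeholders, since the real definitions are not visible in this conversation.

```typescript
// Placeholder value: the PR's actual BASE_QUERY is not shown here.
const BASE_QUERY = 'status:error'

// Hypothetical config shape using `groupBy` instead of `facet`.
interface TelemetryCheck {
  name: string
  query: string
  groupBy?: string
}

const orgCheck: TelemetryCheck = {
  name: 'Telemetry errors on specific org',
  query: BASE_QUERY,
  groupBy: '@org_id',
}

// Translate the config field into the `group_by` clause expected by the aggregate API.
function toGroupByClause(check: TelemetryCheck): Array<{ facet: string }> {
  return check.groupBy ? [{ facet: check.groupBy }] : []
}
```

Keeping the translation in one function means the config reads in terms of intent ("group errors by org") while the API vocabulary stays local to the request builder.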

@@ -0,0 +1,171 @@
/**
* Check monitors status
Collaborator

💬 suggestion: There are various "check monitors" / "check telemetry monitors" / "gateMonitors" references left; a search and replace could leave things in a cleaner state.

5 participants