Merged
328826caa updated the balancer's discovery channel to prevent backing up into the discovery stream by dropping the discovery stream. This results in balancers becoming permanently stale (should they ever be used again).

This change modifies the discovery stream so that these errors are fatal for the balancer. These errors are recorded distinctly by the error counters.

To fix this, we replace the `DiscoverNew` module with a `discover::NewServices` module that wraps the buffering layer. The buffer now only holds target metadata, and services are only built as the entry is dequeued from the channel.

This has the (positive) side-effect that the proxy's `stack_create_total` metric will not be incremented before the balancer actually uses an endpoint stack. Previously, this metric would be incremented for all queued endpoint updates.

We also now log at INFO the address of all additions and removals from a balancer. This should dramatically improve diagnostics in stale endpoint situations.

---

* build(deps): bump DavidAnson/markdownlint-cli2-action (linkerd/linkerd2-proxy#2460)
* build(deps): bump tj-actions/changed-files from 36.2.1 to 39.0.2 (linkerd/linkerd2-proxy#2468)
* build(deps): bump EmbarkStudios/cargo-deny-action from 1.5.0 to 1.5.4 (linkerd/linkerd2-proxy#2448)
* meshtls: log errors parsing client certs (linkerd/linkerd2-proxy#2467)
* build(deps): bump actions/checkout from 3.5.0 to 4.1.0 (linkerd/linkerd2-proxy#2474)
* build(deps): bump tj-actions/changed-files from 39.0.2 to 39.2.0 (linkerd/linkerd2-proxy#2475)
* build(deps): bump EmbarkStudios/cargo-deny-action from 1.5.4 to 1.5.5 (linkerd/linkerd2-proxy#2478)
* build(deps): bump DavidAnson/markdownlint-cli2-action (linkerd/linkerd2-proxy#2476)
* build(deps): bump actions/upload-artifact from 3.1.2 to 3.1.3 (linkerd/linkerd2-proxy#2479)
* Render grpc_status metric label as number (linkerd/linkerd2-proxy#2480)
* balance: Log and fail stuck discovery streams. (linkerd/linkerd2-proxy#2484)
* build(deps): update `rustix` to v0.36.16/v0.37.7 (linkerd/linkerd2-proxy#2488)
* balance: Fail the discovery stream on queue backup (linkerd/linkerd2-proxy#2486)

Signed-off-by: Oliver Gould <ver@buoyant.io>
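To make the queueing behavior described above concrete, here is a minimal, synchronous sketch. It is not the proxy's code: the real implementation is asynchronous, and the names `Update`, `Balancer`, `QueueBackedUp`, `push`, and `poll_update` are hypothetical. The sketch also assumes that a full queue counts as a backup; the exact condition the proxy uses is not stated in this description. It illustrates three points: the queue holds only target metadata, a service is built (and `stack_create_total` incremented) only on dequeue, and a backed-up queue is a fatal error rather than a silently dropped stream.

```rust
use std::collections::VecDeque;
use std::net::SocketAddr;

/// A queued discovery update: target metadata only, no built service.
#[derive(Debug, Clone, PartialEq)]
enum Update {
    Add(SocketAddr, String), // address plus hypothetical metadata
    Remove(SocketAddr),
}

/// Fatal error: the balancer did not drain the queue in time.
#[derive(Debug, PartialEq)]
struct QueueBackedUp;

/// Stand-in for an endpoint stack.
#[derive(Debug)]
struct Service {
    addr: SocketAddr,
}

struct Balancer {
    queue: VecDeque<Update>,
    capacity: usize,
    failed: bool,
    stack_create_total: u64,
    endpoints: Vec<Service>,
}

impl Balancer {
    fn new(capacity: usize) -> Self {
        Balancer {
            queue: VecDeque::new(),
            capacity,
            failed: false,
            stack_create_total: 0,
            endpoints: Vec::new(),
        }
    }

    /// Discovery side: enqueue metadata. A full queue fails the balancer
    /// permanently instead of leaving it stale.
    fn push(&mut self, update: Update) -> Result<(), QueueBackedUp> {
        if self.failed || self.queue.len() >= self.capacity {
            self.failed = true;
            return Err(QueueBackedUp);
        }
        self.queue.push_back(update);
        Ok(())
    }

    /// Balancer side: dequeue one update, building the service only now.
    /// Returns Ok(true) if an update was applied, Ok(false) if the queue was empty.
    fn poll_update(&mut self) -> Result<bool, QueueBackedUp> {
        if self.failed {
            return Err(QueueBackedUp);
        }
        match self.queue.pop_front() {
            Some(Update::Add(addr, _meta)) => {
                self.stack_create_total += 1; // counted at use, not at enqueue
                println!("INFO adding endpoint addr={addr}");
                self.endpoints.push(Service { addr });
                Ok(true)
            }
            Some(Update::Remove(addr)) => {
                println!("INFO removing endpoint addr={addr}");
                self.endpoints.retain(|s| s.addr != addr);
                Ok(true)
            }
            None => Ok(false),
        }
    }
}
```

In this sketch, pushing two `Update::Add` entries leaves `stack_create_total` at zero until `poll_update` is called, and once `push` returns `QueueBackedUp` every later call on that `Balancer` fails as well.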
hawkw
approved these changes
Oct 19, 2023
alpeb
approved these changes
Oct 19, 2023