Description
Is your feature request related to a problem? Yes
Problem
Necessity of P2P Scenarios
A well-known challenge in the Kubernetes ecosystem is that every node in a cluster must pull the same container image, which leads to excessive registry load, throttling risk, and slower pod startup times. Peerd already addresses this by modifying each node’s containerd configuration to prioritize peer-to-peer (P2P) transfers before falling back to the upstream registry.
Necessity of Custom Mirroring Scenarios
However, a broader and increasingly common set of challenges—particularly in multi-cloud and large-scale environments—involves managing per-registry-host image pull routing logic. These challenges often expose friction between Application Developer and Infrastructure Engineer personas. Below are four scenarios that commonly arise:
- Single workload, multi-cloud
- Pain today: Application Developers must edit image references (e.g., my1.azurecr.io, xxx.dkr.ecr...) or parameterize them in Helm charts and Kubernetes YAMLs for each target cloud.
- Desired outcome: Infrastructure Engineers can remap registry hosts at the infrastructure level using a single P2P-aware tool, with zero changes required to application manifests.
- Onboarding existing workloads to a new cloud
- Pain today: Application Developers must re-author every chart or YAML to reflect cloud-specific registries, or they must refactor and parameterize existing files to make them cloud-agnostic.
- Desired outcome: Infrastructure Engineers use a single P2P-aware tool to remap registry hosts and customize image pull routing logic, requiring zero changes to application deployment manifests.
- Multi-registry fail-over
- Pain today: Application Developers must modify deployment manifests during an outage or upstream registry unavailability.
- Desired outcome: Infrastructure Engineers can use a single P2P-aware tool to map each registry host to an ordered list of additional registry hosts to pull images from. Application Developers do not need to be involved during infrastructure or registry-related outages.
- Load-balancing pulls in large clusters
- Pain today: Application Developers must fragment workloads into separate deployments with different image sources to avoid throttling a single registry.
- Desired outcome: Infrastructure Engineers use a single P2P-aware tool to define node-level mirroring logic (e.g., by node pool or nodeSelector), distributing image pulls across different registry hosts. Application teams deploy once; Peerd handles the rest.
Lack of Unified Tooling for Containerd Mirror Configs
Currently, there is no elegant or unified solution to these problems without directly editing containerd configuration on each node. At best, Infrastructure Engineers must manage multiple tools—one for P2P (e.g., Peerd), and another tool for remapping and failover logic—leading to tool conflicts. At worst, Application Developers are pulled into infrastructure-level concerns, eroding the clean separation of responsibilities and slowing development velocity.
What solution do you propose?
Proposal
Unify P2P and Custom Mirroring in a Single Tool
This proposal introduces first-class support for containerd mirror overrides in Peerd, enabling Helm-driven, per-registry-host image pull routing, customizable at the node or node pool level in a Kubernetes cluster. It unifies modifying containerd for two purposes—peer-to-peer distribution and registry mirror routing—into a single, extensible, and cloud-agnostic solution that is deployable on any Kubernetes cluster, whether in the cloud or on-premises.
Auth-aware Experience
Crucially, the proposal also extends Peerd’s functionality to support containerd mirror configurations without introducing additional burden on users to manage authentication or secrets. Peerd should be able to "just work" on clusters that already have valid authentication configured—whether that’s via ImagePullSecrets, dockerconfigjson-backed secrets, or node-level identities such as:
- AKS: kubelet identity with role assignment to Azure Container Registry
- EKS: node IAM roles granting access to Amazon ECR
- GKE: node service account permissions for Artifact Registry
Peerd should be able to detect and reuse these existing credentials—without requiring application or infrastructure teams to redefine or duplicate them—ensuring seamless image pull authentication across all mirror endpoints.
Proposed YAML Schema
To enable this functionality, Peerd should support the following values.yaml structure, allowing both P2P and custom mirror routing logic to be applied when deploying Peerd using a standard Helm command like helm install peerd <default-params> -f values.yaml. The field definitions are given after the YAML block.
peerd:
  mirrorOverrides:
    - host: docker.io
      orderedRoutingLogic: [ "P2P", "originalHost", "mirrorEndpoints" ]
      mirrorEndpoints:
        mirrorSequence: sequential
        mirrorHosts:
          - host: my.azurecr.io
            capabilities: ["pull", "resolve", "push"]
          - host: aws_account_id.dkr.ecr.region.amazonaws.com
            capabilities: ["pull", "resolve", "push"]
    - host: onprem.company.com
      orderedRoutingLogic: ...
      mirrorEndpoints: ...
YAML Schema Definition
All custom mirror overrides must be defined under mirrorOverrides, a list in which each entry overrides the image pull routing for one registry host.
host
- This is required. It specifies the registry host being overridden with custom image pull routing logic. The value must be just the host (such as docker.io, ghcr.io, or my.azurecr.io). Do not include repository paths like docker.io/library/nginx.
orderedRoutingLogic
- This is optional. It defines the exact order in which Peerd should apply its routing logic when pulling an image from this host. The list can include any combination of the following values: "P2P", "originalHost", and "mirrorEndpoints". The default (if not specified) is to try all of them in the order: P2P first, then the original registry host, then mirror hosts. Each entry is attempted in the listed order, and Peerd falls back to the next entry only if the current one fails.
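The ordered fallback described above could be sketched roughly as follows. This is only an illustration of the intended semantics, not Peerd's implementation: `pull_with_fallback` and the `fetchers` mapping are hypothetical names, and each fetcher is assumed to be a callable that returns content or raises on failure.

```python
from typing import Callable, Dict, List, Optional

# Default order stated in the proposal: P2P, then the original host, then mirrors.
DEFAULT_ORDER = ["P2P", "originalHost", "mirrorEndpoints"]


def pull_with_fallback(
    fetchers: Dict[str, Callable[[], bytes]],
    order: Optional[List[str]] = None,
) -> bytes:
    """Try each routing step in order and return the first successful result.

    `fetchers` (hypothetical) maps a routing value to a zero-argument callable
    that returns image content or raises an exception on failure.
    """
    steps = DEFAULT_ORDER if order is None else order
    errors = []
    for step in steps:
        if step not in DEFAULT_ORDER:
            raise ValueError(f"unknown routing value: {step!r}")
        fetch = fetchers.get(step)
        if fetch is None:
            errors.append(f"{step}: no fetcher configured")
            continue
        try:
            return fetch()
        except Exception as exc:  # move to the next step only on failure
            errors.append(f"{step}: {exc}")
    raise RuntimeError("all routing steps failed: " + "; ".join(errors))
```

With the default order, a P2P miss followed by a successful pull from the original host never touches the mirror hosts.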
mirrorEndpoints
- This is optional. It defines a set of alternate registry mirrors to use if the original registry fails or is bypassed, depending on your routing logic. This section contains two subfields: mirrorSequence and mirrorHosts.
  - mirrorSequence
    - This is required if the parent mirrorEndpoints field is defined. It defines how Peerd will walk through the mirror hosts. For now, only "sequential" is supported, which means Peerd will try the mirror hosts one by one in the order they are listed.
  - mirrorHosts
    - This field is under mirrorEndpoints. It contains a list of hosts underneath. It defines the actual mirror registry hosts to try pulling from. Each mirror host can include several fields:
    - host
      - This is required. It is the hostname of the mirror registry (like my.company.com, my.azurecr.io, or aws_account_id.dkr.ecr.region.amazonaws.com).
    - imagePullSecrets
      - This is optional. You can specify one or more Kubernetes secrets of kind: Secret to authenticate with the mirror registry. Each secret should have a name field.
      - name
        - This is required if the parent imagePullSecrets is defined for this mirror. This is the name of the kind: Secret resource that contains the registry credentials for authenticating with the mirror registry.
    - useNodeIdentity
      - This is optional. If set to true, Peerd will assume that the node can authenticate to the mirror registry using its cloud-native identity. For example, this might be an Azure-managed identity in AKS, an IAM role in EKS, or a service account in GKE.
      - If both imagePullSecrets and useNodeIdentity are set, Peerd will not accept the definition.
    - capabilities
      - This is required. It defines what the mirror is trusted to do. Valid options include:
        - "pull": The mirror can serve content for pull operations.
        - "resolve": The mirror is trusted to convert tags to digests (this should only be enabled for trusted mirrors, since resolving tags can affect image integrity).
        - "push": The mirror can accept pushed content (only relevant if pushing from nodes through Peerd becomes supported).
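To make the required/optional rules above concrete, here is a rough validation sketch for a single mirrorOverrides entry, already parsed into a Python dict. The function name `validate_override` and the error messages are illustrative assumptions, not part of Peerd; only the rules themselves come from the schema definition.

```python
from typing import Any, Dict, List

VALID_ROUTING = {"P2P", "originalHost", "mirrorEndpoints"}
VALID_CAPABILITIES = {"pull", "resolve", "push"}


def _check_host(value: Any, where: str, errors: List[str]) -> None:
    # host is required and must be a bare registry host, not a repository path.
    if not isinstance(value, str) or not value:
        errors.append(f"{where}: host is required")
    elif "/" in value:
        errors.append(f"{where}: host must not include a repository path")


def validate_override(entry: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations for one mirrorOverrides entry."""
    errors: List[str] = []
    _check_host(entry.get("host"), "override", errors)

    for step in entry.get("orderedRoutingLogic", []):
        if step not in VALID_ROUTING:
            errors.append(f"orderedRoutingLogic: unknown value {step!r}")

    endpoints = entry.get("mirrorEndpoints")
    if endpoints is None:
        return errors  # mirrorEndpoints is optional

    if endpoints.get("mirrorSequence") != "sequential":
        errors.append("mirrorSequence: required, only 'sequential' is supported")

    for i, mirror in enumerate(endpoints.get("mirrorHosts", [])):
        where = f"mirrorHosts[{i}]"
        _check_host(mirror.get("host"), where, errors)
        # The two authentication options are mutually exclusive.
        if mirror.get("imagePullSecrets") and mirror.get("useNodeIdentity"):
            errors.append(f"{where}: set imagePullSecrets or useNodeIdentity, not both")
        caps = mirror.get("capabilities")
        if not caps:
            errors.append(f"{where}: capabilities is required")
        elif not set(caps) <= VALID_CAPABILITIES:
            errors.append(f"{where}: unknown capability in {caps}")
        for secret in mirror.get("imagePullSecrets", []):
            if not secret.get("name"):
                errors.append(f"{where}: each imagePullSecrets item needs a name")
    return errors
```

Returning a list of violations rather than raising on the first one is a design choice for the sketch, so that a Helm user could see every problem in values.yaml at once.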
What alternatives have you considered?
Alternatives Considered
Alternatives using Containerd hosts.toml Overrides
Currently, there is no Kubernetes-native way to apply custom image pull routing logic without directly modifying containerd configuration files on each node. While containerd supports per-host hosts.toml mirror overrides, managing these configurations manually—or via DaemonSets or init containers—is operationally fragile, hard to standardize, and poorly integrated with Helm-based or GitOps workflows.
Some organizations work around this by baking static hosts.toml files into custom node images, which creates cloud vendor lock-in and slows down iteration, as every registry change requires node pool rebuilds. Others patch containerd at runtime using systemd units or bootstrap scripts, but these lack portability, observability, and centralized management.
Moreover, none of these approaches support peer-to-peer (P2P) distribution. They are limited to static mirroring logic and cannot leverage cluster-local optimizations.
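For context, the per-host hosts.toml override mentioned above has roughly the shape produced by the sketch below: a `server` fallback plus one `[host."..."]` table per mirror with its capabilities. The helper `render_hosts_toml` is hypothetical and only illustrates what such tooling has to generate on every node. It assumes HTTPS endpoints and that the file is placed in the per-registry directory under containerd's configured `config_path`. Note that Docker Hub is normally served from registry-1.docker.io, so passing docker.io here is a simplification.

```python
from typing import Dict, List


def render_hosts_toml(server: str, mirrors: List[Dict]) -> str:
    """Render a containerd hosts.toml body for one registry host.

    Each mirror dict needs "host" and "capabilities"; mirrors are written in
    list order, and `server` is the upstream used after the listed hosts.
    """
    lines = [f'server = "https://{server}"']
    for mirror in mirrors:
        caps = ", ".join(f'"{c}"' for c in mirror["capabilities"])
        lines += [
            "",
            f'[host."https://{mirror["host"]}"]',
            f"  capabilities = [{caps}]",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_hosts_toml(
        "docker.io",
        [{"host": "my.azurecr.io", "capabilities": ["pull", "resolve"]}],
    ))
```

Keeping files like this in sync across node pools by hand is the operational burden the alternatives above run into.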
Alternatives using Peerd Today
While Peerd already integrates with containerd’s mirroring capabilities, it does not currently support custom mirror routing logic per registry host. Organizations needing both P2P and custom mirroring are forced to layer Peerd on top of additional tooling—such as DaemonSets or node image modifications to configure containerd manually. This creates risk of misconfiguration or conflicts between Peerd and other low-level tooling that touches containerd. It also makes cluster-wide image routing harder to reason about and automate with multiple conflicting tools and workflows.
Any additional context?
No response
Are you willing to submit PRs to contribute to this feature?
- Yes, I am willing to implement it.