Enable containerd mirror overrides per registry host in Peerd to support custom image pull routing logic #115

@johnsonshi

Is your feature request related to a problem? Yes

Problem

Necessity of P2P Scenarios

A well-known challenge in the Kubernetes ecosystem is that every node scheduled to run a workload pulls the same container image from the registry independently, which leads to excessive registry load, throttling risk, and slower pod startup times. Peerd already addresses this by modifying each node’s containerd configuration to try peer-to-peer (P2P) transfers first and fall back to the upstream registry only when needed.

Necessity of Custom Mirroring Scenarios

However, a broader and increasingly common set of challenges—particularly in multi-cloud and large-scale environments—involves managing per-registry-host image pull routing logic. These challenges often expose friction between Application Developer and Infrastructure Engineer personas. Below are four scenarios that commonly arise:

  1. Single workload, multi-cloud
    • Pain today: Application Developers must edit image references (e.g., my1.azurecr.io, xxx.dkr.ecr...) or parameterize them in Helm charts and Kubernetes YAMLs for each target cloud.
    • Desired outcome: Infrastructure Engineers can remap registry hosts at the infrastructure level using a single P2P-aware tool, with zero changes required to application manifests.
  2. Onboarding existing workloads to a new cloud
    • Pain today: Application Developers must re-author every chart or YAML to reflect cloud-specific registries, or they must refactor and parameterize existing files to make them cloud-agnostic.
    • Desired outcome: Infrastructure Engineers use a single P2P-aware tool to remap registry hosts and customize image pull routing logic, requiring zero changes to application deployment manifests.
  3. Multi-registry fail-over
    • Pain today: Application Developers must modify deployment manifests during an outage or upstream registry unavailability.
    • Desired outcome: Infrastructure Engineers can use a single P2P-aware tool to map each registry host to an ordered list of additional registry hosts from which to pull images. Application Developers do not need to get involved during infrastructure or registry outages.
  4. Load-balancing pulls in large clusters
    • Pain today: Application Developers must fragment workloads into separate deployments with different image sources to avoid throttling a single registry.
    • Desired outcome: Infrastructure Engineers use a single P2P-aware tool to define node-level mirroring logic (e.g., by node pool or nodeSelector), distributing image pulls across different registry hosts. Application teams deploy once; Peerd handles the rest.

Lack of Unified Tooling for Containerd Mirror Configs

Currently, there is no elegant or unified solution to these problems without directly editing containerd configuration on each node. At best, Infrastructure Engineers must manage multiple tools—one for P2P (e.g., Peerd), and another tool for remapping and failover logic—leading to tool conflicts. At worst, Application Developers are pulled into infrastructure-level concerns, eroding the clean separation of responsibilities and slowing development velocity.

What solution do you propose?

Proposal

Unify P2P and Custom Mirroring in a Single Tool

This proposal introduces first-class support for containerd mirror overrides in Peerd, enabling Helm-driven, per-registry-host image pull routing that can be customized at the node or node pool level in a Kubernetes cluster. It brings the two reasons for modifying containerd configuration (peer-to-peer distribution and registry mirror routing) together in a single, extensible, cloud-agnostic solution that can be deployed on any Kubernetes cluster, whether in the cloud or on-premises.

Auth-aware Experience

Crucially, the proposal also extends Peerd’s functionality to support containerd mirror configurations without introducing additional burden on users to manage authentication or secrets. Peerd should be able to "just work" on clusters that already have valid authentication configured—whether that’s via ImagePullSecrets, dockerconfigjson-backed secrets, or node-level identities such as:

  • AKS: kubelet identity with role assignment to Azure Container Registry
  • EKS: node IAM roles granting access to Amazon ECR
  • GKE: node service account permissions for Artifact Registry

Peerd should be able to detect and reuse these existing credentials—without requiring application or infrastructure teams to redefine or duplicate them—ensuring seamless image pull authentication across all mirror endpoints.

Proposed YAML Schema

To enable this functionality, Peerd should support the following values.yaml structure, allowing both P2P and custom mirror routing logic to be applied when deploying Peerd using a standard Helm command like helm install peerd <default-params> -f values.yaml. Each field is defined in the YAML Schema Definition section that follows the YAML block.

peerd:
  mirrorOverrides:
    - host: docker.io
      orderedRoutingLogic: [ "P2P", "originalHost", "mirrorEndpoints" ]
      mirrorEndpoints:
        mirrorSequence: sequential
        mirrorHosts:
          - host: my.azurecr.io
            capabilities: ["pull", "resolve", "push"]
          - host: aws_account_id.dkr.ecr.region.amazonaws.com
            capabilities: ["pull", "resolve", "push"]
    - host: onprem.company.com
      orderedRoutingLogic: ...
      mirrorEndpoints: ...

YAML Schema Definition

All custom mirror overrides must be defined under mirrorOverrides. mirrorOverrides is a list in which each entry overrides the image pull routing for one registry host.

host

  • This is required. It specifies the registry host being overridden with custom image pull routing logic. The value must be just the host (such as docker.io, ghcr.io, or my.azurecr.io). Do not include repository paths like docker.io/library/nginx.

orderedRoutingLogic

  • This is optional. It defines the exact order in which Peerd should apply its routing logic when pulling an image from this host. The list can include any combination of the following values: "P2P", "originalHost", and "mirrorEndpoints". If the field is not specified, the default is to try all three in this order: P2P first, then the original registry host, then the mirror hosts. Entries are attempted in the listed order, and Peerd falls back to the next entry only if the current one fails.
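The ordering and fallback rule described above can be sketched in a few lines of Python. This is only an illustration of the proposed semantics, not Peerd code: the function names, the "p2p" placeholder label, and the try_pull callback are all hypothetical, and the sketch assumes the only supported mirrorSequence ("sequential").

```python
from typing import Callable, Dict, List, Optional

# Default order proposed in this issue when orderedRoutingLogic is omitted.
DEFAULT_ROUTING = ["P2P", "originalHost", "mirrorEndpoints"]
VALID_STEPS = set(DEFAULT_ROUTING)


def candidate_sources(override: Dict) -> List[str]:
    """Expand one mirrorOverrides entry into the ordered sources to try."""
    steps = override.get("orderedRoutingLogic", DEFAULT_ROUTING)
    unknown = [s for s in steps if s not in VALID_STEPS]
    if unknown:
        raise ValueError(f"unknown routing steps: {unknown}")

    endpoints = override.get("mirrorEndpoints", {})
    mirrors = [m["host"] for m in endpoints.get("mirrorHosts", [])]

    sources: List[str] = []
    for step in steps:
        if step == "P2P":
            sources.append("p2p")  # hypothetical label for the peer-to-peer path
        elif step == "originalHost":
            sources.append(override["host"])
        else:  # "mirrorEndpoints": sequential walk in listed order
            sources.extend(mirrors)
    return sources


def pull(override: Dict, try_pull: Callable[[str], bool]) -> Optional[str]:
    """Try each source in order; move on only when the current one fails."""
    for source in candidate_sources(override):
        if try_pull(source):
            return source
    return None
```

For the docker.io entry in the proposed values.yaml, candidate_sources would yield the P2P path, then docker.io, then the two mirror hosts in the order they are listed.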

mirrorEndpoints

  • This is optional. It defines a set of alternate registry mirrors to use if the original registry fails or is bypassed, depending on your routing logic. This section contains two subfields: mirrorSequence and mirrorHosts.

    • mirrorSequence

      • This is required if the parent mirrorEndpoints field is defined. It defines how Peerd will walk through the mirror hosts. For now, only "sequential" is supported, which means Peerd will try the mirror hosts one by one in the order they are listed.
    • mirrorHosts

      • This field sits under mirrorEndpoints and contains the list of mirror registry hosts to try pulling from. Each mirror host entry can include several fields:

      • host

        • This is required. It is the hostname of the mirror registry (like my.company.com, my.azurecr.io, or aws_account_id.dkr.ecr.region.amazonaws.com).

        • imagePullSecrets

          • This is optional. You can specify one or more Kubernetes Secret resources (kind: Secret) to authenticate with the mirror registry. Each item in the list must have a name field.

          • name

            • This is required if the parent imagePullSecrets is defined for this mirror. This is the name of the kind: Secret resource that contains the registry credentials for authenticating with the mirror registry.
        • useNodeIdentity

          • This is optional. If set to true, Peerd will assume that the node can authenticate to the mirror registry using its cloud-native identity. For example, this might be a managed identity (kubelet identity) in AKS, a node IAM role in EKS, or a node service account in GKE.
          • imagePullSecrets and useNodeIdentity are mutually exclusive. If both are set for the same mirror host, Peerd will reject the definition.
        • capabilities

          • This is required. It defines what the mirror is trusted to do. Valid options include:
            • "pull": The mirror can serve content for pull operations.
            • "resolve": The mirror is trusted to convert tags to digests (this should only be enabled for trusted mirrors, since resolving tags can affect image integrity).
            • "push": The mirror can accept pushed content (only relevant if pushing from nodes through Peerd becomes supported).
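The rules in this schema definition could be checked up front, before anything is written to a node. The following is a minimal Python sketch of such a check, assuming the values.yaml content has already been parsed into plain dicts and lists. The function name, the error messages, and the hostname pattern are illustrative assumptions, not part of Peerd.

```python
import re
from typing import Dict, List

VALID_CAPABILITIES = {"pull", "resolve", "push"}
VALID_SEQUENCES = {"sequential"}
# Assumed pattern: a bare hostname with an optional port, no scheme, no path.
HOST_RE = re.compile(r"^[A-Za-z0-9._-]+(:\d+)?$")


def validate_override(entry: Dict) -> List[str]:
    """Return a list of problems found in one mirrorOverrides entry."""
    errors: List[str] = []
    host = entry.get("host", "")
    if not HOST_RE.match(host):
        errors.append(f"host must be a bare registry host, got {host!r}")

    endpoints = entry.get("mirrorEndpoints")
    if endpoints is None:
        return errors  # mirrorEndpoints is optional
    if endpoints.get("mirrorSequence") not in VALID_SEQUENCES:
        errors.append("mirrorSequence is required and must be 'sequential'")

    for mirror in endpoints.get("mirrorHosts", []):
        name = mirror.get("host", "")
        if not HOST_RE.match(name):
            errors.append(f"mirror host must be a bare host, got {name!r}")
        # The two authentication options are mutually exclusive.
        if mirror.get("imagePullSecrets") and mirror.get("useNodeIdentity"):
            errors.append(f"{name}: set imagePullSecrets or useNodeIdentity, not both")
        for secret in mirror.get("imagePullSecrets", []):
            if not secret.get("name"):
                errors.append(f"{name}: each imagePullSecrets item needs a name")
        caps = mirror.get("capabilities")
        if not caps:
            errors.append(f"{name}: capabilities is required")
        elif not set(caps) <= VALID_CAPABILITIES:
            extra = sorted(set(caps) - VALID_CAPABILITIES)
            errors.append(f"{name}: unknown capabilities {extra}")
    return errors
```

Returning every problem at once, rather than stopping at the first, is a design choice made for this sketch so that a Helm user could fix a values file in one pass.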

What alternatives have you considered?

Alternatives Considered

Alternatives using Containerd hosts.toml Overrides

Currently, there is no Kubernetes-native way to apply custom image pull routing logic without directly modifying containerd configuration files on each node. While containerd supports per-host hosts.toml mirror overrides, managing these configurations manually—or via DaemonSets or init containers—is operationally fragile, hard to standardize, and poorly integrated with Helm-based or GitOps workflows.

Some organizations work around this by baking static hosts.toml files into custom node images, which creates cloud vendor lock-in and slows down iteration, as every registry change requires node pool rebuilds. Others patch containerd at runtime using systemd units or bootstrap scripts, but these lack portability, observability, and centralized management.

Moreover, none of these approaches support peer-to-peer (P2P) distribution. They are limited to static mirroring logic and cannot leverage cluster-local optimizations.
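For context, the per-host hosts.toml file that these approaches manage by hand has a small, regular shape: a server line naming the upstream, followed by one [host."..."] table per mirror, which containerd tries in the order listed before falling back to the server. A rough Python sketch of rendering such a file is shown below. The function name and the "url" key are assumptions made for this example, and how Peerd would actually generate or merge these files is not specified in this issue.

```python
from typing import Dict, List


def render_hosts_toml(server: str, mirrors: List[Dict]) -> str:
    """Render the body of a containerd hosts.toml file.

    `server` is the upstream URL used as the final fallback. `mirrors` are
    written in the order given; each is a dict with `url` and `capabilities`
    keys (key names chosen for this sketch).
    """
    lines = [f'server = "{server}"']
    for mirror in mirrors:
        caps = ", ".join(f'"{c}"' for c in mirror["capabilities"])
        lines.append("")  # blank line between tables for readability
        lines.append(f'[host."{mirror["url"]}"]')
        lines.append(f"  capabilities = [{caps}]")
    return "\n".join(lines) + "\n"
```

Calling render_hosts_toml("https://registry-1.docker.io", [{"url": "https://my.azurecr.io", "capabilities": ["pull", "resolve"]}]) produces a file with a server line and a single mirror table. A routing order that places the original host ahead of the mirrors would presumably require listing the original host as a [host] entry of its own, which is one of the details a unified tool would need to handle.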

Alternatives using Peerd Today

While Peerd already integrates with containerd’s mirroring capabilities, it does not currently support custom mirror routing logic per registry host. Organizations needing both P2P and custom mirroring are forced to layer Peerd on top of additional tooling (such as DaemonSets or node image modifications) that configures containerd manually. This creates a risk of misconfiguration or conflicts between Peerd and other low-level tooling that touches containerd. It also makes cluster-wide image routing harder to reason about and automate, because multiple tools and workflows can conflict with one another.

Any additional context?

No response

Are you willing to submit PRs to contribute to this feature?

  • Yes, I am willing to implement it.

Metadata

Labels

enhancement (New feature or request), help wanted (Extra attention is needed)
