
Optimize client Docker image: use runtime base and clean up #263

Open

7174Andy wants to merge 5 commits into main from andrew/optimize-client-image

Conversation

7174Andy (Collaborator) commented on Feb 6, 2026

Summary

Reduce client Docker image download size by 2.78 GB (51%) by switching from the CUDA devel base image to runtime, removing duplicate apt packages, and removing a redundant uv sync step.

Changes

Changed

  • Switch base image from nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04 to nvidia/cuda:12.8.1-cudnn-runtime-ubuntu22.04 in both Dockerfile and Dockerfile.dev — the devel variant includes the full CUDA compiler toolchain (~3 GB), which is not needed for running pre-compiled GPU workloads
  • Remove the redundant uv sync after uv add in the production Dockerfile; uv add already performs a sync (both changes are sketched after this list)
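
A minimal sketch of both changes, assuming a uv-based production Dockerfile; the project layout and the placeholder dependency are illustrative, not the actual file:

```dockerfile
# Runtime variant: shared CUDA/cuDNN libraries only, no nvcc or headers.
# Previously: FROM nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04
FROM nvidia/cuda:12.8.1-cudnn-runtime-ubuntu22.04

# Install uv (the copy-from-image pattern documented by uv)
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

WORKDIR /app
COPY pyproject.toml ./

# `uv add` resolves and installs the dependency in one step, so the
# `RUN uv sync` that previously followed here was redundant and is removed.
RUN uv add requests  # placeholder dependency, not the client's real deps
```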

Fixed

  • Remove duplicate apt packages that appeared twice in the install list: libfontconfig1, libdbus-1-3, libxkbcommon-x11-0 (already installed later in the same RUN command via the X11 library block)
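
The shape of the fix, shown only for the three affected packages (the real RUN command installs a longer list; this is a sketch, not the actual file):

```dockerfile
# Each library now appears exactly once in the install list.
# apt-get tolerated the duplicates, but the deduplicated list is
# easier to review.
RUN apt-get update && apt-get install -y --no-install-recommends \
        libdbus-1-3 \
        libfontconfig1 \
        libxkbcommon-x11-0 \
    && rm -rf /var/lib/apt/lists/*
```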

Measured Impact

Base Image Size (compressed, linux/amd64)

| Image | Download Size |
| --- | --- |
| cudnn-devel (before) | 5.45 GB |
| cudnn-runtime (after) | 2.66 GB |
| Reduction | -2.78 GB (-51%) |

The savings come from removing the CUDA dev toolchain layer (2.99 GB): nvcc, cuda-command-line-tools, cuda-libraries-dev, static libraries, and profiling tools.
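
One way to sanity-check these numbers locally; note that docker images reports uncompressed on-disk size, which is larger than the compressed download sizes quoted in the table above:

```bash
# Pull both variants and compare on-disk sizes (uncompressed, so the
# absolute numbers exceed the compressed sizes in the table).
docker pull nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04
docker pull nvidia/cuda:12.8.1-cudnn-runtime-ubuntu22.04
docker images nvidia/cuda --format '{{.Tag}}\t{{.Size}}'
```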

CI Pull Time (GitHub Actions)

| Metric | Before (main) | After (this PR) | Delta |
| --- | --- | --- | --- |
| Pull client image | 1m 50s | 1m 14s | -36s (33% faster) |
| Total verify job | 2m 12s | 1m 34s | -38s (29% faster) |

The allocator image (an unchanged control) showed no timing difference (10-13s in both runs), confirming the gains come from the client image change rather than runner variance.

Technical Details

The devel CUDA image variant includes nvcc, CUDA headers, and static libraries needed for compiling CUDA code. The runtime variant includes only the shared libraries needed to run CUDA applications. Since the client container runs pre-built Python packages (PyTorch wheels, etc.) rather than compiling CUDA kernels, runtime is sufficient.
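
A quick way to see the difference between the two variants (no GPU required, just the images):

```bash
# The devel image ships the compiler; the runtime image does not.
docker run --rm nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04 nvcc --version
# -> prints the CUDA 12.8 compiler banner

docker run --rm nvidia/cuda:12.8.1-cudnn-runtime-ubuntu22.04 nvcc --version
# -> expected to fail: "nvcc": executable file not found in $PATH
```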

The duplicate apt packages were harmless (apt handles duplicates gracefully) but added confusion when reviewing the package list.

Testing

  • CI Docker image workflow passes
  • Client console scripts work (subscribe, check_gpu, update_inuse_status)
  • Docker prod image builds successfully
  • CUDA runtime is available inside container (nvidia-smi, PyTorch GPU detection)
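
For reference, a hedged sketch of the GPU smoke test; it assumes the NVIDIA Container Toolkit on the host and a locally built tag, here called client:latest, which is illustrative:

```bash
# CUDA runtime is visible to the driver inside the container
docker run --rm --gpus all client:latest nvidia-smi

# PyTorch detects the GPU through the runtime libraries alone
# (no devel toolchain needed for pre-built wheels)
docker run --rm --gpus all client:latest \
    python -c "import torch; print(torch.cuda.is_available())"
```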

🤖 Generated with Claude Code
