2 changes: 2 additions & 0 deletions .dockerignore
@@ -0,0 +1,2 @@
# Ignore all files
*
Comment on lines +1 to +2
Copilot AI Feb 17, 2026

This .dockerignore ignores the entire build context (*). That pattern prevents using COPY/ADD for the repo (and contributes to the Dockerfile needing to git clone), and it can be surprising to users who expect docker build to package the current working tree. Consider switching to a selective ignore list (e.g., outputs, caches, datasets) instead of ignoring everything.

Suggested change
-# Ignore all files
-*
+# Ignore unnecessary files and directories in the Docker build context
+# Version control
+.git
+.gitignore
+.gitattributes
+# Node / JavaScript dependencies and builds
+node_modules
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+pnpm-lock.yaml
+package-lock.json
+# Python caches and environments
+__pycache__/
+*.py[cod]
+*.pyo
+.pytest_cache/
+.mypy_cache/
+.venv/
+venv/
+# Build / distribution directories (common conventions)
+dist/
+build/
+out/
+target/
+.next/
+.nuxt/
+# Logs and temporary files
+*.log
+*.tmp
+*.swp
+*.swo
+# IDE / editor settings
+.vscode/
+.idea/
+*.iml
+# OS-specific junk files
+.DS_Store
+Thumbs.db
+# Docker-related files that shouldn't be copied into images
+docker-compose*.yml
+Dockerfile*
+# Add project-specific large or local-only directories below as needed
+# e.g., data/, datasets/, .cache/, etc.

Copilot uses AI. Check for mistakes.
49 changes: 49 additions & 0 deletions Dockerfile
@@ -0,0 +1,49 @@
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
ENV DEBIAN_FRONTEND=noninteractive

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
wget \
git \
bzip2 \
libglib2.0-0 \
libxext6 \
libsm6 \
libxrender1 \
libgl1 \
python3 \
python3-pip \
Comment on lines +14 to +15
Copilot AI Feb 17, 2026

python3/python3-pip are installed via apt, but the image then installs and uses conda’s Python. If there isn’t a runtime need for the system Python, removing these packages would reduce image size and attack surface.

Suggested change
-python3 \
-python3-pip \

build-essential \
cmake \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*

# Install Miniforge
ENV CONDA_DIR=/opt/conda
RUN wget --quiet https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh -O ~/miniforge.sh && \
/bin/bash ~/miniforge.sh -b -p $CONDA_DIR && \
rm ~/miniforge.sh
Comment on lines +23 to +25
Copilot AI Feb 17, 2026

The wget command downloads and executes the Miniforge installer script from a mutable latest URL without any integrity or authenticity verification. If an attacker compromises the conda-forge/miniforge release channel or the network path, they can inject arbitrary code that will be executed at build time with full privileges. Pin the download to a specific, expected release artifact and verify its checksum or signature before execution to prevent supply-chain compromise.
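
The verify-before-execute step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `sha256_of` and `verify_installer` are hypothetical, and the expected digest would have to come from the published checksum of whichever pinned Miniforge release is chosen (no real digest is shown here).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_installer(path: Path, expected_sha256: str) -> None:
    """Raise ValueError unless the file matches the pinned digest.

    `expected_sha256` is an assumption: it must be copied from the
    release's published checksum, never computed from the download itself.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

The installer would be executed only after `verify_installer` returns without raising. In a Dockerfile the same check is more commonly written with `sha256sum -c` inside the `RUN` step that downloads the file.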

# Add conda to PATH
ENV PATH=${CONDA_DIR}/bin:${PATH}

# Create MAtCha directory
WORKDIR /workspace/MAtCha
RUN chmod -R 777 /workspace/MAtCha
RUN git clone https://github.com/Anttwo/MAtCha.git .
Comment on lines +32 to +33
Copilot AI Feb 17, 2026

The image is built from a fresh git clone of https://github.com/Anttwo/MAtCha.git, while the build context is effectively unused. This means docker build from this repo/branch will not include the code being reviewed (and will always build whatever is currently on the remote default branch), which breaks reproducibility and makes PR changes ineffective. Prefer copying the local checkout into the image (e.g., COPY . .) and remove the git clone step (optionally allow an ARG for building a specific ref).

Suggested change
-RUN chmod -R 777 /workspace/MAtCha
-RUN git clone https://github.com/Anttwo/MAtCha.git .
+COPY . .
+RUN chmod -R 777 /workspace/MAtCha

Comment on lines +32 to +33
Copilot AI Feb 17, 2026

chmod -R 777 grants world-writable permissions inside the image and is generally unsafe/unnecessary (especially as the container runs as root by default). Prefer creating a dedicated user and using chown/more restrictive permissions for /workspace/MAtCha.

Suggested change
-RUN chmod -R 777 /workspace/MAtCha
-RUN git clone https://github.com/Anttwo/MAtCha.git .
+RUN git clone https://github.com/Anttwo/MAtCha.git .
+RUN useradd -m -s /bin/bash matcha && chown -R matcha:matcha /workspace/MAtCha && chmod -R 755 /workspace/MAtCha
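
To make the difference between the two modes concrete, here is a small sketch that decodes the permission bits. The helper name `is_world_writable` is hypothetical; the `stat` calls are from the Python standard library.

```python
import stat


def is_world_writable(mode: int) -> bool:
    """True when the 'other' write bit is set, i.e. any user may write."""
    return bool(mode & stat.S_IWOTH)


# Compare the mode from the original Dockerfile with the suggested one.
for mode in (0o777, 0o755):
    listing = stat.filemode(stat.S_IFDIR | mode)  # ls-style string for a directory
    print(oct(mode), listing, "world-writable:", is_world_writable(mode))
```

With 755 only the owner keeps write access, which is why the suggestion pairs it with `chown` to a dedicated user.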

# Set environment variables RTX A6000, L40, A100
ENV CUDA_HOME=/usr/local/cuda
ENV PATH=${CUDA_HOME}/bin:${PATH}
ENV LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
ENV CPATH=/usr/local/cuda/include
ENV TORCH_CUDA_ARCH_LIST="8.0 8.6 8.9+PTX"

# Activate environment and install dependencies
RUN python install.py --env_name matcha
RUN python download_checkpoints.py
Comment on lines +42 to +44
Copilot AI Feb 17, 2026

install.py currently uses os.system(...) without checking return codes, so the script can exit 0 even if environment creation/build steps fail (which would allow docker build to succeed with a broken image). Consider updating install.py to fail fast on non-zero exit codes (e.g., subprocess.run(..., check=True)) or replace this with explicit shell commands in the Dockerfile that will fail the build on error.

Suggested change
-# Activate environment and install dependencies
-RUN python install.py --env_name matcha
-RUN python download_checkpoints.py
+# Activate environment, install dependencies, and download checkpoints (fail fast on error)
+RUN set -euo pipefail && \
+python install.py --env_name matcha && \
+python download_checkpoints.py
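
The fail-fast alternative mentioned for `install.py` could look roughly like the sketch below. It does not reproduce the real script: `run_step` and `main` are hypothetical names, and the actual list of install commands is not shown in this diff, so the steps are passed in by the caller.

```python
import subprocess
import sys


def run_step(command: list[str]) -> None:
    """Run one install step; raises CalledProcessError on a non-zero exit.

    os.system(...) only returns the exit status, which is easy to ignore;
    check=True turns a failure into an exception instead.
    """
    print("+", " ".join(command), flush=True)
    subprocess.run(command, check=True)


def main(steps: list[list[str]]) -> int:
    """Run steps in order and stop at the first failure.

    Returns the failing step's exit code so that a wrapping `docker build`
    RUN instruction fails too, or 0 when every step succeeded.
    """
    try:
        for step in steps:
            run_step(step)
    except subprocess.CalledProcessError as error:
        print(f"step failed with exit code {error.returncode}", file=sys.stderr)
        return error.returncode
    return 0
```

A script built this way would end with `sys.exit(main(steps))`, so the process exit code reflects the first failing step.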

Comment on lines +44 to +45
Copilot AI Feb 17, 2026

Downloading pretrained checkpoints during docker build makes the image much larger and forces network access at build time. Consider gating this behind a build ARG (default off) or moving checkpoint download to runtime with a volume-mounted cache, so users can choose between a lean image and a fully bundled one.

Suggested change
-RUN python download_checkpoints.py
+# Optionally download pretrained checkpoints at build time.
+# To skip this step and keep the image lean, build with:
+# docker build --build-arg DOWNLOAD_CHECKPOINTS=false ...
+ARG DOWNLOAD_CHECKPOINTS=true
+RUN if [ "$DOWNLOAD_CHECKPOINTS" = "true" ]; then python download_checkpoints.py; else echo "Skipping checkpoint download at build time."; fi
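
For the runtime option (download into a volume-mounted cache on first use), a minimal sketch might look like this. Everything here is an assumption for illustration: `ensure_checkpoints`, the marker file name, and the `download` callback are hypothetical and not part of `download_checkpoints.py`.

```python
from pathlib import Path
from typing import Callable

MARKER_NAME = ".download_complete"  # hypothetical marker written after a full download


def ensure_checkpoints(cache_dir: Path, download: Callable[[Path], None]) -> bool:
    """Download checkpoints into cache_dir unless an earlier run finished there.

    Returns True when a download was performed and False when the cache was
    reused. The marker is written only after `download` returns, so an
    interrupted download is retried on the next start.
    """
    marker = cache_dir / MARKER_NAME
    if marker.exists():
        return False
    cache_dir.mkdir(parents=True, exist_ok=True)
    download(cache_dir)
    marker.touch()
    return True
```

If `cache_dir` points at a mounted volume, the download happens once per volume instead of once per image build.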

ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "matcha", "/bin/bash", "-c"]
Copilot AI Feb 17, 2026

The ENTRYPOINT uses bash -c, which only works correctly when the user passes a single shell string. With common Docker usage like docker run image python train.py ..., bash takes only the first argument as the command string (-c "python") and assigns the remaining arguments to the positional parameters $0, $1, ... instead of passing them to python, so the command won't run as intended. Prefer an entrypoint that executes "$@" (or set ENTRYPOINT to conda run ... -n matcha without bash -c) so arguments are preserved.

Suggested change
-ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "matcha", "/bin/bash", "-c"]
+ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "matcha"]
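
The `bash -c` behaviour can be reproduced without Docker or conda. The sketch below assumes `/bin/bash` is available and uses a hypothetical helper, `run_bash_c`; `echo` stands in for the user's command.

```python
import subprocess


def run_bash_c(*args: str) -> str:
    """Run `/bin/bash -c <args...>` and return what it wrote to stdout."""
    result = subprocess.run(
        ["/bin/bash", "-c", *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only "echo" is the command string; "hello" and "world" become $0 and $1,
# so echo runs with no arguments and prints an empty line.
dropped = run_bash_c("echo", "hello", "world")

# Forwarding "$@" explicitly keeps the arguments; the extra "bash" fills $0.
forwarded = run_bash_c('exec "$@"', "bash", "echo", "hello", "world")

print(repr(dropped), repr(forwarded))
```

Dropping `bash -c` from the ENTRYPOINT, as suggested, avoids the problem entirely because `conda run` receives the arguments directly.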

# Default command
CMD ["/bin/bash"]