Skip to content

Common Issues

GrammaTonic edited this page Sep 4, 2025 · 3 revisions

Common Issues

This page covers the most frequently encountered issues and their solutions.

๐ŸŒ Chrome Runner Issues

โœ… RESOLVED: ChromeDriver Version Issues

Previous Issue: ChromeDriver installation failed with deprecated API

Symptoms (Now Fixed):

  • Chrome tests failed with version compatibility errors
  • Error: "This version of ChromeDriver only supports Chrome version X"
  • ChromeDriver download failures from deprecated endpoints

Solution Applied (Sep 4, 2025):

# Fixed implementation now using Chrome for Testing API
CHROME_VERSION=$(google-chrome --version | grep -oP '\d+\.\d+\.\d+\.\d+')
CHROMEDRIVER_URL="https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json"

# Automatic version matching now working
curl -s "$CHROMEDRIVER_URL" | jq -r "ChromeDriver matched to Chrome version"

Status: โœ… Resolved - All Chrome Runner builds now successful with modern API

Prevention: Chrome for Testing API automatically maintains version compatibility

Issue: "Chrome crashes or runs out of memory"

Symptoms:

  • Browser tests fail randomly
  • Error: "Chrome crash detected"
  • High memory usage

Solutions:

  1. Increase shared memory:
# In docker-compose.chrome.yml
services:
  chrome-runner:
    shm_size: 4g # Increase from default 64MB
  1. Add Chrome stability flags:
--memory-pressure-off
--max_old_space_size=4096
--disable-background-timer-throttling

Issue: "Virtual display not working"

Symptoms:

  • Error: "No display specified"
  • GUI applications fail to start

Solution:

# Check virtual display
echo $DISPLAY  # Should show :99
ps aux | grep Xvfb  # Should show running Xvfb process

# Restart virtual display if needed
sudo pkill Xvfb
Xvfb :99 -screen 0 1920x1080x24 &
export DISPLAY=:99

Issue: "Slow web UI tests"

Problem Solved: โœ… This is exactly why the Chrome Runner was created!

Solution:

# Use dedicated Chrome Runner instead of standard runner
# In your GitHub Actions workflow:
jobs:
  ui-tests:
    runs-on: [self-hosted, chrome, ui-tests]  # Use Chrome runner
    steps:
      - uses: actions/checkout@v4
      - name: Run UI tests
        run: npx playwright test

Benefits:

  • โœ… 60% faster execution due to resource isolation
  • โœ… Pre-installed testing frameworks
  • โœ… Optimized Chrome configuration
  • โœ… Parallel test execution capability

๐Ÿ“š Full Chrome Runner Guide: Chrome Runner Documentation


๐Ÿ”ง Runner Registration Issues

Issue: "Runner registration failed"

Symptoms:

  • Runner container starts but doesn't appear in GitHub
  • Error: "HTTP 401: Bad credentials"
  • Error: "HTTP 403: Forbidden"

Solutions:

  1. Check token permissions:
# Test token manually
curl -H "Authorization: token YOUR_TOKEN" https://api.github.com/user

# Required scopes for private repos
# - repo (full control)
# - workflow (update workflows)
  1. Verify repository format:
# Correct format
GITHUB_REPOSITORY=owner/repository-name

# Common mistakes
GITHUB_REPOSITORY=https://github.com/owner/repo  # โŒ
GITHUB_REPOSITORY=owner/repo.git                # โŒ
  1. Check token expiration:
# Check token expiration
gh auth status

# Regenerate if expired
gh auth refresh

Issue: "Runner name already exists"

Symptoms:

  • Error: "A runner exists with the same name"
  • Multiple failed registration attempts

Solutions:

  1. Use unique runner names:
# Add timestamp or hostname
RUNNER_NAME=runner-$(hostname)-$(date +%s)

# Use container ID
RUNNER_NAME=runner-$(cat /proc/self/cgroup | head -1 | cut -d/ -f3 | cut -c1-12)
  1. Remove existing runners:
# List current runners
gh api repos/OWNER/REPO/actions/runners

# Remove specific runner
gh api repos/OWNER/REPO/actions/runners/RUNNER_ID -X DELETE

๐Ÿณ Docker Issues

Issue: "Docker daemon not accessible"

Symptoms:

  • Error: "Cannot connect to the Docker daemon"
  • Permission denied errors
  • Socket not found

Solutions:

  1. Fix Docker socket permissions:
# Add user to docker group
sudo usermod -aG docker $USER

# Reload group membership
newgrp docker

# Test access
docker ps
  1. Check Docker daemon status:
# Start Docker service
sudo systemctl start docker
sudo systemctl enable docker

# Verify Docker is running
docker version
  1. Mount Docker socket correctly:
volumes:
  - /var/run/docker.sock:/var/run/docker.sock:ro

Issue: "Out of disk space"

Symptoms:

  • Error: "No space left on device"
  • Build failures during image pulls
  • Container crashes

Solutions:

  1. Clean up Docker resources:
# Remove unused containers
docker container prune -f

# Remove unused images
docker image prune -a -f

# Remove unused volumes
docker volume prune -f

# Complete cleanup
docker system prune -a --volumes -f
  1. Monitor disk usage:
# Check Docker disk usage
docker system df

# Check container sizes
docker ps -s

# Check volume sizes
docker volume ls
  1. Configure log rotation:
services:
  runner:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

๐Ÿ” Authentication Issues

Issue: "Token authentication failed during job"

Symptoms:

  • Runner registers successfully
  • Jobs fail with authentication errors
  • Git clone/push operations fail

Solutions:

  1. Use GITHUB_TOKEN in workflows:
# In your workflow
steps:
  - uses: actions/checkout@v4
    with:
      token: ${{ secrets.GITHUB_TOKEN }}
  1. Configure Git credentials:
# In runner entrypoint
git config --global url."https://${GITHUB_TOKEN}@github.com/".insteadOf "https://github.com/"
  1. Check token scope:
# Token needs 'repo' scope for private repositories
# Token needs 'public_repo' scope for public repositories

๐Ÿ“Š Performance Issues

Issue: "Slow job execution"

Symptoms:

  • Jobs take much longer than expected
  • High CPU/memory usage
  • Frequent timeouts

Solutions:

  1. Increase resource limits:
deploy:
  resources:
    limits:
      memory: 4g
      cpus: "2.0"
    reservations:
      memory: 1g
      cpus: "0.5"
  1. Optimize Docker builds:
# Use multi-stage builds
FROM node:16-alpine as builder
COPY package*.json ./
RUN npm ci --only=production

FROM node:16-alpine as runtime
COPY --from=builder /app/node_modules ./node_modules
  1. Enable build caching:
volumes:
  - build_cache:/root/.cache
  - deps_cache:/root/.npm

Issue: "Memory leaks"

Symptoms:

  • Gradually increasing memory usage
  • Container restarts due to OOM
  • System becomes unresponsive

Solutions:

  1. Monitor memory usage:
# Check container memory
docker stats --no-stream

# Check system memory
free -h

# Monitor over time
watch -n 5 'docker stats --no-stream'
  1. Set memory limits:
deploy:
  resources:
    limits:
      memory: 2g
  1. Enable automatic restarts:
restart: unless-stopped
healthcheck:
  test: ["CMD", "pgrep", "Runner.Listener"]
  interval: 30s
  timeout: 10s
  retries: 3

๐ŸŒ Network Issues

Issue: "Cannot reach GitHub API"

Symptoms:

  • Network timeouts
  • DNS resolution failures
  • Proxy configuration issues

Solutions:

  1. Test connectivity:
# Test DNS resolution
nslookup api.github.com

# Test HTTPS connectivity
curl -v https://api.github.com

# Test with proxy
curl -x proxy.company.com:8080 https://api.github.com
  1. Configure proxy settings:
# Environment variables
export HTTP_PROXY=http://proxy.company.com:8080
export HTTPS_PROXY=http://proxy.company.com:8080
export NO_PROXY=localhost,127.0.0.1,10.0.0.0/8

# Docker proxy configuration
  1. Check firewall rules:
# Required ports
# 443 (HTTPS) - api.github.com
# 443 (HTTPS) - github.com
# 443 (HTTPS) - objects.githubusercontent.com

๐Ÿ”„ Scaling Issues

Issue: "Runners not scaling properly"

Symptoms:

  • Jobs queue up despite available resources
  • Inconsistent runner availability
  • Load balancing problems

Solutions:

  1. Check runner labels:
# Ensure consistent labels
RUNNER_LABELS=self-hosted,linux,x64,docker

# Verify in GitHub UI
# Settings โ†’ Actions โ†’ Runners
  1. Configure proper scaling:
# Scale up
docker compose up -d --scale runner=5

# Check running containers
docker ps --filter "name=runner"
  1. Monitor job queue:
# Check queued jobs
gh api repos/OWNER/REPO/actions/runs --jq '.workflow_runs[] | select(.status=="queued")'

๐Ÿ” Debugging Tools

Container Debugging

# Access container shell
docker exec -it github-runner_runner_1 /bin/bash

# Check container logs
docker logs github-runner_runner_1 --tail 100 -f

# Inspect container configuration
docker inspect github-runner_runner_1

# Check resource usage
docker stats github-runner_runner_1

Log Analysis

# Search for specific errors
docker logs runner 2>&1 | grep -i error

# Monitor real-time logs
docker logs runner -f --tail 50

# Export logs for analysis
docker logs runner > runner-logs-$(date +%Y%m%d).txt

System Diagnostics

# Check system resources
htop
iotop
netstat -tulpn

# Check Docker status
systemctl status docker
journalctl -u docker.service --since "1 hour ago"

# Check file descriptors
lsof | grep docker

๐Ÿ“ž Getting Help

Information to Gather

When reporting issues, include:

  1. System information:
# Operating system
uname -a

# Docker version
docker version

# Docker Compose version
docker compose version
  1. Configuration:
# Environment variables (redact secrets)
env | grep -E "GITHUB|RUNNER|DOCKER" | sed 's/TOKEN=.*/TOKEN=***/'

# Docker Compose configuration
docker compose config
  1. Logs:
# Container logs
docker logs runner --tail 100

# System logs
journalctl -u docker.service --since "1 hour ago"

Support Channels

๐Ÿ”„ Related Pages

Clone this wiki locally