Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
c7e62fd
Add coverage audit plan for increasing test coverage
trissim Nov 3, 2025
8f6086b
Phase 1: Increase test coverage for quick-win modules
trissim Nov 3, 2025
1f8eee1
Phase 2: Complete stack_utils and oom_recovery test coverage
trissim Nov 3, 2025
44418c8
Phase 3: Complete test coverage for decorators, dtype_scaling, framew…
trissim Nov 3, 2025
fc90cbc
Fix test failures in Phase 3
trissim Nov 3, 2025
9770a71
Phase 4: Boost test coverage to 88% - Add comprehensive tests for fra…
trissim Nov 3, 2025
1c807b8
feat: add clamping to dtype scaling and enhance GPU tests
trissim Nov 3, 2025
6e029c2
feat: add gpu optional dependencies and update CI workflow
trissim Nov 3, 2025
9f217b2
feat: update CI workflow to use Kaggle action for GPU testing
trissim Nov 3, 2025
c5d14c3
Fix: Resolve CI CuPy import and artifact deprecation issues
trissim Nov 3, 2025
0050b1b
Setup GPU testing with NVIDIA CUDA Docker container for codecov
trissim Nov 3, 2025
a7e1f46
Add GPU testing setup documentation
trissim Nov 3, 2025
75a52bd
Fix GPU runner: remove Docker container, remove torch from main tests
trissim Nov 3, 2025
b99f87d
Add Python 3.13 and improve GPU test documentation
trissim Nov 3, 2025
6dc040d
Fix ruff linting issues
trissim Nov 3, 2025
5003c5c
Fix TensorFlow version checking to not depend on pkg_resources
trissim Nov 3, 2025
1d76067
Fix W293: Remove trailing whitespace on blank line in utils.py
trissim Nov 3, 2025
4bf6cac
Format code with black and add comprehensive test coverage
trissim Nov 3, 2025
c40019e
Add noqa comments for eval-used variables and framework imports
trissim Nov 3, 2025
3118b87
Fix final ruff linting issues
trissim Nov 3, 2025
55d2103
Fix black formatting in stack_utils.py
trissim Nov 3, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
121 changes: 63 additions & 58 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -8,21 +8,14 @@ on:
workflow_dispatch:

jobs:
# Test across Python versions with CPU-compatible frameworks
# Test across Python versions with CPU-compatible frameworks (no torch)
test:
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
python-version: ["3.10", "3.11", "3.12"]
python-version: ["3.10", "3.11", "3.12", "3.13"]
os: [ubuntu-latest, windows-latest, macos-latest]
framework: [none, torch]
exclude:
# Reduce matrix size - test torch mainly on ubuntu
- os: windows-latest
framework: torch
- os: macos-latest
framework: torch

steps:
- name: Checkout
Expand All @@ -38,28 +31,24 @@ jobs:
python -m pip install --upgrade pip
pip install -e ".[dev]"

- name: Install PyTorch (CPU-only)
if: matrix.framework == 'torch'
run: |
pip install torch --index-url https://download.pytorch.org/whl/cpu

- name: Run tests with coverage
run: |
pytest --cov=arraybridge --cov-report=xml --cov-report=html --cov-report=term-missing -v

- name: Upload coverage to Codecov
if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.12' && matrix.framework == 'torch'
if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.12'
uses: codecov/codecov-action@v3
with:
file: ./coverage.xml
fail_ci_if_error: false

# GPU tests with GitHub Actions GPU runners (optional, non-blocking)
# Note: GPU runners may have long queue times, so this job is allowed to fail
# GPU tests - includes framework testing
# Note: GitHub Actions ubuntu-latest doesn't have physical GPU,
# but tests will run the "unavailable GPU" code paths and mock GPU tests
gpu-test:
runs-on: ubuntu-latest-gpu-t4
continue-on-error: true # Don't block PR merges if GPU tests fail or timeout
runs-on: ubuntu-latest
continue-on-error: true # Don't block PR merges if GPU not available

steps:
- name: Checkout
uses: actions/checkout@v4
Expand All @@ -69,57 +58,73 @@ jobs:
with:
python-version: "3.12"

- name: Check CUDA availability
run: |
nvidia-smi
nvcc --version || echo "NVCC not available"

- name: Install base dependencies
run: |
python -m pip install --upgrade pip
pip install -e ".[dev]"

- name: Install GPU frameworks
- name: Install GPU frameworks (will use CPU versions in CI)
run: |
# Install PyTorch with CUDA support
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
# PyTorch - CPU version will be installed in CI (no GPU available)
pip install torch torchvision torchaudio 2>&1 || echo "PyTorch install attempted"

# Install CuPy with CUDA 12.x support
pip install cupy-cuda12x
# JAX - CPU version
pip install jax jaxlib 2>&1 || echo "JAX install skipped (optional)"

# Verify installations
python -c "import torch; print(f'PyTorch version: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}'); print(f'CUDA device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"N/A\"}')"
python -c "import cupy as cp; print(f'CuPy version: {cp.__version__}'); print(f'CUDA device: {cp.cuda.Device()}')"
# CuPy - will fail without CUDA, that's ok
pip install cupy-cuda12x 2>&1 || echo "CuPy skipped (requires actual CUDA)"

- name: Run GPU tests
run: |
# Run all tests - GPU frameworks will be used when available
pytest -v --tb=short

- name: Test GPU memory conversions
- name: Check framework availability
run: |
# Quick sanity check for GPU conversions
python -c "
import numpy as np
import torch
import cupy as cp
from arraybridge import convert_memory, detect_memory_type

# Test NumPy -> CuPy
np_arr = np.array([1, 2, 3], dtype=np.float32)
cp_arr = convert_memory(np_arr, 'numpy', 'cupy', gpu_id=0)
print(f'NumPy -> CuPy: {type(cp_arr)}, device: {cp_arr.device}')
print('=== Framework Availability Check ===')
try:
import torch
print(f'✓ PyTorch available')
print(f' CUDA available: {torch.cuda.is_available()}')
print(f' (This is normal - GitHub Actions has no physical GPU)')
except ImportError:
print('✗ PyTorch not available')

# Test NumPy -> PyTorch GPU
torch_arr = convert_memory(np_arr, 'numpy', 'torch', gpu_id=0)
print(f'NumPy -> PyTorch: {type(torch_arr)}, device: {torch_arr.device}')
try:
import jax
print(f'✓ JAX available')
except ImportError:
print('✗ JAX not available')

# Test CuPy -> PyTorch
torch_from_cp = convert_memory(cp_arr, 'cupy', 'torch', gpu_id=0)
print(f'CuPy -> PyTorch: {type(torch_from_cp)}, device: {torch_from_cp.device}')

print('✓ All GPU conversions successful!')
"
try:
import cupy
print(f'✓ CuPy available')
except ImportError:
print('✗ CuPy not available (normal - requires CUDA)')
" || true

- name: Run GPU cleanup tests
run: |
# Tests include:
# 1. Framework unavailable tests (always run)
# 2. GPU unavailable fallback paths (will run in CI)
# 3. Mocked GPU tests (test cleanup code with mocked GPU state)
pytest -v tests/test_gpu_cleanup.py \
--cov=arraybridge \
--cov-report=term-missing \
--cov-report=html \
--cov-report=xml \
-ra

- name: Upload GPU test coverage to Codecov
uses: codecov/codecov-action@v3
with:
file: ./coverage.xml
flags: gpu-tests
fail_ci_if_error: false

- name: Upload HTML coverage report
if: always()
uses: actions/upload-artifact@v4
with:
name: gpu-test-coverage-report
path: htmlcov/

# Code quality checks
code-quality:
Expand Down
80 changes: 45 additions & 35 deletions .github/workflows/gpu-tests.yml
Original file line number Diff line number Diff line change
@@ -1,21 +1,11 @@
name: GPU Tests (Optional)
name: GPU Tests (Manual - Comprehensive GPU Testing)

on:
workflow_dispatch: # Manual trigger only
schedule:
# Run weekly on Sunday at 2am UTC (optional, can be removed)
- cron: '0 2 * * 0'

jobs:
gpu-test:
runs-on: [self-hosted, gpu] # Requires GPU runner
# Alternative: use GitHub's beta GPU runners when available
# runs-on: ubuntu-latest-gpu

strategy:
fail-fast: false
matrix:
framework: [cupy, torch-gpu]
runs-on: ubuntu-latest

steps:
- name: Checkout
Expand All @@ -24,45 +14,65 @@ jobs:
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: "3.11"

- name: Check CUDA availability
run: |
nvidia-smi || echo "No NVIDIA GPU detected"
nvcc --version || echo "No CUDA compiler detected"
python-version: "3.12"

- name: Install base dependencies
run: |
python -m pip install --upgrade pip
pip install -e ".[dev]"

- name: Install CuPy
if: matrix.framework == 'cupy'
- name: Install CPU-available frameworks
run: |
pip install cupy-cuda12x # Adjust CUDA version as needed
# Install CPU versions of frameworks for testing
# (Real GPU tests would need actual CUDA infrastructure)
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install jax jaxlib
echo "Installed PyTorch (CPU) and JAX for testing"

- name: Install PyTorch (GPU)
if: matrix.framework == 'torch-gpu'
- name: Check framework availability
run: |
pip install torch --index-url https://download.pytorch.org/whl/cu121
python -c "
import sys
frameworks = ['numpy', 'torch', 'jax', 'cupy', 'tensorflow', 'pyclesperanto']
for fw in frameworks:
try:
__import__(fw)
print(f'✓ {fw} available')
except ImportError:
print(f'✗ {fw} not available (will be skipped)')
"

- name: Run GPU-specific tests
- name: Run comprehensive GPU cleanup tests
run: |
# Run only tests marked with @pytest.mark.gpu
pytest -v -m "gpu" --cov=arraybridge --cov-report=term-missing
continue-on-error: true # Don't fail the workflow if GPU tests fail
# Run all GPU cleanup tests
# Tests will use frameworks if available, skip gracefully if not
pytest -v tests/test_gpu_cleanup.py \
--cov=arraybridge \
--cov-report=term-missing \
--cov-report=html \
--tb=short \
-ra

- name: Run framework-specific tests
- name: Run framework-specific GPU tests
run: |
# Run tests marked for specific frameworks
pytest -v tests/ -k "gpu or cupy or torch or tensorflow or jax or pyclesperanto" \
--cov=arraybridge \
--cov-report=term-missing \
-ra || true

- name: Test results summary
if: always()
run: |
# Run tests for the specific framework
pytest -v -m "${{ matrix.framework }}" --cov=arraybridge --cov-report=term-missing
continue-on-error: true
echo "GPU Testing Complete!"
echo "Note: Full GPU testing requires NVIDIA CUDA infrastructure."
echo "For complete GPU testing, use a system with NVIDIA GPUs installed."

- name: Upload test results
- name: Upload coverage report
if: always()
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: gpu-test-results-${{ matrix.framework }}
name: gpu-test-coverage-report
path: |
htmlcov/
.coverage
Expand Down
47 changes: 47 additions & 0 deletions CI_ARTIFACT_UPDATE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# GitHub Actions upload-artifact v3 → v4 Migration

## Issue
GitHub deprecated `actions/upload-artifact@v3` effective April 16, 2024. The CI workflows were failing with:
```
Error: This request has been automatically failed because it uses a deprecated version of `actions/upload-artifact: v3`.
```

## Solution
Updated all `upload-artifact` action references from `v3` to `v4` in CI workflows.

## Files Updated

### 1. `.github/workflows/ci.yml`
**Line 91**: Updated GPU test artifact upload
```yaml
# Before
uses: actions/upload-artifact@v3

# After
uses: actions/upload-artifact@v4
```

### 2. `.github/workflows/gpu-tests.yml`
**Line 41**: Updated standalone GPU test artifact upload
```yaml
# Before
uses: actions/upload-artifact@v3

# After
uses: actions/upload-artifact@v4
```

## Changes Made
- ✅ Both workflow files now use `actions/upload-artifact@v4`
- ✅ Artifact upload configuration remains the same
- ✅ CI workflows will no longer fail due to deprecated action

## Reference
- [GitHub Blog: Deprecation Notice for Artifact Actions](https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/)
- [Upload Artifact v4 Documentation](https://github.com/actions/upload-artifact/releases/tag/v4)

## Verification
The GPU test CI job should now:
1. ✅ Run without the deprecation error
2. ✅ Successfully upload test results and coverage reports
3. ✅ Display artifacts in the GitHub Actions UI
Loading
Loading