A Rust implementation of PAR2 (Parity Archive) for data recovery and verification.
par2rs is a modern, high-performance implementation of the PAR2 (Parity Archive 2.0) format written in Rust. PAR2 files are used to detect and repair corruption in data files, making them invaluable for archival storage, data transmission, and backup verification.
par2rs achieves a 1.1x to 3.0x speedup over par2cmdline through:
- Optimized I/O patterns using full slice-size chunks instead of 64KB blocks (eliminates redundant reads)
- Parallel Reed-Solomon reconstruction using Rayon for multi-threaded chunk processing
- SIMD-accelerated operations (PSHUFB on x86_64, NEON on ARM64, portable_simd cross-platform)
- Smart validation skipping for files with matching MD5 checksums
- Memory-efficient lazy loading with LRU caching
Latest benchmark results:
Linux x86_64 (AMD Ryzen 9 5950X, 64GB RAM):
- 1MB: 1.23x speedup (0.032s → 0.026s)
- 10MB: 1.54x speedup (0.074s → 0.048s)
- 100MB: 1.20x speedup (0.386s → 0.321s)
- 1GB: 1.11x speedup (3.74s → 3.37s)
- 10GB: 1.53x speedup (58.80s → 38.32s)
macOS M1 (MacBook Air, 16GB RAM) - OUTDATED (October 2025):
- 100MB: 2.77x speedup (2.26s → 0.81s)
- 1GB: 2.99x speedup (22.7s → 7.6s)
- 10GB: 2.46x speedup (104.8s → 42.6s)
- 25GB: 2.36x speedup (349.6s → 147.8s)
⚠️ These results need re-testing to confirm current performance
The performance improvements come primarily from optimized I/O patterns and SIMD-accelerated Reed-Solomon operations. See docs/BENCHMARK_RESULTS.md for comprehensive end-to-end benchmarks and docs/SIMD_OPTIMIZATION.md for SIMD implementation details.
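As a rough illustration of the table-lookup idea behind PSHUFB-style Reed-Solomon kernels, the scalar sketch below multiplies GF(2^16) words by a constant using four 16-entry nibble tables. It assumes the PAR2 field polynomial 0x1100B. The function names (`gf_mul`, `nibble_tables`, `mul_with_tables`, `mul_add_slice`) are hypothetical and are not par2rs's actual API; see docs/SIMD_OPTIMIZATION.md for the real implementation.

```rust
/// PAR2's GF(2^16) reduction polynomial: x^16 + x^12 + x^3 + x + 1.
const POLY: u32 = 0x1100B;

/// Reference multiply in GF(2^16): shift-and-add with modular reduction.
fn gf_mul(a: u16, b: u16) -> u16 {
    let mut a = a as u32;
    let mut b = b;
    let mut acc: u32 = 0;
    while b != 0 {
        if b & 1 != 0 {
            acc ^= a; // addition in GF(2^n) is XOR
        }
        a <<= 1;
        if a & 0x1_0000 != 0 {
            a ^= POLY; // reduce back into 16 bits
        }
        b >>= 1;
    }
    acc as u16
}

/// Builds four 16-entry tables for a fixed constant `c`, one per nibble
/// of the input word. Table `i` holds `(n << 4*i) * c` for n in 0..16.
fn nibble_tables(c: u16) -> [[u16; 16]; 4] {
    let mut tables = [[0u16; 16]; 4];
    for (i, table) in tables.iter_mut().enumerate() {
        for (n, slot) in table.iter_mut().enumerate() {
            *slot = gf_mul((n as u16) << (4 * i), c);
        }
    }
    tables
}

/// Multiplies `x` by the constant baked into `tables`. Multiplication by a
/// constant is linear over XOR, so the four nibble products can be combined.
fn mul_with_tables(tables: &[[u16; 16]; 4], x: u16) -> u16 {
    tables[0][(x & 0xF) as usize]
        ^ tables[1][((x >> 4) & 0xF) as usize]
        ^ tables[2][((x >> 8) & 0xF) as usize]
        ^ tables[3][((x >> 12) & 0xF) as usize]
}

/// output[i] ^= c * input[i], the inner loop of Reed-Solomon reconstruction.
fn mul_add_slice(c: u16, input: &[u16], output: &mut [u16]) {
    let tables = nibble_tables(c);
    for (out, &x) in output.iter_mut().zip(input) {
        *out ^= mul_with_tables(&tables, x);
    }
}
```

Each table has only 16 entries, which is what lets a byte-shuffle instruction such as PSHUFB perform many of these lookups in one step. SIMD versions typically split the 16-bit entries into low and high byte halves; that detail is omitted here.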
# Run directly without installing
nix run github:mjc/par2rs -- verify myfile.par2
# Install to your profile
nix profile install github:mjc/par2rs
# Use in a flake.nix
{
inputs.par2rs.url = "github:mjc/par2rs";
# Then use as: inputs.par2rs.packages.${system}.default
}
# Clone the repository
git clone https://github.com/mjc/par2rs.git
cd par2rs
# Build the project
cargo build --release
# Binaries will be in target/release/
# - par2 (unified interface, par2cmdline compatible)
# - par2verify, par2repair, par2create (individual tools)
The par2 binary provides a par2cmdline-compatible interface:
# Verify files
par2 verify myfile.par2
par2 v myfile.par2 # short form
# Repair damaged files
par2 repair myfile.par2
par2 r myfile.par2 # short form
# Create recovery files (coming soon)
par2 create myfile.par2 file1 file2
par2 c myfile.par2 file1 file2 # short form
# Quiet mode (minimal output)
par2 v -q myfile.par2
# Repair and purge backup files on success
par2 r -p myfile.par2
# Use specific number of threads
par2 v -t 8 myfile.par2
# Disable parallel processing (single-threaded)
par2 v --no-parallel myfile.par2
Individual binaries are also available:
# Verify integrity of files protected by PAR2
cargo run --bin par2verify tests/fixtures/testfile.par2
use par2rs::{parse_packets, analysis, file_verification};
use std::fs::File;
// Parse PAR2 packets from a file
let mut file = File::open("example.par2")?;
let packets = parse_packets(&mut file);
// Analyze the PAR2 set
let stats = analysis::calculate_par2_stats(&packets, 0);
analysis::print_summary_stats(&stats);
// Verify file integrity
let file_info = analysis::collect_file_info_from_packets(&packets);
let results = file_verification::verify_files_and_collect_results(&file_info, true);

| Packet Type | Description | Status |
|---|---|---|
| Main Packet | Core metadata and file list | ✅ Implemented |
| Packed Main Packet | Compressed main packet variant | ✅ Implemented |
| File Description | File metadata and checksums | ✅ Implemented |
| Input File Slice Checksum | Slice-level checksums | ✅ Implemented |
| Recovery Slice | Reed-Solomon recovery data | ✅ Implemented |
| Creator | Software identification | ✅ Implemented |
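
To make the packet types above more concrete, here is a minimal sketch of reading the fixed 64-byte header that precedes every PAR2 packet and classifying it by its 16-byte type field. It uses only the standard library. The names `PacketKind`, `PacketHeader`, and `parse_header` are hypothetical; par2rs itself parses packets with binrw, and its real types may look quite different.

```rust
/// Every PAR2 packet begins with this 8-byte magic sequence.
const MAGIC: &[u8; 8] = b"PAR2\0PKT";
/// Magic (8) + length (8) + packet MD5 (16) + set ID (16) + type (16).
const HEADER_LEN: usize = 64;

#[derive(Debug, PartialEq)]
enum PacketKind {
    Main,
    FileDescription,
    SliceChecksum,
    RecoverySlice,
    Creator,
    Other, // includes optional packet types not listed here
}

#[derive(Debug)]
struct PacketHeader {
    length: u64,          // whole packet, header included
    packet_md5: [u8; 16], // not verified in this sketch
    set_id: [u8; 16],
    kind: PacketKind,
}

/// Copies a 16-byte slice into a fixed-size array.
fn array16(bytes: &[u8]) -> [u8; 16] {
    let mut out = [0u8; 16];
    out.copy_from_slice(bytes);
    out
}

/// Maps the 16-byte type field to a packet kind.
fn classify(type_field: &[u8; 16]) -> PacketKind {
    match type_field {
        b"PAR 2.0\0Main\0\0\0\0" => PacketKind::Main,
        b"PAR 2.0\0FileDesc" => PacketKind::FileDescription,
        b"PAR 2.0\0IFSC\0\0\0\0" => PacketKind::SliceChecksum,
        b"PAR 2.0\0RecvSlic" => PacketKind::RecoverySlice,
        b"PAR 2.0\0Creator\0" => PacketKind::Creator,
        _ => PacketKind::Other,
    }
}

/// Returns None if the buffer is too short, the magic is wrong,
/// or the length field is implausible.
fn parse_header(buf: &[u8]) -> Option<PacketHeader> {
    if buf.len() < HEADER_LEN || buf[..8] != MAGIC[..] {
        return None;
    }
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(&buf[8..16]);
    let length = u64::from_le_bytes(len_bytes);
    // Packet lengths are multiples of 4 and cover at least the header.
    if length < HEADER_LEN as u64 || length % 4 != 0 {
        return None;
    }
    Some(PacketHeader {
        length,
        packet_md5: array16(&buf[16..32]),
        set_id: array16(&buf[32..48]),
        kind: classify(&array16(&buf[48..64])),
    })
}
```

A full parser would also check the packet MD5 against the packet contents before trusting the body; that step needs an MD5 implementation and is left out here.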
- packets/: Binary packet parsing and serialization using binrw
- analysis.rs: PAR2 set analysis and statistics calculation
- verify.rs: File integrity verification with MD5 checksums
- file_ops.rs: File discovery and PAR2 collection management
- file_verification.rs: Comprehensive file verification with detailed results
- Rust: 1.70+ (see rust-toolchain.toml for exact version)
- Optional Tools:
  - cargo-llvm-cov: cargo install cargo-llvm-cov (for code coverage)
# Debug build
cargo build
# Release build (optimized)
cargo build --release
# Build all binaries
cargo build --release --bins
# Run all tests
cargo test
# Run specific test suites
cargo test --test test_unit # Unit tests
cargo test --test test_integration # Integration tests
cargo test --test test_packets # Packet serialization tests
cargo test --test test_verification # Verification tests
# Run tests with output
cargo test -- --nocapture
# Run specific test
cargo test test_main_packet_fields
The project includes comprehensive code coverage tools:
# Quick coverage summary
make coverage-quick
# Generate HTML coverage report
make coverage-html
# Open coverage report in browser
make coverage-open
# Generate coverage for CI (multiple formats)
make coverage-ci
# LLVM-based coverage
make coverage-llvm
# Compare both tools
make coverage-both
For detailed coverage options, see COVERAGE.md.
Coverage reports are automatically generated on every commit and pull request via GitHub Actions.
tests/
├── test_unit.rs # Unit tests for core functionality
├── test_integration.rs # End-to-end integration tests
├── test_packets.rs # Packet parsing and serialization
├── test_verification.rs # File verification tests
├── fixtures/ # Test PAR2 files and data
└── unit/ # Detailed unit test modules
├── analysis.rs
├── file_ops.rs
├── file_verification.rs
└── repair.rs
The project includes comprehensive test fixtures:
- Real PAR2 Files: testfile.par2 with volume files
- Individual Packets: Isolated packet files for focused testing
- Repair Scenarios: Test files for repair functionality
  - testfile_corrupted: File with single corruption point
  - testfile_heavily_corrupted: File with multiple corruption points
  - repair_scenarios/: PAR2 files without data files (missing file scenario)
- Corrupted Data: Test cases for error handling
par2verify: Verifies the integrity of files using PAR2 archives.
Features:
- Complete PAR2 set analysis
- File integrity verification
- Progress reporting
- Detailed statistics
par2create: Creates PAR2 recovery files for data protection.
par2repair: Repairs corrupted files using PAR2 recovery data.
Development utility to split PAR2 files into individual packets for analysis.
- Parallel Processing: Multi-threaded operations using Rayon
- Memory Efficient: Streaming packet parser
- Fast Verification: Optimized MD5 checksumming
- Minimal Dependencies: Carefully selected crate dependencies
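
The parallel processing point can be pictured with a small sketch that splits a buffer into chunks and processes each chunk on its own thread. To stay free of dependencies it uses `std::thread::scope` as a stand-in for Rayon (with Rayon, `par_chunks_mut` would play a similar role), and a plain XOR stands in for the Reed-Solomon multiply-accumulate. The function name `xor_accumulate_parallel` is hypothetical and is not part of par2rs.

```rust
use std::thread;

/// XORs `input` into `output`, dividing the work across up to `threads`
/// scoped threads. Each thread owns a disjoint chunk of `output`, so no
/// locking is needed.
fn xor_accumulate_parallel(input: &[u8], output: &mut [u8], threads: usize) {
    assert_eq!(input.len(), output.len());
    if input.is_empty() {
        return;
    }
    let threads = threads.max(1);
    // Ceiling division so every byte lands in some chunk.
    let chunk = (input.len() + threads - 1) / threads;

    thread::scope(|scope| {
        for (out_chunk, in_chunk) in output.chunks_mut(chunk).zip(input.chunks(chunk)) {
            scope.spawn(move || {
                for (o, &i) in out_chunk.iter_mut().zip(in_chunk) {
                    *o ^= i;
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}
```

Passing `threads = 1` degenerates to a single worker, which loosely mirrors what a `--no-parallel` style switch would do.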
- binrw: Binary reading/writing with derive macros
- md5: Fast MD5 hashing implementation
- rayon: Data parallelism library
- clap: Command-line argument parsing
- hex: Hexadecimal encoding/decoding
- cargo-llvm-cov: Code coverage analysis
- criterion: Benchmarking framework
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes with tests
- Run the test suite: cargo test
- Check coverage: make coverage-html
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request

- Code Quality: All code must pass cargo clippy and cargo fmt
- Test Coverage: Maintain high test coverage (aim for >90%)
- Documentation: Document all public APIs with examples
- Performance: Consider performance implications of changes
This implementation follows the PAR2 specification and supports:
- PAR2 2.0 Specification: Full compliance with the standard
- Multiple Recovery Volumes: Support for volume files
- Variable Block Sizes: Flexible slice size configuration
- Reed-Solomon Codes: Error correction mathematics
par2rs uses a block-aligned sequential scanning approach that differs from par2cmdline's sliding window scanner:
- par2cmdline: Uses a byte-by-byte sliding window with rolling CRC32 that can find blocks at any offset in a file, even if displaced by inserted/deleted data. This is more thorough but slower.
- par2rs: Only checks blocks at their expected aligned positions using sequential reads with large buffers (128MB). This is significantly faster for normal verification but cannot find displaced blocks.
Practical Impact:
- ✅ par2rs is faster for standard verification/repair scenarios (files are either intact or corrupted at known positions)
- ⚠️ par2cmdline is more robust for edge cases like files with prepended data or non-aligned block corruption
- 🎯 For typical use cases (bit rot, transmission errors, filesystem corruption), both tools should detect and repair the same damage
This design choice optimizes for the common case where files are either intact or have corruption at expected block boundaries, delivering substantial performance improvements while maintaining correctness for standard PAR2 operations.
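
The block-aligned approach described above can be sketched in a few lines. This is a simplified illustration, not par2rs's actual scanner: it checks only a CRC32 per block (real PAR2 verification also uses MD5), works on an in-memory slice rather than large buffered reads, and the function names `crc32` and `damaged_blocks` are hypothetical.

```rust
/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
/// Slow but dependency-free; real code would use a table or SIMD version.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all ones when the low bit is set, zero otherwise
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Checks each block only at its expected offset `index * block_size`
/// and returns the indices whose checksum does not match.
/// A short final block is zero-padded before hashing.
fn damaged_blocks(data: &[u8], block_size: usize, expected: &[u32]) -> Vec<usize> {
    let mut damaged = Vec::new();
    let mut padded = vec![0u8; block_size];
    for (index, &want) in expected.iter().enumerate() {
        let start = (index * block_size).min(data.len());
        let end = (start + block_size).min(data.len());
        let block = &data[start..end];
        let got = if block.len() == block_size {
            crc32(block)
        } else {
            padded.fill(0);
            padded[..block.len()].copy_from_slice(block);
            crc32(&padded)
        };
        if got != want {
            damaged.push(index);
        }
    }
    damaged
}
```

Because the scan never looks at unaligned offsets, a single byte inserted at the start of a file shifts every block and all of them are reported as damaged. A sliding-window scanner would instead re-locate the intact blocks at their new offsets.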
- Repair Hanging: The repair functionality occasionally hangs on small files within large multi-file PAR2 sets. The root cause is still under investigation. Workaround: Process smaller PAR2 sets or single files where possible.
- Phase 1: Complete packet parsing and verification
- Phase 2: PAR2 file creation (par2create)
- Phase 3: File repair functionality (par2repair)
- Phase 4: SIMD optimizations (PSHUFB, NEON, portable_simd)
- Phase 5: Runtime SIMD dispatch
- Phase 6: Advanced features (progress callbacks, custom block sizes)
- README.md: This file - project overview and quick start
- BENCHMARK_RESULTS.md: Comprehensive end-to-end performance benchmarks
- SIMD_OPTIMIZATION.md: Technical details on SIMD implementations
- COVERAGE.md: Code coverage tooling and instructions
- par2_parsing.md: Internal implementation notes (development reference)
This project is licensed under the MIT License - see the LICENSE file for details.
- PAR2 Specification: Based on the PAR2 format specification
- par2cmdline: Reference implementation for compatibility testing
- Rust Community: For excellent crates and tooling ecosystem