PressRs is a custom data compression tool built from scratch in Rust. It combines a TAR-like packager with an LZW (Lempel–Ziv–Welch) compressor to reduce file sizes while preserving directory structures.
It is designed as an educational project to demonstrate low-level bit manipulation, dictionary-based compression algorithms, and file stream handling in Rust.
- Custom LZW Implementation: Uses variable-width codes (9 to 12 bits) with dynamic dictionary resetting.
- Archive Capability: Bundles multiple files and directories into a single `.pressrs` file.
- Memory Efficient: Streams data using buffered readers/writers to handle large files.
- Web Compatibility: Designed for both native and WASM environments.
- No External Dependencies: Built from scratch without any third-party libraries or frameworks.
- CLI: Simple text-based interface for selecting modes.
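
As a rough illustration of the buffered streaming mentioned in the list above, here is a minimal sketch using only the standard library. The function name `stream_chunks`, the 8 KiB chunk size, and the per-chunk `transform` callback are assumptions made for this example and are not taken from the PressRs source.

```rust
use std::io::{self, BufReader, BufWriter, Read, Write};

/// Hypothetical helper: reads `input` in fixed-size chunks, passes each chunk
/// through `transform`, and writes the result to `output`.
/// Only one chunk is held in memory at a time.
pub fn stream_chunks<R: Read, W: Write>(
    input: R,
    output: W,
    mut transform: impl FnMut(&[u8]) -> Vec<u8>,
) -> io::Result<u64> {
    let mut reader = BufReader::new(input);
    let mut writer = BufWriter::new(output);
    let mut chunk = [0u8; 8192]; // assumed chunk size
    let mut total = 0u64;
    loop {
        let n = reader.read(&mut chunk)?;
        if n == 0 {
            break; // end of input
        }
        writer.write_all(&transform(&chunk[..n]))?;
        total += n as u64;
    }
    writer.flush()?;
    Ok(total) // number of input bytes consumed
}
```

Because the function is generic over `Read` and `Write`, the same code path can be fed a `std::fs::File` natively or an in-memory slice, which is one plausible way to keep native and WASM builds on shared logic.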
| Operation | Input Size | Throughput | Time |
|---|---|---|---|
| Compression | 1 KB | 81 MiB/s | 12 µs |
| Compression | 100 KB | 103 MiB/s | 945 µs |
| Compression | 1 MB | 102 MiB/s | 9.8 ms |
| Decompression | 1 KB | 152 MiB/s | 6.4 µs |
| Decompression | 100 KB | 171 MiB/s | 570 µs |
| Pack (small) | 3 entries | - | 475 ns |
| Unpack (small) | 5 KB archive | - | 328 ns |
| Full Pipeline | Pack + Compress 100 KB | 101 MiB/s | 965 µs |
| Full Pipeline | Decompress + Unpack 100 KB | 175 MiB/s | 576 µs |
Benchmarks were run on an Intel i5-11400H under WSL (Ubuntu), using the LZW compression algorithm.
- Linear scaling: 2x data = 2x time (stable ~100 MiB/s)
- Decompression faster: 1.7x faster than compression
- Low overhead: Small files (<1KB) see ~20% overhead
- Predictable: 1 MB ≈ 10 ms, 100 MB ≈ 1 s
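
The scaling estimates follow from time = size / throughput. The sketch below only restates that arithmetic; `estimated_ms` is a hypothetical helper, and the throughput passed in is whatever figure you read off the table.

```rust
/// Hypothetical helper: estimated processing time in milliseconds for
/// `size_bytes` of input at a steady throughput given in MiB/s.
pub fn estimated_ms(size_bytes: u64, throughput_mib_s: f64) -> f64 {
    let size_mib = size_bytes as f64 / (1024.0 * 1024.0);
    size_mib / throughput_mib_s * 1000.0
}
```

For example, `estimated_ms(1 << 20, 102.0)` gives roughly 9.8 ms, in line with the 1 MB compression row, and 100 MiB at a steady 100 MiB/s works out to 1000 ms.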
The core of PressRs is the LZW algorithm.
- Dictionary: Starts with one entry for every single-byte value (0-255).
- Dynamic Growth: As patterns are found, new codes are added to the dictionary.
- Variable Bit Width: The output code size starts at 9 bits and grows up to 12 bits as the dictionary fills up.
- Reset Mechanism: Once the dictionary reaches its limit (4096 entries), it sends a `Clear Code` and resets, preventing memory overflow and adapting to new data patterns.
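
The four points above can be sketched as follows. This is a minimal illustration, not the PressRs implementation: the position of the Clear Code (256), the rule for when the width grows, the LSB-first bit order in `pack_bits`, and all function names are assumptions made for the example.

```rust
use std::collections::HashMap;

// Assumed code layout: 0-255 are literal bytes, 256 is the Clear Code.
const CLEAR_CODE: u16 = 256;
const FIRST_FREE: u16 = 257;
const MAX_ENTRIES: u16 = 4096; // largest table addressable with 12 bits

/// Bits needed for the largest code assigned so far, clamped to 9..=12.
fn code_width(next_code: u16) -> u8 {
    let bits = 16 - (next_code - 1).leading_zeros();
    (bits as u8).clamp(9, 12)
}

fn fresh_dict() -> HashMap<Vec<u8>, u16> {
    (0..=255u8).map(|b| (vec![b], b as u16)).collect()
}

/// Encodes `input` into (code, bit width) pairs.
pub fn lzw_encode(input: &[u8]) -> Vec<(u16, u8)> {
    let (mut dict, mut next_code) = (fresh_dict(), FIRST_FREE);
    let mut out = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    for &byte in input {
        current.push(byte);
        if dict.contains_key(&current) {
            continue; // keep extending the match
        }
        current.pop(); // back to the longest known match
        out.push((dict[&current], code_width(next_code)));
        if next_code < MAX_ENTRIES {
            current.push(byte);
            dict.insert(std::mem::take(&mut current), next_code);
            next_code += 1;
        } else {
            // Table full: signal a reset and start over.
            out.push((CLEAR_CODE, 12));
            dict = fresh_dict();
            next_code = FIRST_FREE;
            current.clear();
        }
        current.push(byte);
    }
    if !current.is_empty() {
        out.push((dict[&current], code_width(next_code)));
    }
    out
}

// Index 256 holds an empty placeholder so table indices line up with codes.
fn fresh_table() -> Vec<Vec<u8>> {
    (0..=256u16).map(|c| if c < 256 { vec![c as u8] } else { vec![] }).collect()
}

/// Inverse of `lzw_encode`; rebuilds the table while reading codes.
pub fn lzw_decode(codes: &[(u16, u8)]) -> Vec<u8> {
    let mut table = fresh_table();
    let mut prev: Option<Vec<u8>> = None;
    let mut out = Vec::new();
    for &(code, _width) in codes {
        if code == CLEAR_CODE {
            table = fresh_table();
            prev = None;
            continue;
        }
        let entry = match (table.get(code as usize), &prev) {
            (Some(known), _) => known.clone(),
            // Code not in the table yet: it must be prev + prev[0].
            (None, Some(p)) => { let mut e = p.clone(); e.push(p[0]); e }
            (None, None) => panic!("corrupt code stream"),
        };
        out.extend_from_slice(&entry);
        if let Some(mut p) = prev.take() {
            p.push(entry[0]);
            table.push(p);
        }
        prev = Some(entry);
    }
    out
}

/// Packs variable-width codes into bytes, least significant bit first (assumed order).
pub fn pack_bits(codes: &[(u16, u8)]) -> Vec<u8> {
    let mut out = Vec::new();
    let (mut acc, mut nbits) = (0u32, 0u8);
    for &(code, width) in codes {
        acc |= (code as u32) << nbits;
        nbits += width;
        while nbits >= 8 {
            out.push(acc as u8);
            acc >>= 8;
            nbits -= 8;
        }
    }
    if nbits > 0 { out.push(acc as u8); }
    out
}
```

With this layout, `lzw_encode(b"ABABABA")` yields `[(65, 9), (66, 9), (257, 9), (259, 9)]`, which `pack_bits` stores in 5 bytes instead of 7. The decoder here works on unpacked codes for clarity; a bit-level reader would have to apply the same width rule in lockstep with the encoder.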
PressRs uses a custom binary format similar to TAR:
- It traverses the target directory recursively.
- Each file is preceded by a metadata header (containing relative path and size).
- The continuous stream of file data is then passed to the LZW compressor.
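
To make the header-then-data layout concrete, here is a small in-memory sketch. The exact field layout (a 2-byte little-endian path length, the UTF-8 path, then an 8-byte little-endian size) is an assumption for illustration; the real `.pressrs` header may differ. Directory traversal and the LZW stage are left out.

```rust
/// One archive entry: a relative path plus the file contents.
#[derive(Debug, PartialEq)]
pub struct Entry {
    pub path: String,
    pub data: Vec<u8>,
}

/// Writes each entry as [path_len: u16][path][size: u64][data] (assumed layout).
/// Paths longer than 65535 bytes are not handled in this sketch.
pub fn pack(entries: &[Entry]) -> Vec<u8> {
    let mut out = Vec::new();
    for e in entries {
        let path = e.path.as_bytes();
        out.extend_from_slice(&(path.len() as u16).to_le_bytes());
        out.extend_from_slice(path);
        out.extend_from_slice(&(e.data.len() as u64).to_le_bytes());
        out.extend_from_slice(&e.data);
    }
    out
}

/// Splits `n` bytes off the front of `bytes`, or returns None if too short.
fn take<'a>(bytes: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let whole: &'a [u8] = *bytes;
    if whole.len() < n {
        return None;
    }
    let (head, tail) = whole.split_at(n);
    *bytes = tail;
    Some(head)
}

/// Reads entries back; returns None on a truncated or malformed archive.
pub fn unpack(mut bytes: &[u8]) -> Option<Vec<Entry>> {
    let mut entries = Vec::new();
    while !bytes.is_empty() {
        let len = take(&mut bytes, 2)?;
        let path_len = u16::from_le_bytes([len[0], len[1]]) as usize;
        let path = String::from_utf8(take(&mut bytes, path_len)?.to_vec()).ok()?;
        let mut size_buf = [0u8; 8];
        size_buf.copy_from_slice(take(&mut bytes, 8)?);
        let size = u64::from_le_bytes(size_buf) as usize;
        let data = take(&mut bytes, size)?.to_vec();
        entries.push(Entry { path, data });
    }
    Some(entries)
}
```

Storing the size in the header lets the unpacker slice out each file without scanning for delimiters, so file contents can contain any byte values.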
Ensure you have Rust installed, then clone and build:
git clone https://github.com/TeseySTD/press.rs.git
cd press.rs
cargo build --release
The executable will be at `./target/release/press_rs`.