Small, local, lightweight PDF utility CLI for Windows. Everything runs offline once dependencies are installed. An Obsidian plugin front end is available at https://github.com/duck-lint/pdf-toolkit-obsidian-plugin
Features:
- Render PDF pages to PNGs (PyMuPDF)
- Split a PDF into multiple PDFs
- Rotate PDF pages or rotate PNGs (Pillow)
- Split spread scans into single-page images and crop page bounds (Pillow)
- Safe defaults with --dry-run and --overwrite
- JSON manifest written for each command
- Create and activate a virtual environment (optional but recommended):
python -m venv .venv
.venv\Scripts\Activate.ps1
- Install dependencies:
pip install -r requirements.txt
- Install this package in editable mode so python -m pdf-toolkit works:
pip install -e .
If you prefer not to install it, you can temporarily set PYTHONPATH:
$env:PYTHONPATH = "src"
See all commands:
python -m pdf-toolkit --help

python -m pdf-toolkit render --pdf "in.pdf" --out_dir "out\pages" --dpi 300 --format png --prefix "book1"
Dry-run (no files written):
python -m pdf-toolkit render --pdf "in.pdf" --out_dir "out\pages" --pages "1-10,15" --dry-run
Output naming is predictable:
book1_p0001.png, book1_p0002.png, etc.
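The naming pattern above can be reproduced in a few lines if you need to predict output paths from another script. This is only a sketch inferred from the example names: the helper `page_filename` is hypothetical, and the four-digit zero padding is an assumption based on `book1_p0001.png`.

```python
def page_filename(prefix: str, page_number: int, ext: str = "png") -> str:
    """Build a name like 'book1_p0001.png' from a 1-based page number.

    Hypothetical helper: the padding width (4) is inferred from the
    README examples, not taken from the tool's source.
    """
    if page_number < 1:
        raise ValueError("page_number is 1-based and must be >= 1")
    return f"{prefix}_p{page_number:04d}.{ext}"


# Example: names for pages 1, 2 and 15 of a render with --prefix "book1".
names = [page_filename("book1", n) for n in (1, 2, 15)]
```

Zero padding keeps the files in page order when sorted by name, which matters when a later step picks them up with a glob such as "*.png".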
Explicit ranges:
python -m pdf-toolkit split --pdf "in.pdf" --out_dir "out\splits" --ranges "1-120,121-240" --prefix "book"
Automatic chunking:
python -m pdf-toolkit split --pdf "in.pdf" --out_dir "out\splits" --pages_per_file 120 --prefix "book"
Outputs:
book_part01.pdf, book_part02.pdf, etc.
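To see how automatic chunking relates to explicit ranges, here is a minimal sketch that turns a page count and a pages-per-file value into 1-based inclusive ranges. The function names are hypothetical, and it assumes the last part simply holds the remaining pages; the tool's actual behavior for a short final chunk is not documented here.

```python
from __future__ import annotations


def chunk_ranges(total_pages: int, pages_per_file: int) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges covering every page.

    Hypothetical helper; assumes the last chunk keeps whatever is left.
    """
    if total_pages < 1 or pages_per_file < 1:
        raise ValueError("total_pages and pages_per_file must be >= 1")
    return [
        (start, min(start + pages_per_file - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_file)
    ]


def part_filename(prefix: str, part_number: int) -> str:
    """Build a name like 'book_part01.pdf' (two-digit padding is inferred)."""
    return f"{prefix}_part{part_number:02d}.pdf"


# A 250-page PDF with 120 pages per file yields three parts.
ranges = chunk_ranges(250, 120)
files = [part_filename("book", i) for i, _ in enumerate(ranges, start=1)]
```

Under these assumptions, a 240-page PDF with 120 pages per file produces the same ranges as the explicit "1-120,121-240" example.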
python -m pdf-toolkit rotate pdf --pdf "in.pdf" --out_pdf "in_rotated.pdf" --degrees 90 --pages "all"
In-place (overwrites input):
python -m pdf-toolkit rotate pdf --pdf "in.pdf" --out_pdf "in.pdf" --degrees 180 --pages "1-5" --inplace --overwrite

python -m pdf-toolkit rotate images --in_dir "out\pages" --glob "*.png" --degrees 90 --out_dir "out\pages_rot"
In-place (overwrites files):
python -m pdf-toolkit rotate images --in_dir "out\pages" --glob "*.png" --degrees 90 --out_dir "out\pages" --inplace --overwrite

Auto mode (split if wide enough, otherwise crop-only):
python -m pdf-toolkit page-images --in_dir "out\pages" --out_dir "out\pages_single" --glob "*.png" --mode auto --debug
Always split:
python -m pdf-toolkit page-images --in_dir "out\pages" --out_dir "out\pages_single" --mode split --overwrite
Never split (crop-only):
python -m pdf-toolkit page-images --in_dir "out\pages" --out_dir "out\pages_single" --mode crop
Dump the default YAML config:
python -m pdf-toolkit page-images --dump-default-config
Use a config file:
python -m pdf-toolkit page-images --in_dir "out\pages" --out_dir "out\pages_single" --config "configs\page_images.default.yaml"
Precedence is deterministic:
built-in defaults < YAML config < explicitly provided CLI flags.
This means optional CLI defaults do not overwrite YAML values unless the flag is explicitly passed.
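One common way to get this precedence is to give every optional flag a default of None and only let non-None CLI values override the config. The sketch below illustrates that idea with argparse; it is not the tool's actual implementation. `BUILTIN_DEFAULTS`, `build_parser`, and `resolve_options` are hypothetical names, and the default values are borrowed from the YAML example in this README rather than from the tool itself.

```python
import argparse

# Illustrative values only (copied from the README's YAML example).
BUILTIN_DEFAULTS = {
    "mode": "auto",
    "split_ratio": 1.25,
    "crop_threshold": 180,
    "pad_px": 20,
}


def build_parser() -> argparse.ArgumentParser:
    """Every optional flag defaults to None so 'not passed' is detectable."""
    parser = argparse.ArgumentParser(prog="page-images-sketch")
    parser.add_argument("--mode", default=None)
    parser.add_argument("--split_ratio", type=float, default=None)
    parser.add_argument("--crop_threshold", type=int, default=None)
    parser.add_argument("--pad_px", type=int, default=None)
    return parser


def resolve_options(yaml_values: dict, argv: list) -> dict:
    """Merge built-in defaults < YAML values < explicitly passed CLI flags."""
    cli_values = vars(build_parser().parse_args(argv))
    resolved = dict(BUILTIN_DEFAULTS)
    resolved.update(yaml_values)
    # Flags left at None were not passed, so they must not override YAML.
    resolved.update({k: v for k, v in cli_values.items() if v is not None})
    return resolved


# YAML sets mode and split_ratio; the CLI only passes --mode.
options = resolve_options({"mode": "crop", "split_ratio": 1.4}, ["--mode", "split"])
```

Here `mode` comes from the CLI, `split_ratio` from the YAML values, and the remaining keys from the built-in defaults.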
Supported YAML shapes:
Root form:
mode: auto
split_ratio: 1.25
crop_threshold: 180
pad_px: 20
Wrapped form:
page_images:
  mode: auto
  split_ratio: 1.25
  gutter_search_frac: 0.35
  crop_threshold: 180
  min_area_frac: 0.25
Recommended pipeline:
render -> page-images
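If you generate or inspect config files from your own scripts, the two supported YAML shapes (root form and wrapped form) can be normalized to one flat dict. This is a sketch under assumptions: `unwrap_page_images` is a hypothetical helper, and it works on an already parsed mapping (for example the result of a YAML loader) rather than on the file itself.

```python
def unwrap_page_images(document: dict) -> dict:
    """Return the page-images settings from either supported shape.

    Wrapped form: {"page_images": {...}} -> the inner mapping.
    Root form:    {...}                  -> the mapping itself.
    Hypothetical helper; the tool's own loader may differ.
    """
    section = document.get("page_images")
    if isinstance(section, dict):
        return dict(section)
    return dict(document)


root_form = {"mode": "auto", "split_ratio": 1.25, "crop_threshold": 180, "pad_px": 20}
wrapped_form = {"page_images": {"mode": "auto", "split_ratio": 1.25}}

root_settings = unwrap_page_images(root_form)
wrapped_settings = unwrap_page_images(wrapped_form)
```

Both calls return a copy, so later edits to the settings do not change the parsed document.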
Pages are 1-based for user input:
- all
- 1-10
- 1-10,15,20-25
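A page spec in this style can be expanded into 0-based indices (the form most PDF libraries expect) along the following lines. This is a minimal sketch, not the tool's parser: `parse_pages` is a hypothetical name, and the choices to reject out-of-range pages and to keep duplicates are assumptions.

```python
from __future__ import annotations


def parse_pages(spec: str, total_pages: int) -> list[int]:
    """Expand a 1-based spec such as '1-10,15,20-25' into 0-based indices.

    Hypothetical helper: out-of-range or reversed ranges raise ValueError,
    and duplicates are kept in the order given.
    """
    spec = spec.strip()
    if spec.lower() == "all":
        return list(range(total_pages))

    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page spec: {spec!r}")
        start_text, dash, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if dash else start  # single page when no dash
        if start < 1 or end < start or end > total_pages:
            raise ValueError(f"invalid page range {part!r} for {total_pages} pages")
        indices.extend(range(start - 1, end))  # shift to 0-based
    return indices


example = parse_pages("1-3,5", total_pages=10)
```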
Each command writes a JSON manifest describing:
- Inputs, outputs, options
- Actions taken (written, skipped, dry-run)
- Timestamps
page-images action outputs list the written files, plus split/crop metadata (gutter_x, bboxes, spread detection notes).
["out/pages_single/book_p0001_L.png", "out/pages_single/book_p0001_R.png"]
By default the manifest is written to:
- Render: out_dir\manifest.json
- Split: out_dir\manifest.json
- Rotate PDF: out_pdf folder\manifest.json
- Rotate images: out_dir\manifest.json
- Page-images: out_dir\manifest.json
--dry-run skips writing the manifest (it is treated like an output file).
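For readers who want to picture what such a manifest involves, the sketch below builds a small JSON document and skips writing it in dry-run mode. The key names (`command`, `inputs`, `outputs`, `options`, `actions`, `created_at`) and both helper functions are assumptions for illustration; the real manifest schema may differ.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_manifest(command, inputs, outputs, options, actions):
    """Assemble a manifest dict; all key names here are hypothetical."""
    return {
        "command": command,
        "inputs": list(inputs),
        "outputs": list(outputs),
        "options": dict(options),
        "actions": list(actions),  # e.g. written / skipped / dry-run entries
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


def write_manifest(manifest, out_dir, dry_run):
    """Write manifest.json into out_dir, or do nothing when dry_run is set."""
    if dry_run:
        return None  # treated like any other output file: not written
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return path
```

Returning None in dry-run mode lets the caller report that nothing was written without checking the file system.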
Run the minimal unit tests:
python -m unittest discover -s tests -p "test_*.py"
Prefer creating release zips from git history:
git archive --format=zip --output pdf-toolkit.zip HEAD
If you zip manually, delete __pycache__/ folders and *.pyc files first.