-
-
Notifications
You must be signed in to change notification settings - Fork 1
Description
Non-security findings from code audit
Verified bugs, packaging issues, and efficiency improvements.
1. [BUG] _find_paper_by_source_url crashes if PAPERS_DIR doesn't exist
paperpipe/cli/papers.py:75 — config.PAPERS_DIR.iterdir() is called without an existence check. Raises FileNotFoundError when invoked before the directory has been bootstrapped (e.g., first run, or when called outside the normal CLI entry point).
Fix: Guard with if not config.PAPERS_DIR.exists(): return None before iterating.
2. [BUG] Temp file leak on unexpected exceptions in download_source
paperpipe/paper.py:140-201 — tar_path.unlink() at line 201 is outside a finally block. The try/except at line 145 only catches tarfile.ReadError. Any other exception (e.g., MemoryError, OSError from tar.getmembers()) skips cleanup and leaks the temp file.
Fix: Move tar_path.unlink() into a finally block:
try:
...
except tarfile.ReadError:
...
finally:
tar_path.unlink(missing_ok=True)3. [PERF] Nested O(refs × members) scan in figure extraction
paperpipe/paper.py:305-318 — For each \includegraphics reference and each candidate extension, the code iterates all tar members (tar.getmembers()). This is O(refs × candidates × members).
Fix: Build a dict of tar members keyed by normalized name once, then do O(1) lookups per candidate:
member_index = {m.name.replace("\\", "/"): m for m in tar.getmembers() if m.isfile()}
# Also index by basename for partial matching4. [MINOR] Duplicate URL check scans all paper directories on every add
paperpipe/cli/papers.py:75-93 — Every add call iterates the full PAPERS_DIR, opening and parsing each meta.json. O(n) per add. Not a bug per se — acceptable for personal paper collections — but worth noting if the DB grows large.
Lower priority. Could be addressed later with a URL index file if needed.
5. [BUG] _setup_debug_logging adds duplicate handlers on repeated calls
paperpipe/output.py:15-21 — Every call to _setup_debug_logging() unconditionally appends a new StreamHandler. In long-lived processes (MCP server) or tests, this duplicates log output and adds overhead.
Fix: Guard against re-adding:
def _setup_debug_logging() -> None:
if _debug_logger.handlers:
return
...6. [BUG] Packaging: build includes wrong paths, ships stale backup file
pyproject.toml:85-86, 97-98— Wheel and sdistonly-includereferenceskill(singular) andprompts, but the actual directory isskills(plural) and nopromptsdirectory exists. Skills are missing from built distributions, sopapi install skillfails post-install.paperpipe/install.py:23— Runtime installer correctly looks forskills.paperpipe/cli.py.backup— Stale backup file is tracked in git and shipped in the wheel.
Fix: Change skill → skills, remove prompts in pyproject.toml build targets. Run git rm paperpipe/cli.py.backup and optionally add *.backup to .gitignore.