Conversation
…i); preserve CLI behavior; add tsconfig; update build/bin
…/outputs and plugging in custom adapters
…public API for adapters
There was a problem hiding this comment.
Pull Request Overview
This PR refactors the monolithic CLI into a modular pipeline architecture with clear separation of concerns: sources → transforms → outputs → core → cli. The refactor establishes a foundation for adding new input sources and output formats while maintaining full backward compatibility with existing CLI flags and behavior.
Key changes:
- Extracted core types, utilities, and CLI arg parsing into
src/core/types.ts - Created modular source adapter (
src/sources/twitter.ts) for Twitter/X archive ingestion - Separated transform logic (
src/transforms/core.ts) for filtering, grouping, and conversation mapping - Isolated output writers (
src/outputs/writers.ts) for Markdown, OAI, JSONL, ShareGPT, and stats - Introduced public library API (
src/index.ts) with pluggable adapter interfaces
Reviewed Changes
Copilot reviewed 8 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/transforms/core.ts | Text cleaning, filtering, thread grouping, and conversation-to-message conversion logic |
| src/sources/twitter.ts | Twitter/X archive detection and ingestion with media handling |
| src/outputs/writers.ts | Writers for Markdown, OAI JSONL, normalized JSONL, ShareGPT, and stats formats |
| src/index.ts | Public library API exposing types, adapters, and extension interfaces |
| src/core/types.ts | Shared types, CLI argument parsing, logger, and utility functions |
| src/cli/splice.ts | CLI entrypoint orchestrating the modular pipeline |
| package.json | Updated main entry point to dist/index.js for library usage |
| README.md | Enhanced documentation with architecture overview and library usage examples |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
src/transforms/core.ts
Outdated
| } | ||
|
|
||
| for (const it of items) { | ||
| const role: Role = it.raw && "full_text" in (it.raw as any) ? "assistant" : "user"; |
There was a problem hiding this comment.
The role assignment logic uses an implicit type check with as any and a property existence check. This is fragile and unclear. Consider extracting this to a named function like inferRole(item: ContentItem): Role with explicit documentation of the heuristic being used.
package.json
Outdated
| ".": "./dist/index.js" | ||
| }, | ||
| "bin": { | ||
| "splice": "dist/splice.js" |
There was a problem hiding this comment.
The bin entry points to 'dist/splice.js' but according to the PR description and README changes, the CLI should be at 'dist/cli/splice.js'. This will cause the published package's CLI command to fail.
| "splice": "dist/splice.js" | |
| "splice": "dist/cli/splice.js" |
src/cli/splice.ts
Outdated
| const validFormats = requested.filter((f) => allowedFormats.has(f)); | ||
| const invalidFormats = requested.filter((f) => !allowedFormats.has(f)); | ||
| for (const bad of invalidFormats) { | ||
| logger("warn", `Unknown format "${bad}". Supported: markdown, oai, json`); |
There was a problem hiding this comment.
The error message lists 'markdown, oai, json' as supported formats but omits 'sharegpt' which is in the allowedFormats set (line 191). The message should include all supported formats.
src/cli/splice.ts
Outdated
| if (formatSpecified && validFormats.length === 0) { | ||
| logger( | ||
| "error", | ||
| "No valid formats requested. Supported: markdown, oai, json", |
There was a problem hiding this comment.
This error message also omits 'sharegpt' from the list of supported formats. Should be consistent with the allowedFormats set.
What
Why
Compatibility
Tests