diff --git a/CHANGELOG.md b/CHANGELOG.md index 6a2398a..c0304e4 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,20 @@ All notable changes to this project will be documented in this file. The format is based on Keep a Changelog, and this project adheres to Semantic Versioning. +## [2.1.2] - 2025-09-06 + +### Added + +- Visual documentation with strategy diagrams showing trimming and summarization workflows +- "Supported Strategies" section in README with visual explanations of trimming and summarization +- Project guidance documentation for Claude Code in CLAUDE.md + +### Changed + +- Enhanced examples with more comprehensive usage patterns and clearer explanations +- Consolidated LangChain examples: removed separate LANGCHAIN_EXAMPLE.md in favor of the more focused LANGCHAIN_COMPRESS_HISTORY.md +- Updated OPENAI_EXAMPLE.md with improved code samples and workflow explanations + ## [2.1.1] - 2025-09-04 ### Fixed diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..f8e9bcb --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,121 @@ +# CLAUDE.md + +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. + +## Project Overview + +**slimcontext** is a TypeScript npm library for chat history compression in AI assistants. It provides model-agnostic strategies to keep conversations concise while preserving context using a "Bring Your Own Model" (BYOM) approach. 
+ +## Development Commands + +### Package Management + +Use **pnpm** as the primary package manager (version 10.14.0), with npm as fallback: + +```bash +pnpm install # Primary method +npm install # Fallback if pnpm unavailable +``` + +### Core Development Workflow + +```bash +pnpm run build # Compile TypeScript to dist/ +pnpm run test # Run vitest tests +pnpm run test:watch # Run tests in watch mode +pnpm run lint # ESLint with TypeScript rules +pnpm run lint:fix # Auto-fix linting issues +pnpm run format # Format code with Prettier +pnpm run format:check # Check formatting without changes +``` + +### Required Validation Sequence + +Always run these commands after making changes: + +```bash +pnpm run test # All tests must pass +pnpm run lint # No linting errors +pnpm run format:check # Code must be formatted +pnpm run build # Must compile successfully +``` + +## Architecture + +### Core Design Principles + +- **Model-agnostic**: Core library has zero runtime dependencies +- **Token-aware**: Uses configurable token budgets with threshold-based compression +- **Framework-independent**: Optional adapters (currently LangChain) without core dependencies +- **Message preservation**: Always preserves system messages and recent conversation tail + +### Key Interfaces (src/interfaces.ts) + +- `SlimContextMessage`: Standard message format with role and content +- `SlimContextChatModel`: BYOM interface requiring only `invoke(messages) -> response` +- `SlimContextCompressor`: Strategy interface for compression implementations +- `TokenBudgetConfig`: Shared configuration for token-threshold behavior + +### Compression Strategies (src/strategies/) + +**TrimCompressor** (src/strategies/trim.ts): + +- Drops oldest non-system messages when over token threshold +- Preserves system messages and recent tail (configurable `minRecentMessages`) +- Simple token-based pruning without AI model dependency + +**SummarizeCompressor** (src/strategies/summarize.ts): + +- Uses provided chat model 
to summarize old conversation segments +- Injects summary as system message before preserved recent messages +- Complex message alignment and boundary detection logic + +**Shared utilities** (src/strategies/common.ts): + +- Token estimation defaults and configuration normalization +- `DEFAULT_MAX_MODEL_TOKENS = 8192`, `DEFAULT_THRESHOLD_PERCENT = 0.7` + +### Adapter Pattern (src/adapters/) + +- **LangChain adapter** (src/adapters/langchain.ts): + - `toSlimModel()`: Wraps LangChain chat models + - `compressLangChainHistory()`: One-call compression helper + - Message format conversions between LangChain and slimcontext + +## File Structure + +``` +src/ +├── index.ts # Main exports +├── interfaces.ts # Core type definitions +├── strategies/ # Compression implementations +│ ├── common.ts # Shared utilities and defaults +│ ├── trim.ts # Token-based trimming strategy +│ └── summarize.ts # AI-powered summarization strategy +└── adapters/ # Framework integrations + └── langchain.ts # LangChain integration and helpers + +tests/ # vitest test files +examples/ # Markdown documentation only +dist/ # Compiled output (generated) +``` + +## Testing + +- **Framework**: vitest +- **Coverage**: TrimCompressor, SummarizeCompressor, LangChain adapter +- **Test files**: tests/\*.test.ts corresponding to src/ structure + +## Build Configuration + +- **TypeScript**: Target ES2020, output CommonJS modules +- **Output**: dist/ directory with .js and .d.ts files +- **Exports**: Main library + separate LangChain adapter path +- **ESLint**: TypeScript rules with import ordering and unused import detection + +## Important Notes + +- The `prepare` script automatically runs build after npm install +- Examples directory contains documentation only, not executable code +- LangChain integration is optional (peer dependency) +- All compression strategies share token budget configuration pattern diff --git a/README.md b/README.md index 3fd8a91..fe3a0b3 100644 --- a/README.md +++ b/README.md @@ -4,10 
+4,24 @@ Lightweight, model-agnostic chat history compression utilities for AI assistants ![CI](https://github.com/agentailor/slimcontext/actions/workflows/ci.yml/badge.svg) +## Supported Strategies + +### Trimming + +Simple token-based compression that removes the oldest messages when your conversation exceeds the token threshold. Always preserves system messages and the most recent messages to maintain context continuity. + +![Trimming Strategy](docs/images/trimming-strategy.png) + +### Summarization + +AI-powered compression that uses your own chat model to create concise summaries of older conversation segments. The summary is injected as a system message, preserving the conversation flow while drastically reducing token usage. + +![Summarization Strategy](docs/images/summarization-strategy.png) + ## Examples - OpenAI: see [examples/OPENAI_EXAMPLE.md](https://github.com/agentailor/slimcontext/blob/main/examples/OPENAI_EXAMPLE.md) (copy-paste snippet; BYOM, no deps added here). -- LangChain: see [examples/LANGCHAIN_EXAMPLE.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_EXAMPLE.md) and [examples/LANGCHAIN_COMPRESS_HISTORY.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_COMPRESS_HISTORY.md). +- LangChain: see [examples/LANGCHAIN_COMPRESS_HISTORY.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_COMPRESS_HISTORY.md). ## Features @@ -128,7 +142,6 @@ You can chain strategies depending on token thresholds or other heuristics. ## Example Integration - See [examples/OPENAI_EXAMPLE.md](https://github.com/agentailor/slimcontext/blob/main/examples/OPENAI_EXAMPLE.md) for an OpenAI copy-paste snippet. -- See [examples/LANGCHAIN_EXAMPLE.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_EXAMPLE.md) for a LangChain-style integration. 
- See [examples/LANGCHAIN_COMPRESS_HISTORY.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_COMPRESS_HISTORY.md) for a one-call LangChain history compression helper. ## Adapters diff --git a/docs/images/summarization-strategy.png b/docs/images/summarization-strategy.png new file mode 100644 index 0000000..cd47965 Binary files /dev/null and b/docs/images/summarization-strategy.png differ diff --git a/docs/images/trimming-strategy.png b/docs/images/trimming-strategy.png new file mode 100644 index 0000000..9b7ab98 Binary files /dev/null and b/docs/images/trimming-strategy.png differ diff --git a/examples/LANGCHAIN_COMPRESS_HISTORY.md b/examples/LANGCHAIN_COMPRESS_HISTORY.md index cc0b212..c880e3a 100644 --- a/examples/LANGCHAIN_COMPRESS_HISTORY.md +++ b/examples/LANGCHAIN_COMPRESS_HISTORY.md @@ -12,32 +12,79 @@ const llm = new ChatOpenAI({ model: 'gpt-5-mini', temperature: 0 }); // 2) Build your existing LangChain-compatible history const history = [ - new SystemMessage('You are a helpful assistant.'), - new HumanMessage('Hi! Help me plan a 3-day trip to Tokyo.'), - new AIMessage('Sure, what are your interests?'), - // ... many more messages + new SystemMessage( + "You are a helpful AI assistant. The user's name is Bob and he wants to plan a trip to Paris.", + ), + new HumanMessage('Hi, can you help me plan a trip?'), + new AIMessage('Of course, Bob! Where are you thinking of going?'), + new HumanMessage('I want to go to Paris.'), + new AIMessage('Great choice! Paris is a beautiful city. What do you want to do there?'), + new HumanMessage('I want to visit the Eiffel Tower and the Louvre.'), + new AIMessage('Those are two must-see attractions! Do you have a preferred time to visit?'), + new HumanMessage('I was thinking sometime in June.'), + new AIMessage('June is a great time to visit Paris! 
The weather is usually pleasant.'), + new HumanMessage('What about flights?'), + new AIMessage('I found some great flight options for you.'), + new HumanMessage('Can you show me the details?'), + new AIMessage( + 'Here are the details for the flights I found:\n\n- Flight 1: Departing June 1st, 10:00 AM\n- Flight 2: Departing June 2nd, 2:00 PM\n- Flight 3: Departing June 3rd, 5:00 PM', + ), + new HumanMessage( + "I like the second flight option. I will check it out later, let's talk about hotels.", + ), + new AIMessage( + 'The best hotel options in Paris are:\n\n- Hôtel de Ville\n- Le Meurice\n- Hôtel Plaza Athénée', + ), + // ...imagine more messages here about restaurants, activities, booking details... + // This is our latest message + new HumanMessage('Okay, can you summarize the whole plan for me in a bulleted list?'), ]; +console.log('Original size:', history.length); + // 3) Compress with either summarize (default) or trim strategy const compact = await langchain.compressLangChainHistory(history, { - strategy: 'summarize', - llm, // pass your BaseChatModel - maxModelTokens: 8192, - thresholdPercent: 0.8, - minRecentMessages: 4, + strategy: 'summarize', // Use AI summarization strategy + llm, // pass your BaseChatModel for generating summaries + maxModelTokens: 200, // Your model's context window size + thresholdPercent: 0.8, // Trigger compression when 80% of tokens are used + minRecentMessages: 4, // Always keep the last 4 messages untouched }); // Alternatively, use trimming without an LLM: const trimmed = await langchain.compressLangChainHistory(history, { - strategy: 'trim', - maxModelTokens: 8192, - thresholdPercent: 0.8, - minRecentMessages: 4, + strategy: 'trim', // Simple trimming strategy (no AI needed) + maxModelTokens: 200, // Your model's context window size + thresholdPercent: 0.8, // Trigger compression when 80% of tokens are used + minRecentMessages: 4, // Always keep the last 4 messages untouched }); -console.log('Original size:', 
history.length); console.log('Summarized size:', compact.length); console.log('Trimmed size:', trimmed.length); +console.log('Compressed messages:', compact); +``` + +After running this, you'll have something like this as output (for the summarization): + +```ts +[ + new SystemMessage( + "You are a helpful AI assistant. The user's name is Bob and he wants to plan a trip to Paris.", + ), + new SystemMessage( + 'User (Bob) wants help planning a trip to Paris in June to visit the Eiffel Tower and the Louvre.\nAssistant confirmed June is a good time and reported finding flight options.\nUser asked the assistant to show the flight details.', + ), + new AIMessage( + 'Here are the details for the flights I found:\n\n- Flight 1: Departing June 1st, 10:00 AM\n- Flight 2: Departing June 2nd, 2:00 PM\n- Flight 3: Departing June 3rd, 5:00 PM', + ), + new HumanMessage( + "I like the second flight option. I will check it out later, let's talk about hotels.", + ), + new AIMessage( + 'The best hotel options in Paris are:\n\n- Hôtel de Ville\n- Le Meurice\n- Hôtel Plaza Athénée', + ), + new HumanMessage('Okay, can you summarize the whole plan for me in a bulleted list?'), +]; ``` Notes diff --git a/examples/LANGCHAIN_EXAMPLE.md b/examples/LANGCHAIN_EXAMPLE.md deleted file mode 100644 index 7d1260f..0000000 --- a/examples/LANGCHAIN_EXAMPLE.md +++ /dev/null @@ -1,58 +0,0 @@ -# LangChain-style summarization example (copy-paste) - -This library is framework-agnostic. Here’s how you might adapt a LangChain-style chat model to `SlimContextChatModel` and use `SummarizeCompressor`. 
- -```ts -import { - SummarizeCompressor, - type SlimContextChatModel, - type SlimContextMessage, - type SlimContextModelResponse, -} from 'slimcontext'; -import { ChatOpenAI } from '@langchain/openai'; // or any LangChain chat model - -// Create a LangChain model (reads from env, e.g., OPENAI_API_KEY) -const lc = new ChatOpenAI({ model: 'gpt-5-mini', temperature: 0 }); - -class LangChainModel implements SlimContextChatModel { - async invoke(messages: SlimContextMessage[]): Promise { - // Map slimcontext messages to LangChain's format - const lcMessages = messages.map((m) => { - const role = m.role === 'human' ? 'user' : m.role; - return { role, content: m.content } as { - role: 'system' | 'user' | 'assistant'; - content: string; - }; - }); - - const res = await lc.invoke(lcMessages); - const content = - typeof res?.content === 'string' - ? res.content - : Array.isArray(res?.content) - ? res.content.map((c: any) => c?.text ?? '').join('\n') - : ''; - return { content }; - } -} - -async function compress(history: SlimContextMessage[]) { - const summarize = new SummarizeCompressor({ - model: new LangChainModel(), - maxModelTokens: 8192, - thresholdPercent: 0.75, - minRecentMessages: 4, - }); - return summarize.compress(history); -} - -// Usage example: -// const history: SlimContextMessage[] = [ { role: 'system', content: 'You are helpful' }, ... ]; -// const compact = await compress(history); -``` - -Notes: - -- Choose any LangChain chat model; `ChatOpenAI` is just an example. -- Make sure to map roles properly (convert `'human'` to `'user'` if it appears). -- Keep the first system message, slimcontext will insert a summary when over the threshold. 
diff --git a/examples/OPENAI_EXAMPLE.md b/examples/OPENAI_EXAMPLE.md index 98fdb45..fbc6e17 100644 --- a/examples/OPENAI_EXAMPLE.md +++ b/examples/OPENAI_EXAMPLE.md @@ -28,34 +28,98 @@ class OpenAIModel implements SlimContextChatModel { async function main() { const history: SlimContextMessage[] = [ - { role: 'system', content: 'You are a helpful assistant.' }, - { role: 'user', content: 'Hello' }, - // ... conversation grows + { + role: 'system', + content: + "You are a helpful AI assistant. The user's name is Bob and he wants to plan a trip to Paris.", + }, + { role: 'user', content: 'Hi, can you help me plan a trip?' }, + { role: 'assistant', content: 'Of course, Bob! Where are you thinking of going?' }, + { role: 'user', content: 'I want to go to Paris.' }, + { + role: 'assistant', + content: 'Great choice! Paris is a beautiful city. What do you want to do there?', + }, + { role: 'user', content: 'I want to visit the Eiffel Tower and the Louvre.' }, + { + role: 'assistant', + content: 'Those are two must-see attractions! Do you have a preferred time to visit?', + }, + { role: 'user', content: 'I was thinking sometime in June.' }, + { + role: 'assistant', + content: 'June is a great time to visit Paris! The weather is usually pleasant.', + }, + { role: 'user', content: 'What about flights?' }, + { role: 'assistant', content: 'I found some great flight options for you.' }, + { role: 'user', content: 'Can you show me the details?' }, + { + role: 'assistant', + content: + 'Here are the details for the flights I found:\n\n- Flight 1: Departing June 1st, 10:00 AM\n- Flight 2: Departing June 2nd, 2:00 PM\n- Flight 3: Departing June 3rd, 5:00 PM', + }, + { + role: 'user', + content: + "I like the second flight option. 
I will check it out later, let's talk about hotels.", + }, + { + role: 'assistant', + content: + 'The best hotel options in Paris are:\n\n- Hôtel de Ville\n- Le Meurice\n- Hôtel Plaza Athénée', + }, + // ...imagine 50 more messages here about flights, hotels, restaurants... + // This is our latest message + { role: 'user', content: 'Okay, can you summarize the whole plan for me in a bulleted list?' }, ]; - + console.log('Original messages length:', history.length); const summarize = new SummarizeCompressor({ - model: new OpenAIModel(), - maxModelTokens: 128000, - thresholdPercent: 0.8, - minRecentMessages: 4, + model: new OpenAIModel(), // Your BYOM implementation for generating summaries + maxModelTokens: 200, // Your model's context window size + thresholdPercent: 0.8, // Trigger compression when 80% of tokens are used + minRecentMessages: 4, // Always keep the last 4 messages untouched }); const compressed = await summarize.compress(history); + console.log('Compressed messages length:', compressed.length); - const completion = await client.chat.completions.create({ - model: 'gpt-5-mini', - messages: compressed - .filter((m) => m.role !== 'tool') - .map((m) => ({ role: m.role as 'system' | 'user' | 'assistant', content: m.content })), - }); - - console.log(completion.choices?.[0]?.message?.content ?? ''); + console.log('Compressed messages:', compressed); } - main(); ``` +After running this, you'll have something like this as output: + +```JSON +[ + { + "role": "system", + "content": "You are a helpful AI assistant. The user's name is Bob and he wants to plan a trip to Paris." + }, + { + "role": "system", + "content": "User (Bob) wants help planning a trip to Paris in June to visit the Eiffel Tower and the Louvre.\nAssistant confirmed June is a good time and reported finding flight options.\nUser asked the assistant to show the flight details." 
+ },  + {  + "role": "assistant",  + "content": "Here are the details for the flights I found:\n\n- Flight 1: Departing June 1st, 10:00 AM\n- Flight 2: Departing June 2nd, 2:00 PM\n- Flight 3: Departing June 3rd, 5:00 PM"  + },  + {  + "role": "user",  + "content": "I like the second flight option. I will check it out later, let's talk about hotels."  + },  + {  + "role": "assistant",  + "content": "The best hotel options in Paris are:\n\n- Hôtel de Ville\n- Le Meurice\n- Hôtel Plaza Athénée"  + },  + {  + "role": "user",  + "content": "Okay, can you summarize the whole plan for me in a bulleted list?"  + }  +]  +```  +  Notes:  - Set `OPENAI_API_KEY` in your environment.  -- Pick any chat model available in your account (e.g., `gpt-4o`, `gpt-4.1-mini`, etc.).  +- Pick any chat model available in your account (e.g., `gpt-4o`, `gpt-5.1-mini`, etc.).  - You can swap OpenAI for any provider by implementing `SlimContextChatModel`.  diff --git a/package.json b/package.json  index 71fa999..b32e78c 100644  --- a/package.json  +++ b/package.json  @@ -1,6 +1,6 @@  {  "name": "slimcontext",  - "version": "2.1.1",  + "version": "2.1.2",  "description": "Lightweight, model-agnostic chat history compression (trim + summarize) for AI assistants.",  "main": "dist/index.js",  "types": "dist/index.d.ts",