Merged
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,20 @@ All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

## [2.1.2] - 2025-09-06

### Added

- Visual documentation with strategy diagrams showing trimming and summarization workflows
- "Supported Strategies" section in README with visual explanations of trimming and summarization
- Project guidance documentation for Claude Code in CLAUDE.md

### Changed

- Enhanced examples with more comprehensive usage patterns and clearer explanations
- Consolidated LangChain examples: removed separate LANGCHAIN_EXAMPLE.md in favor of the more focused LANGCHAIN_COMPRESS_HISTORY.md
- Updated OPENAI_EXAMPLE.md with improved code samples and workflow explanations

## [2.1.1] - 2025-09-04

### Fixed
121 changes: 121 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,121 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**slimcontext** is a TypeScript npm library for chat history compression in AI assistants. It provides model-agnostic strategies to keep conversations concise while preserving context using a "Bring Your Own Model" (BYOM) approach.

## Development Commands

### Package Management

Use **pnpm** as the primary package manager (version 10.14.0), with npm as fallback:

```bash
pnpm install # Primary method
npm install # Fallback if pnpm unavailable
```

### Core Development Workflow

```bash
pnpm run build # Compile TypeScript to dist/
pnpm run test # Run vitest tests
pnpm run test:watch # Run tests in watch mode
pnpm run lint # ESLint with TypeScript rules
pnpm run lint:fix # Auto-fix linting issues
pnpm run format # Format code with Prettier
pnpm run format:check # Check formatting without changes
```

### Required Validation Sequence

Always run these commands after making changes:

```bash
pnpm run test # All tests must pass
pnpm run lint # No linting errors
pnpm run format:check # Code must be formatted
pnpm run build # Must compile successfully
```

## Architecture

### Core Design Principles

- **Model-agnostic**: Core library has zero runtime dependencies
- **Token-aware**: Uses configurable token budgets with threshold-based compression
- **Framework-independent**: Optional adapters (currently LangChain) without core dependencies
- **Message preservation**: Always preserves system messages and recent conversation tail

### Key Interfaces (src/interfaces.ts)

- `SlimContextMessage`: Standard message format with role and content
- `SlimContextChatModel`: BYOM interface requiring only `invoke(messages) -> response`
- `SlimContextCompressor`: Strategy interface for compression implementations
- `TokenBudgetConfig`: Shared configuration for token-threshold behavior
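Based on the descriptions above, the interfaces could be sketched roughly as follows. The names come from the docs, but the exact fields and signatures here are illustrative assumptions, not the published API:

```typescript
// Hypothetical sketch of the core interfaces -- field names beyond
// those stated in the docs are illustrative assumptions.
type SlimContextRole = 'system' | 'user' | 'assistant';

interface SlimContextMessage {
  role: SlimContextRole;
  content: string;
}

interface SlimContextModelResponse {
  content: string;
}

// BYOM: anything that can turn messages into a response qualifies.
interface SlimContextChatModel {
  invoke(messages: SlimContextMessage[]): Promise<SlimContextModelResponse>;
}

// Strategy interface: a compressor maps a history to a shorter history.
interface SlimContextCompressor {
  compress(messages: SlimContextMessage[]): Promise<SlimContextMessage[]>;
}

// A minimal BYOM model wrapping a plain async function.
const echoModel: SlimContextChatModel = {
  async invoke(messages) {
    return { content: `summary of ${messages.length} messages` };
  },
};
```

Because the model contract is just `invoke(messages) -> response`, any provider SDK can be adapted with a few lines of glue code.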

### Compression Strategies (src/strategies/)

**TrimCompressor** (src/strategies/trim.ts):

- Drops oldest non-system messages when over token threshold
- Preserves system messages and recent tail (configurable `minRecentMessages`)
- Simple token-based pruning without AI model dependency
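The trimming rule above can be sketched as follows. This is an illustrative reimplementation, not the library's actual code; the character-based token estimate is a simplifying assumption:

```typescript
// Illustrative sketch of token-based trimming: drop oldest non-system
// messages while over budget, preserving system messages and the tail.
interface Msg {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Crude token estimate: roughly 4 characters per token.
const estimateTokens = (msgs: Msg[]): number =>
  msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);

function trim(msgs: Msg[], maxTokens: number, minRecent: number): Msg[] {
  const system = msgs.filter((m) => m.role === 'system');
  const kept = msgs.filter((m) => m.role !== 'system');
  // Drop oldest non-system messages until under budget,
  // but never shrink below the protected recent tail.
  while (
    estimateTokens([...system, ...kept]) > maxTokens &&
    kept.length > minRecent
  ) {
    kept.shift();
  }
  return [...system, ...kept];
}
```

Note that no AI model is involved: trimming is a pure function of the message list and the token budget.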

**SummarizeCompressor** (src/strategies/summarize.ts):

- Uses provided chat model to summarize old conversation segments
- Injects summary as system message before preserved recent messages
- Complex message alignment and boundary detection logic
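The summarize-and-inject flow can be sketched like this. The real compressor's boundary detection is more involved; here the split point is simply `minRecent` from the end, and the model is a stub:

```typescript
// Hedged sketch of the summarization flow: summarize the old segment
// with the provided model, then inject the summary as a system message.
interface SMsg {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

type Model = { invoke(msgs: SMsg[]): Promise<{ content: string }> };

async function summarizeCompress(
  msgs: SMsg[],
  model: Model,
  minRecent: number,
): Promise<SMsg[]> {
  const system = msgs.filter((m) => m.role === 'system');
  const rest = msgs.filter((m) => m.role !== 'system');
  if (rest.length <= minRecent) return msgs; // nothing old enough to summarize
  const old = rest.slice(0, rest.length - minRecent);
  const recent = rest.slice(rest.length - minRecent);
  const { content: summary } = await model.invoke(old);
  // Summary is injected as a system message before the preserved tail.
  return [...system, { role: 'system', content: summary }, ...recent];
}
```

Any `SlimContextChatModel`-compatible object can play the `model` role, which is what makes the strategy BYOM.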

**Shared utilities** (src/strategies/common.ts):

- Token estimation defaults and configuration normalization
- `DEFAULT_MAX_MODEL_TOKENS = 8192`, `DEFAULT_THRESHOLD_PERCENT = 0.7`
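The threshold behavior implied by these defaults can be sketched as a simple predicate (an assumption about the check's shape, not the library's exact code): with the defaults, compression only triggers once the estimated token count exceeds 8192 × 0.7 ≈ 5734 tokens.

```typescript
// Sketch of the token-threshold check implied by the defaults above.
const DEFAULT_MAX_MODEL_TOKENS = 8192;
const DEFAULT_THRESHOLD_PERCENT = 0.7;

function shouldCompress(
  estimatedTokens: number,
  maxModelTokens = DEFAULT_MAX_MODEL_TOKENS,
  thresholdPercent = DEFAULT_THRESHOLD_PERCENT,
): boolean {
  // Compress only once usage crosses the configured fraction of the budget.
  return estimatedTokens > maxModelTokens * thresholdPercent;
}
```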

### Adapter Pattern (src/adapters/)

- **LangChain adapter** (src/adapters/langchain.ts):
- `toSlimModel()`: Wraps LangChain chat models
- `compressLangChainHistory()`: One-call compression helper
- Message format conversions between LangChain and slimcontext

## File Structure

```
src/
├── index.ts # Main exports
├── interfaces.ts # Core type definitions
├── strategies/ # Compression implementations
│ ├── common.ts # Shared utilities and defaults
│ ├── trim.ts # Token-based trimming strategy
│ └── summarize.ts # AI-powered summarization strategy
└── adapters/ # Framework integrations
└── langchain.ts # LangChain integration and helpers

tests/ # vitest test files
examples/ # Markdown documentation only
dist/ # Compiled output (generated)
```

## Testing

- **Framework**: vitest
- **Coverage**: TrimCompressor, SummarizeCompressor, LangChain adapter
- **Test files**: tests/*.test.ts corresponding to src/ structure

## Build Configuration

- **TypeScript**: Target ES2020, output CommonJS modules
- **Output**: dist/ directory with .js and .d.ts files
- **Exports**: Main library + separate LangChain adapter path
- **ESLint**: TypeScript rules with import ordering and unused import detection

## Important Notes

- The `prepare` script automatically runs build after npm install
- Examples directory contains documentation only, not executable code
- LangChain integration is optional (peer dependency)
- All compression strategies share token budget configuration pattern
17 changes: 15 additions & 2 deletions README.md
@@ -4,10 +4,24 @@ Lightweight, model-agnostic chat history compression utilities for AI assistants

![CI](https://github.com/agentailor/slimcontext/actions/workflows/ci.yml/badge.svg)

## Supported Strategies

### Trimming

Simple token-based compression that removes the oldest messages when your conversation exceeds the token threshold. Always preserves system messages and the most recent messages to maintain context continuity.

![Trimming Strategy](docs/images/trimming-strategy.png)

### Summarization

AI-powered compression that uses your own chat model to create concise summaries of older conversation segments. The summary is injected as a system message, preserving the conversation flow while drastically reducing token usage.

![Summarization Strategy](docs/images/summarization-strategy.png)

## Examples

- OpenAI: see [examples/OPENAI_EXAMPLE.md](https://github.com/agentailor/slimcontext/blob/main/examples/OPENAI_EXAMPLE.md) (copy-paste snippet; BYOM, no deps added here).
- LangChain: see [examples/LANGCHAIN_EXAMPLE.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_EXAMPLE.md) and [examples/LANGCHAIN_COMPRESS_HISTORY.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_COMPRESS_HISTORY.md).
- LangChain: see [examples/LANGCHAIN_COMPRESS_HISTORY.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_COMPRESS_HISTORY.md).

## Features

@@ -128,7 +142,6 @@ You can chain strategies depending on token thresholds or other heuristics.
## Example Integration

- See [examples/OPENAI_EXAMPLE.md](https://github.com/agentailor/slimcontext/blob/main/examples/OPENAI_EXAMPLE.md) for an OpenAI copy-paste snippet.
- See [examples/LANGCHAIN_EXAMPLE.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_EXAMPLE.md) for a LangChain-style integration.
- See [examples/LANGCHAIN_COMPRESS_HISTORY.md](https://github.com/agentailor/slimcontext/blob/main/examples/LANGCHAIN_COMPRESS_HISTORY.md) for a one-call LangChain history compression helper.

## Adapters
Binary file added docs/images/summarization-strategy.png
Binary file added docs/images/trimming-strategy.png
75 changes: 61 additions & 14 deletions examples/LANGCHAIN_COMPRESS_HISTORY.md
@@ -12,32 +12,79 @@ const llm = new ChatOpenAI({ model: 'gpt-5-mini', temperature: 0 });

// 2) Build your existing LangChain-compatible history
const history = [
new SystemMessage('You are a helpful assistant.'),
new HumanMessage('Hi! Help me plan a 3-day trip to Tokyo.'),
new AIMessage('Sure, what are your interests?'),
// ... many more messages
new SystemMessage(
"You are a helpful AI assistant. The user's name is Bob and he wants to plan a trip to Paris.",
),
new HumanMessage('Hi, can you help me plan a trip?'),
new AIMessage('Of course, Bob! Where are you thinking of going?'),
new HumanMessage('I want to go to Paris.'),
new AIMessage('Great choice! Paris is a beautiful city. What do you want to do there?'),
new HumanMessage('I want to visit the Eiffel Tower and the Louvre.'),
new AIMessage('Those are two must-see attractions! Do you have a preferred time to visit?'),
new HumanMessage('I was thinking sometime in June.'),
new AIMessage('June is a great time to visit Paris! The weather is usually pleasant.'),
new HumanMessage('What about flights?'),
new AIMessage('I found some great flight options for you.'),
new HumanMessage('Can you show me the details?'),
new AIMessage(
'Here are the details for the flights I found:\n\n- Flight 1: Departing June 1st, 10:00 AM\n- Flight 2: Departing June 2nd, 2:00 PM\n- Flight 3: Departing June 3rd, 5:00 PM',
),
new HumanMessage(
"I like the second flight option. I will check it out later, let's talk about hotels.",
),
new AIMessage(
'The best hotel options in Paris are:\n\n- Hôtel de Ville\n- Le Meurice\n- Hôtel Plaza Athénée',
),
// ...imagine more messages here about restaurants, activities, booking details...
// This is our latest message
new HumanMessage('Okay, can you summarize the whole plan for me in a bulleted list?'),
];

console.log('Original size:', history.length);

// 3) Compress with either summarize (default) or trim strategy
const compact = await langchain.compressLangChainHistory(history, {
strategy: 'summarize',
llm, // pass your BaseChatModel
maxModelTokens: 8192,
thresholdPercent: 0.8,
minRecentMessages: 4,
strategy: 'summarize', // Use AI summarization strategy
llm, // pass your BaseChatModel for generating summaries
  maxModelTokens: 200, // Deliberately small budget for this demo (normally your model's context window)
thresholdPercent: 0.8, // Trigger compression when 80% of tokens are used
minRecentMessages: 4, // Always keep the last 4 messages untouched
});

// Alternatively, use trimming without an LLM:
const trimmed = await langchain.compressLangChainHistory(history, {
strategy: 'trim',
maxModelTokens: 8192,
thresholdPercent: 0.8,
minRecentMessages: 4,
strategy: 'trim', // Simple trimming strategy (no AI needed)
  maxModelTokens: 200, // Deliberately small budget for this demo (normally your model's context window)
thresholdPercent: 0.8, // Trigger compression when 80% of tokens are used
minRecentMessages: 4, // Always keep the last 4 messages untouched
});

console.log('Summarized size:', compact.length);
console.log('Trimmed size:', trimmed.length);
console.log('Compressed messages:', compact);
```

After running this, the summarization strategy produces output similar to the following:

```ts
[
new SystemMessage(
"You are a helpful AI assistant. The user's name is Bob and he wants to plan a trip to Paris.",
),
new SystemMessage(
'User (Bob) wants help planning a trip to Paris in June to visit the Eiffel Tower and the Louvre.\nAssistant confirmed June is a good time and reported finding flight options.\nUser asked the assistant to show the flight details.',
),
new AIMessage(
'Here are the details for the flights I found:\n\n- Flight 1: Departing June 1st, 10:00 AM\n- Flight 2: Departing June 2nd, 2:00 PM\n- Flight 3: Departing June 3rd, 5:00 PM',
),
new HumanMessage(
"I like the second flight option. I will check it out later, let's talk about hotels.",
),
new AIMessage(
'The best hotel options in Paris are:\n\n- Hôtel de Ville\n- Le Meurice\n- Hôtel Plaza Athénée',
),
new HumanMessage('Okay, can you summarize the whole plan for me in a bulleted list?'),
];
```

Notes
58 changes: 0 additions & 58 deletions examples/LANGCHAIN_EXAMPLE.md

This file was deleted.
