From d68344fd78bcbd2cd2a556e528bae9316f2c1bbc Mon Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 11:34:39 +0200 Subject: [PATCH 1/6] feat: token-threshold strategies + shared utilities; docs/examples updated; version 2.2.0 --- .github/copilot-instructions.md | 50 ++++---- CHANGELOG.md | 28 ++++- README.md | 53 +++++--- examples/LANGCHAIN_COMPRESS_HISTORY.md | 10 +- examples/LANGCHAIN_EXAMPLE.md | 7 +- examples/OPENAI_EXAMPLE.md | 7 +- package.json | 4 +- src/adapters/langchain.ts | 15 ++- src/interfaces.ts | 15 +++ src/strategies/common.ts | 48 ++++++++ src/strategies/summarize.ts | 164 +++++++++++-------------- src/strategies/trim.ts | 68 +++++++--- tests/langchain.test.ts | 18 ++- tests/summarize.test.ts | 117 ++++-------------- tests/trim.test.ts | 29 +++-- 15 files changed, 362 insertions(+), 271 deletions(-) create mode 100644 src/strategies/common.ts diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index a8825aa..58d10a6 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -7,8 +7,6 @@ ### High-Level Repository Information - **Type**: TypeScript npm library/package -- **Source Files**: ~24 -- **Installed Size**: ~106MB (including all dependencies in node_modules) - **Languages**: TypeScript (primary), JavaScript (compiled output) - **Target Runtime**: Node.js (CommonJS modules) - **Framework**: Model-agnostic core; optional adapters (LangChain) @@ -21,7 +19,7 @@ ### Prerequisites and Environment Setup -- **Node.js**: Version 20+ (as specified in the `engines` field of package.json) +- **Node.js**: Version 20+ (recommended) - **Package Manager**: pnpm 10.14.0 (preferred) or npm (fallback) ### Critical Build Steps (Always Follow This Order) @@ -41,42 +39,39 @@ 2. **Build the Project** ```bash - npm run build + pnpm run build # Compiles TypeScript to dist/ directory - # Duration: ~5-10 seconds ``` 3. 
**Run Tests** ```bash - npm run test + pnpm run test # Runs all vitest tests - # Duration: ~5-10 seconds - # Should show: "Test Files 3 passed (3), Tests 16 passed (16)" + # All tests should pass ``` 4. **Format Code** ```bash - npm run format # Auto-format code - npm run format:check # Check formatting without changes + pnpm run format # Auto-format code + pnpm run format:check # Check formatting without changes ``` 5. **Lint Code (Known Issue)** ```bash - npm run lint + pnpm run lint ``` - **KNOWN ISSUE**: ESLint currently fails with "parserOptions.tsconfigRootDir must be an absolute path" error. This is a configuration bug but doesn't affect build or tests. The code itself is properly linted in CI environment. ### Complete Development Workflow ```bash # Clean start (recommended for agents): rm -rf node_modules dist -npm install # Always use npm for reliability -npm run test # Verify tests pass -npm run format:check # Verify formatting -npm run build # Final build +pnpm install # Always use pnpm for reliability +pnpm run test # Verify tests pass +pnpm run format:check # Verify formatting +pnpm run build # Final build ``` ### CI/CD Pipeline Validation @@ -103,8 +98,9 @@ The repository uses GitHub Actions CI that runs: │ ├── adapters/ # Integration adapters (optional) │ │ └── langchain.ts # LangChain adapter + helpers (compressLangChainHistory, toSlimModel) │ └── strategies/ # Compression strategy implementations -│ ├── trim.ts # TrimCompressor: keeps first + last N messages -│ └── summarize.ts # SummarizeCompressor: AI-powered summarization +│ ├── common.ts # Shared token-budget utilities & defaults (thresholds, estimator) +│ ├── trim.ts # TrimCompressor: token-threshold trimming (preserve system + recent) +│ └── summarize.ts # SummarizeCompressor: token-threshold summarization (inject summary) ├── tests/ # vitest test files │ ├── trim.test.ts # Tests for TrimCompressor │ ├── summarize.test.ts # Tests for SummarizeCompressor @@ -132,11 +128,19 @@ The repository 
uses GitHub Actions CI that runs: - `SlimContextMessage`: Standard message format with role ('system'|'user'|'assistant'|'tool'|'human') and content - `SlimContextChatModel`: BYOM interface requiring only `invoke(messages) -> response` - `SlimContextCompressor`: Strategy interface for compression implementations +- `TokenEstimator`: `(message) => number` callback used for model-agnostic token budgeting **Compression Strategies**: -- **TrimCompressor**: Simple strategy keeping first (system) message + last N-1 messages -- **SummarizeCompressor**: AI-powered strategy that summarizes middle conversations when exceeding maxMessages +- Token-threshold based design using the model’s max token window and a configurable threshold (default 70%). +- Shared config shape (TokenBudgetConfig): `{ maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages? }`. +- **TrimCompressor**: Drops the oldest non-system messages until estimated tokens fall below the threshold, while always preserving any system message(s) and at least the most recent `minRecentMessages`. +- **SummarizeCompressor**: When over threshold, summarizes all messages before the recent tail (excluding the leading system message if present) and inserts a synthetic system summary just before the preserved recent messages. + +**Shared Utilities** (src/strategies/common.ts): + +- Defaults: `DEFAULT_MAX_MODEL_TOKENS = 8192`, `DEFAULT_THRESHOLD_PERCENT = 0.7`, `DEFAULT_MIN_RECENT_MESSAGES = 2`. +- Estimator: `DEFAULT_ESTIMATOR` (~`len/4 + 2`) plus `computeThresholdTokens`, `normalizeBudgetConfig` helpers. **Framework Independence**: Core library has no framework dependencies. An optional LangChain adapter is provided for convenience; core remains BYOM. 
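The estimator and threshold defaults described above are easy to sanity-check in isolation. A minimal sketch that re-implements the documented heuristic (illustrative only; the real helpers live in `src/strategies/common.ts`):

```typescript
// Sketch of the documented defaults: ~len/4 + 2 tokens per message, 70% threshold.
type Msg = { role: string; content: string };

const estimate = (m: Msg): number => Math.ceil(m.content.length / 4) + 2;
const thresholdTokens = (maxModelTokens: number, thresholdPercent: number): number =>
  Math.floor(maxModelTokens * thresholdPercent);

const history: Msg[] = [
  { role: 'system', content: 'You are a helpful assistant.' }, // 28 chars -> 9 tokens
  { role: 'user', content: 'Hello!' }, // 6 chars -> 4 tokens
];

const total = history.reduce((sum, m) => sum + estimate(m), 0); // 13
const threshold = thresholdTokens(8192, 0.7); // 5734
const shouldCompress = total > threshold; // false: well under budget
```

With the defaults, compression only triggers once the estimated history exceeds 5734 tokens; a custom `estimateTokens` can replace the heuristic for models with very different tokenizers.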
@@ -153,8 +157,8 @@ The repository uses GitHub Actions CI that runs: ### Running Tests ```bash -npm run test -# Expects: ~16 tests across 3 files, all passing +pnpm run test +# Expects: All tests to pass # Tests cover TrimCompressor, SummarizeCompressor, and the LangChain adapter/helper ``` @@ -196,7 +200,7 @@ npm run test - CommonJS: `const { langchain } = require('slimcontext')` - ESM/TypeScript: `import * as slim from 'slimcontext'; const { langchain } = slim;` - Note: `import { langchain } from 'slimcontext'` may not work in all environments due to CJS/ESM interop. Prefer one of the patterns above. - - Includes a one-call history helper: `compressLangChainHistory(history, options)` + - Includes a one-call history helper: `compressLangChainHistory(history, options)` where `options` accepts the token-threshold fields (`maxModelTokens`, `thresholdPercent`, `estimateTokens`, `minRecentMessages`) and either `strategy: 'trim'` or `strategy: 'summarize'` with `llm` for the latter. --- diff --git a/CHANGELOG.md b/CHANGELOG.md index e1cbfec..714473d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -26,6 +26,32 @@ The format is based on Keep a Changelog, and this project adheres to Semantic Ve - The adapter treats LangChain `tool` messages as `assistant` during compression. - `@langchain/core` is an optional peer dependency; only needed if you use the adapter. +## [2.2.0] - 2025-08-28 + +### Breaking + +- Strategies are now token-threshold based instead of message-count based. + - `TrimCompressor({ messagesToKeep })` replaced by `TrimCompressor({ maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages? })`. + - `SummarizeCompressor({ model, maxMessages, ... })` replaced by `SummarizeCompressor({ model, maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages?, prompt? })`. + +### Migration + +- Provide your model’s context window via `maxModelTokens` (optional; defaults to 8192). 
+- Choose a `thresholdPercent` (0–1) at which to trigger compression (default 0.7; recommended 0.8–0.9 for aggressive usage). +- Optional: pass a custom `estimateTokens` to better approximate token usage. +- Optional: tune `minRecentMessages` (trim: default 2, summarize: default 4). +- Update adapter/example usages accordingly (README and examples have been updated). + +### Changed + +- Trim: when total estimated tokens exceed threshold, drop oldest non-system messages until under threshold, preserving system messages and the most recent messages. +- Summarize: when over threshold, summarize the oldest portion (excluding a leading system message) and insert a synthetic system summary before recent messages. + +### Added + +- `TokenEstimator` type for custom token estimation. +- Docs and examples updated to reflect token-based configuration. + ## [2.0.0] - 2025-08-24 ### Breaking @@ -41,7 +67,7 @@ The format is based on Keep a Changelog, and this project adheres to Semantic Ve - `IChatModel` -> `SlimContextChatModel` - `ICompressor` -> `SlimContextCompressor` -Migration notes: +### Migration - Import and use `SlimContextMessage` everywhere you previously used `Message` or `BaseMessage`. - Update any custom `IChatModel` implementations to accept `SlimContextMessage[]`. diff --git a/README.md b/README.md index 735ac96..845b433 100644 --- a/README.md +++ b/README.md @@ -11,8 +11,8 @@ Lightweight, model-agnostic chat history compression utilities for AI assistants ## Features -- Trim strategy: keep the first (system) message and last N messages. -- Summarize strategy: summarize the middle portion using your own chat model. +- Trim strategy: token-aware trimming based on your model's max tokens and a threshold. +- Summarize strategy: token-aware summarization of older messages using your own chat model. - Framework agnostic: plug in any model wrapper implementing a minimal `invoke()` interface. 
- Optional LangChain adapter with a one-call helper for compressing histories. @@ -22,6 +22,12 @@ Lightweight, model-agnostic chat history compression utilities for AI assistants npm install slimcontext ``` +## Migration + +Upgrading from an earlier version? See the Migration notes in the changelog: + +- CHANGELOG: ./CHANGELOG.md#migration + ## Core Concepts Provide a model that implements: @@ -55,7 +61,15 @@ interface SlimContextMessage { ```ts import { TrimCompressor, SlimContextMessage } from 'slimcontext'; -const compressor = new TrimCompressor({ messagesToKeep: 8 }); +// Configure token-aware trimming +const compressor = new TrimCompressor({ + // Optional: defaults shown + maxModelTokens: 8192, // your model's context window + thresholdPercent: 0.7, // start trimming after 70% of maxModelTokens + minRecentMessages: 2, // always keep at least last 2 messages + // Optional estimator; default is a len/4 heuristic + // estimateTokens: (m) => yourCustomTokenCounter(m), +}); let history: SlimContextMessage[] = [ { role: 'system', content: 'You are a helpful assistant.' }, @@ -84,7 +98,15 @@ class MyModel implements SlimContextChatModel { } const model = new MyModel(); -const compressor = new SummarizeCompressor({ model, maxMessages: 12 }); +const compressor = new SummarizeCompressor({ + model, + // Optional: defaults shown + maxModelTokens: 8192, + thresholdPercent: 0.7, // summarize once total tokens exceed 70% + minRecentMessages: 4, // keep at least last 4 messages verbatim + // estimateTokens: (m) => yourCustomTokenCounter(m), + // prompt: '...custom summarization instructions...' +}); let history: SlimContextMessage[] = [ { role: 'system', content: 'You are a helpful assistant.' }, @@ -96,21 +118,12 @@ history = await compressor.compress(history); Notes about summarization behavior -- Alignment: after compression, messages will start with `[system, summary, ...]`, and the first kept message after the summary is always a `user` turn. 
This preserves dialogue consistency. -- Size: to keep this alignment and preserve recency, the output length can be `maxMessages - 1`, `maxMessages`, or `maxMessages + 1`. - - Preference: if the default split lands on an assistant, we first try shifting forward by 1 (staying within `maxMessages`). If that still isn’t a user, we shift backward by 1 (allowing `maxMessages + 1`). +- When the estimated total tokens exceed the threshold, the oldest portion (excluding a leading system message) is summarized into a single system message inserted before the recent tail. +- The most recent `minRecentMessages` are always preserved verbatim. ### Strategy Combination Example -You can chain strategies depending on size thresholds: - -```ts -if (history.length > 50) { - history = await summarizeCompressor.compress(history); -} else if (history.length > 25) { - history = await trimCompressor.compress(history); -} -``` +You can chain strategies depending on token thresholds or other heuristics. ## Example Integration @@ -151,7 +164,9 @@ const history = [ const compact = await langchain.compressLangChainHistory(history, { strategy: 'summarize', llm: lc, // BaseChatModel - maxMessages: 12, + maxModelTokens: 8192, + thresholdPercent: 0.8, // summarize beyond 80% of context window + minRecentMessages: 4, }); ``` @@ -161,8 +176,8 @@ See `examples/LANGCHAIN_COMPRESS_HISTORY.md` for a fuller copy-paste example. ### Classes -- `TrimCompressor({ messagesToKeep })` -- `SummarizeCompressor({ model, maxMessages, prompt? })` +- `TrimCompressor({ maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages? })` +- `SummarizeCompressor({ model, maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages?, prompt? 
})` ### Interfaces diff --git a/examples/LANGCHAIN_COMPRESS_HISTORY.md b/examples/LANGCHAIN_COMPRESS_HISTORY.md index ab5482b..69b56a6 100644 --- a/examples/LANGCHAIN_COMPRESS_HISTORY.md +++ b/examples/LANGCHAIN_COMPRESS_HISTORY.md @@ -22,13 +22,17 @@ const history = [ const compact = await langchain.compressLangChainHistory(history, { strategy: 'summarize', llm, // pass your BaseChatModel - maxMessages: 12, // target total messages after compression (system + summary + recent) + maxModelTokens: 8192, + thresholdPercent: 0.8, + minRecentMessages: 4, }); // Alternatively, use trimming without an LLM: const trimmed = await langchain.compressLangChainHistory(history, { strategy: 'trim', - messagesToKeep: 8, + maxModelTokens: 8192, + thresholdPercent: 0.8, + minRecentMessages: 4, }); console.log('Original size:', history.length); @@ -39,4 +43,4 @@ console.log('Trimmed size:', trimmed.length); Notes - `@langchain/core` is an optional peer dependency. Install it only if you use the adapter. -- `maxMessages` must be at least 4 for summarize (system + summary + 2 recent). +- Summarize strategy summarizes older content when total tokens exceed `thresholdPercent * maxModelTokens`. 
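For intuition, the trim behavior these notes describe can be sketched without the library. This is a simplified re-implementation under stated assumptions (flat per-message token estimate, single dropping pass), not slimcontext's actual code:

```typescript
type Msg = { role: 'system' | 'user' | 'assistant'; content: string };

// Drop the oldest non-system messages until the estimated total falls below
// the threshold, always keeping system messages and the last `minRecent` messages.
function trimSketch(
  messages: Msg[],
  estimate: (m: Msg) => number,
  thresholdTokens: number,
  minRecent: number,
): Msg[] {
  const counts = messages.map(estimate);
  let total = counts.reduce((a, b) => a + b, 0);
  const protectedFrom = Math.max(0, messages.length - minRecent);
  const keep = messages.map(() => true);
  for (let i = 0; i < protectedFrom && total > thresholdTokens; i++) {
    if (messages[i].role === 'system') continue; // never drop system prompts
    keep[i] = false;
    total -= counts[i];
  }
  return messages.filter((_, i) => keep[i]);
}

const history: Msg[] = [
  { role: 'system', content: 'sys' },
  { role: 'user', content: 'old question' },
  { role: 'assistant', content: 'old answer' },
  { role: 'user', content: 'recent question' },
  { role: 'assistant', content: 'recent answer' },
];

// Flat 100-token estimate; threshold 250 forces dropping the two oldest turns.
const trimmed = trimSketch(history, () => 100, 250, 2);
// trimmed: [system, 'recent question', 'recent answer']
```

Note that the total can remain above the threshold when the protected system and recent messages are themselves large; the strategy never drops those.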
diff --git a/examples/LANGCHAIN_EXAMPLE.md b/examples/LANGCHAIN_EXAMPLE.md index a47e277..b5e9dbf 100644 --- a/examples/LANGCHAIN_EXAMPLE.md +++ b/examples/LANGCHAIN_EXAMPLE.md @@ -37,7 +37,12 @@ class LangChainModel implements SlimContextChatModel { } async function compress(history: SlimContextMessage[]) { - const summarize = new SummarizeCompressor({ model: new LangChainModel(), maxMessages: 12 }); + const summarize = new SummarizeCompressor({ + model: new LangChainModel(), + maxModelTokens: 8192, + thresholdPercent: 0.75, + minRecentMessages: 4, + }); return summarize.compress(history); } diff --git a/examples/OPENAI_EXAMPLE.md b/examples/OPENAI_EXAMPLE.md index 043f68c..467d67e 100644 --- a/examples/OPENAI_EXAMPLE.md +++ b/examples/OPENAI_EXAMPLE.md @@ -33,7 +33,12 @@ async function main() { // ... conversation grows ]; - const summarize = new SummarizeCompressor({ model: new OpenAIModel(), maxMessages: 10 }); + const summarize = new SummarizeCompressor({ + model: new OpenAIModel(), + maxModelTokens: 128000, + thresholdPercent: 0.8, + minRecentMessages: 4, + }); const compressed = await summarize.compress(history); const completion = await client.chat.completions.create({ diff --git a/package.json b/package.json index 000928b..f06d292 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "slimcontext", - "version": "2.1.0", + "version": "2.2.0", "description": "Lightweight, model-agnostic chat history compression (trim + summarize) for AI assistants.", "main": "dist/index.js", "types": "dist/index.d.ts", @@ -13,7 +13,7 @@ ], "scripts": { "build": "tsc", - "prepare": "npm run build", + "prepare": "pnpm run build", "test": "vitest run", "test:watch": "vitest", "lint": "eslint . 
--ext .ts,.tsx --max-warnings=0", diff --git a/src/adapters/langchain.ts b/src/adapters/langchain.ts index 0d3c4c6..ff97503 100644 --- a/src/adapters/langchain.ts +++ b/src/adapters/langchain.ts @@ -133,7 +133,7 @@ export function toSlimModel(llm: BaseChatModel): SlimContextChatModel { return new LangChainSlimModel(llm); } -/** Convenience: build a SummarizeCompressor for LangChain models. */ +/** Convenience: build a SummarizeCompressor for LangChain models (token-threshold based). */ export function createSummarizeCompressorForLangChain( llm: BaseChatModel, config: Omit, @@ -141,11 +141,19 @@ export function createSummarizeCompressorForLangChain( return new SummarizeCompressor({ model: toSlimModel(llm), ...config }); } -/** Convenience: build a TrimCompressor. */ +/** Convenience: build a TrimCompressor (token-threshold based). */ export function createTrimCompressor(config: TrimConfig): TrimCompressor { return new TrimCompressor(config); } +/** + * Options for compressLangChainHistory (token-threshold based). + * + * Provide one of: + * - { compressor }: a pre-built SlimContextCompressor instance + * - summarize: { strategy?: 'summarize', llm, maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages?, prompt? } + * - trim: { strategy: 'trim', maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages? } + */ export type CompressLangChainOptions = | { compressor: SlimContextCompressor } | ({ @@ -157,6 +165,9 @@ export type CompressLangChainOptions = /** * High-level helper: compress a LangChain message history in one call. * - Converts LC -> SlimContext, runs a compressor, and converts the result back. + * - Strategies trigger when estimated total tokens exceed `thresholdPercent * maxModelTokens`. + * - For summarize, older content is summarized and a system summary is inserted before recent messages. + * - For trim, oldest non-system messages are dropped until under threshold, preserving system + recent. 
*/ export async function compressLangChainHistory( history: BaseMessage[], diff --git a/src/interfaces.ts b/src/interfaces.ts index 66ad539..1299397 100644 --- a/src/interfaces.ts +++ b/src/interfaces.ts @@ -18,3 +18,18 @@ export interface SlimContextChatModel { export interface SlimContextCompressor { compress(messages: SlimContextMessage[]): Promise; } + +export interface TokenBudgetConfig { + /** Model's maximum token context window. Default: 8192. */ + maxModelTokens?: number; + /** Percentage threshold to trigger compression (0-1). Default: 0.7. */ + thresholdPercent?: number; + /** Custom token estimator for messages. Default: len/4 heuristic. */ + estimateTokens?: TokenEstimator; + /** Minimum recent messages to always preserve. Strategy-specific default. */ + minRecentMessages?: number; +} + +// Token estimation callback for model-agnostic budgeting. +// Return an estimated token count for a single message. +export type TokenEstimator = (message: SlimContextMessage) => number; diff --git a/src/strategies/common.ts b/src/strategies/common.ts new file mode 100644 index 0000000..029e780 --- /dev/null +++ b/src/strategies/common.ts @@ -0,0 +1,48 @@ +import type { SlimContextMessage, TokenBudgetConfig, TokenEstimator } from '../interfaces'; + +// Default constants for token budgeting +export const DEFAULT_MAX_MODEL_TOKENS = 8192; +export const DEFAULT_THRESHOLD_PERCENT = 0.7; // 70% +export const DEFAULT_MIN_RECENT_MESSAGES = 2; // strategy-specific override allowed +export const DEFAULT_ESTIMATOR_TOKEN_BIAS = 2; + +/** Common token-budget fields shared by strategies. */ +export interface NormalizedBudgetConfig { + maxModelTokens: number; + thresholdPercent: number; + estimateTokens: TokenEstimator; + minRecentMessages: number; +} +/** Default token estimator: rough approximation len/4 + 2. 
*/ +export const DEFAULT_ESTIMATOR: TokenEstimator = (m: SlimContextMessage) => + Math.ceil(m.content.length / 4) + DEFAULT_ESTIMATOR_TOKEN_BIAS; + +/** Normalize token-budget config with strategy-specific defaults. */ +export function normalizeBudgetConfig( + config: TokenBudgetConfig, + options?: { minRecentDefault?: number }, +): NormalizedBudgetConfig { + const minRecentDefault = options?.minRecentDefault ?? DEFAULT_MIN_RECENT_MESSAGES; + return { + maxModelTokens: config.maxModelTokens ?? DEFAULT_MAX_MODEL_TOKENS, + thresholdPercent: config.thresholdPercent ?? DEFAULT_THRESHOLD_PERCENT, + estimateTokens: config.estimateTokens ?? DEFAULT_ESTIMATOR, + minRecentMessages: Math.max(0, config.minRecentMessages ?? minRecentDefault), + }; +} + +/** Compute threshold token budget. */ +export function computeThresholdTokens(maxModelTokens: number, thresholdPercent: number): number { + return Math.floor(maxModelTokens * thresholdPercent); +} + +/** Estimate the total tokens for an array of messages. */ +export function estimateTotalTokens( + messages: SlimContextMessage[], + estimateTokens: TokenEstimator, +): number { + if (messages.length === 0) return 0; + let total = 0; + for (const m of messages) total += estimateTokens(m); + return total; +} diff --git a/src/strategies/summarize.ts b/src/strategies/summarize.ts index 987e6fa..869e878 100644 --- a/src/strategies/summarize.ts +++ b/src/strategies/summarize.ts @@ -1,58 +1,71 @@ -import { SlimContextCompressor, SlimContextChatModel, SlimContextMessage } from '../interfaces'; +import { + SlimContextCompressor, + SlimContextChatModel, + SlimContextMessage, + type TokenBudgetConfig, +} from '../interfaces'; +import { normalizeBudgetConfig, computeThresholdTokens } from './common'; const DEFAULT_SUMMARY_PROMPT = ` -You are an expert conversation summarizer. You'll receive an excerpt of a chat transcript to condense. 
- -Goals: -- Be concise while retaining key facts, entities, user intent, decisions, follow-ups, and resolutions. -- Preserve important numbers, dates, IDs (truncate if long), and constraints. - -When tool messages are present (role: tool or similar): -- Briefly note which tool(s) were called, why (the user/assistant intent), and the high-level outcome. -- Do NOT copy raw JSON, logs, or code. Extract only salient fields (e.g., status, count/total, top IDs, amounts, dates, error messages). -- If outputs are very long, compress to 1–2 sentences. Truncate long IDs (e.g., abc…123) and omit secrets. -- If multiple tools were called for the same purpose, summarize them together. -- If a tool failed or contradicted prior assumptions, note the discrepancy. - -Output format: -- Output only the summary as a single concise paragraph (2–5 sentences). No preface, no headings. - -Examples: -Input (excerpt): -user: Please find docs about OAuth token errors in our KB -assistant: I will search the knowledge base -assistant: calling search_kb with query "OAuth token expired" -tool: { "results": [ { "title": "Token expired", "fix": "Refresh or sync clock" }, { "title": "Clock skew", "fix": "NTP sync" } ] } -assistant: The docs suggest refreshing tokens and checking clock skew - -Summary: -User requested guidance on OAuth token errors. Assistant searched the KB; the tool returned articles about token expiration and clock skew. Assistant advised refreshing tokens and ensuring time sync. +You are a conversation summarizer. +You will receive a transcript of a conversation in the following format: + + +user : user message +assistant : assistant message +... + + +Your task is to produce a concise summary of the conversation that can be used as a system message for continuing the dialogue. + +Guidelines: + +- Capture all important facts, decisions, user goals, and assistant outputs. + +- Preserve any constraints, preferences, or instructions given by the user. 
+ +- Omit small talk, filler, or irrelevant details. + +- Be concise, but include enough information so the assistant can seamlessly continue the conversation without the full transcript. + +- Write the summary in neutral, factual style (not conversational). + +Output format (only the summary, no additional text): + +Example: +Input transcript: +user : I want to build an AI agent in TypeScript that can search Google and store notes in Notion. +assistant : You could use LangGraph.js with a Google Search tool and a Notion connector. Do you want me to scaffold an example? +user : Yes, but make it simple first without authentication. +assistant : Sure, I’ll prepare a minimal scaffold with those two tools integrated. + +Output: +The user wants an AI agent in TypeScript using LangGraph.js with Google Search and Notion integration. +They prefer a simple scaffold without authentication. +The assistant suggested creating an example, and the user agreed. + `; -export interface SummarizeConfig { +export interface SummarizeConfig extends TokenBudgetConfig { model: SlimContextChatModel; - maxMessages: number; // total messages desired after compression (including system + summary + retained recent messages) + /** Prompt used to produce the summary */ prompt?: string; } /** - * SummarizeCompressor summarizes the middle portion of the conversation when it grows beyond maxMessages. - * It keeps the original first system message, injects a synthetic summary system message, and retains - * the most recent messages up to the maxMessages budget. + * SummarizeCompressor summarizes older messages when the estimated total tokens exceed + * a configurable threshold of the model's max context window. It preserves the leading + * system message (if present), injects a synthetic system summary, and retains the most + * recent `minRecentMessages` verbatim. 
*/ export class SummarizeCompressor implements SlimContextCompressor { private readonly model: SlimContextChatModel; - private readonly maxMessages: number; private readonly summaryPrompt: string; + private cfg: ReturnType; constructor(config: SummarizeConfig) { - if (config.maxMessages < 4) { - throw new Error( - 'maxMessages should be at least 4 to allow system + summary + 2 recent messages', - ); - } this.model = config.model; - this.maxMessages = config.maxMessages; + this.cfg = normalizeBudgetConfig(config, { minRecentDefault: 4 }); this.summaryPrompt = config.prompt || DEFAULT_SUMMARY_PROMPT; } @@ -60,20 +73,26 @@ export class SummarizeCompressor implements SlimContextCompressor { * Compress the conversation history by summarizing the middle portion. */ async compress(messages: SlimContextMessage[]): Promise { - if (messages.length <= this.maxMessages) { - return messages; - } + const thresholdTokens = computeThresholdTokens( + this.cfg.maxModelTokens, + this.cfg.thresholdPercent, + ); + const tokenCounts = messages.map((m) => this.cfg.estimateTokens(m)); + const total = tokenCounts.reduce((a, b) => a + b, 0); + + if (total <= thresholdTokens) return messages; + // We'll keep the last `minRecentMessages` untouched, and summarize everything before them + const keepTailStart = Math.max(0, messages.length - this.cfg.minRecentMessages); const hasSystemFirst = messages[0]?.role === 'system'; const systemMessage = hasSystemFirst ? messages[0] : undefined; - // Decide where the kept tail should start (ensuring it starts with a user message when possible) - const startIdx = this.computeKeepStartIndex(messages, hasSystemFirst); - const messagesToKeep = messages.slice(startIdx); - // Everything between the first system message and the slice we keep is summarized - const endOfSummarizedIndex = startIdx; // non-inclusive + // Exclude leading system from summary input; we re-insert it unchanged const summarizeStart = hasSystemFirst ? 
1 : 0; - const messagesToSummarize = messages.slice(summarizeStart, endOfSummarizedIndex); + const messagesToSummarize = messages.slice(summarizeStart, keepTailStart); + + // If there is barely anything to summarize, just return messages + if (messagesToSummarize.length === 0) return messages; const conversationText = messagesToSummarize .map((msg) => `${msg.role}: ${msg.content}`) @@ -81,7 +100,7 @@ export class SummarizeCompressor implements SlimContextCompressor { const promptMessages: SlimContextMessage[] = [ { role: 'system', content: this.summaryPrompt }, - { role: 'user', content: conversationText }, + { role: 'user', content: `Input transcript: \n ${conversationText}` }, ]; const response = await this.model.invoke(promptMessages); @@ -89,49 +108,14 @@ export class SummarizeCompressor implements SlimContextCompressor { const summaryMessage: SlimContextMessage = { role: 'system', - content: `[Context from a summarized portion of the conversation between you and the user]: ${summaryText}`, + content: `${summaryText}`, }; - if (hasSystemFirst && systemMessage) { - return [systemMessage, summaryMessage, ...messagesToKeep]; - } - return [summaryMessage, ...messagesToKeep]; - } + const keptTail = messages.slice(keepTailStart); + const result: SlimContextMessage[] = []; + if (systemMessage) result.push(systemMessage); + result.push(summaryMessage, ...keptTail); - /** - * Compute the start index of the kept tail after inserting a summary. - * Default budget keeps: system + summary + (maxMessages - 2) recent messages. - * To keep conversation turn consistency, we try to ensure the first kept message is a 'user'. - * If the default split lands on a non-user, we first try shifting forward by 1 (<= maxMessages), - * otherwise we try shifting backward by 1 (allowing maxMessages + 1 total). - */ - private computeKeepStartIndex(messages: SlimContextMessage[], hasSystemFirst: boolean): number { - const reservedSlots = hasSystemFirst ? 2 : 1; // system? 
+ summary - const baseRecentBudget = this.maxMessages - reservedSlots; - let startIdx = messages.length - baseRecentBudget; - - // Guardrails: ensure startIdx within [minStart, messages.length) - const minStart = hasSystemFirst ? 1 : 0; - if (startIdx < minStart) startIdx = minStart; - if (startIdx >= messages.length) startIdx = messages.length - 1; - - const firstKept = messages[startIdx]; - if (firstKept && firstKept.role !== 'user') { - // Try shifting forward by 1 (dropping one more from summarized middle) - if (startIdx + 1 < messages.length) { - const candidate = messages[startIdx + 1]; - if (candidate.role === 'user') { - return startIdx + 1; - } - } - // Otherwise, try shifting backward by 1 (keeping one more, allowing +1 over max) - if (startIdx - 1 >= minStart) { - const candidateBack = messages[startIdx - 1]; - if (candidateBack.role === 'user') { - return startIdx - 1; - } - } - } - return startIdx; + return result; } } diff --git a/src/strategies/trim.ts b/src/strategies/trim.ts index f1462cb..34bc124 100644 --- a/src/strategies/trim.ts +++ b/src/strategies/trim.ts @@ -1,33 +1,65 @@ -import { SlimContextCompressor, SlimContextMessage } from '../interfaces'; +import { SlimContextCompressor, SlimContextMessage, type TokenBudgetConfig } from '../interfaces'; +import { normalizeBudgetConfig, computeThresholdTokens } from './common'; /** - * Trim configuration options for the TrimCompressor (messages to keep). + * Trim configuration options for the TrimCompressor using token thresholding. */ -export interface TrimConfig { - messagesToKeep: number; -} +export type TrimConfig = TokenBudgetConfig; /** - * TrimCompressor keeps the very first message (often a system prompt) and the last N-1 messages. + * TrimCompressor drops the oldest non-system messages until the estimated token + * usage falls below the configured threshold, preserving any system messages and + * the most recent conversation turns. 
*/ export class TrimCompressor implements SlimContextCompressor { - private readonly messagesToKeep: number; + private cfg: ReturnType; constructor(config: TrimConfig) { - if (config.messagesToKeep < 2) { - throw new Error( - 'messagesToKeep must be at least 2 to retain the first system message and one recent message', - ); - } - this.messagesToKeep = config.messagesToKeep; + this.cfg = normalizeBudgetConfig(config, { minRecentDefault: 2 }); } async compress(messages: SlimContextMessage[]): Promise { - if (messages.length <= this.messagesToKeep) { - return messages; + const thresholdTokens = computeThresholdTokens( + this.cfg.maxModelTokens, + this.cfg.thresholdPercent, + ); + + // Compute total tokens + const tokenCounts = messages.map((m) => this.cfg.estimateTokens(m)); + let total = tokenCounts.reduce((a, b) => a + b, 0); + if (total <= thresholdTokens) return messages; + + // Determine the earliest index we are allowed to drop up to, preserving recent messages + const preserveFromIndex = Math.max(0, messages.length - this.cfg.minRecentMessages); + + const keepMask = new Array(messages.length).fill(true); + + // Drop from the oldest non-system messages forward until under threshold, + // but never drop system messages or any message within the last `minRecentMessages`. + for (let i = 0; i < messages.length && total > thresholdTokens; i++) { + const msg = messages[i]; + const isRecentProtected = i >= preserveFromIndex; + const isSystem = msg.role === 'system'; + if (isRecentProtected || isSystem) continue; + // Drop it + keepMask[i] = false; + total -= tokenCounts[i]; + } + + // If still over threshold (e.g., many system messages or very long recent messages), + // continue dropping from the left side before the preserved tail, still skipping systems. 
+ for (let i = 0; i < preserveFromIndex && total > thresholdTokens; i++) { + if (!keepMask[i]) continue; + const msg = messages[i]; + if (msg.role === 'system') continue; + keepMask[i] = false; + total -= tokenCounts[i]; + } + + const result: SlimContextMessage[] = []; + for (let i = 0; i < messages.length; i++) { + if (keepMask[i]) result.push(messages[i]); } - const systemMessage = messages[0]; - const recentMessages = messages.slice(-this.messagesToKeep + 1); - return [systemMessage, ...recentMessages]; + return result; } } diff --git a/tests/langchain.test.ts b/tests/langchain.test.ts index 8821d17..28eff3a 100644 --- a/tests/langchain.test.ts +++ b/tests/langchain.test.ts @@ -87,10 +87,14 @@ describe('LangChain Adapter', () => { it('should compress history with the trim strategy', async () => { const compressed = await compressLangChainHistory(history, { strategy: 'trim', - messagesToKeep: 3, + maxModelTokens: 400, + thresholdPercent: 0.5, + minRecentMessages: 2, + estimateTokens: () => 150, // each message ~150 tokens }); - - expect(compressed).toHaveLength(3); // System + 2 kept + // System (150) + last two messages (300) = 450 > threshold 200, but + // our TrimCompressor preserves last two regardless; will drop earlier non-systems. 
+    expect(compressed.length).toBe(3);
     expect(compressed[0]).toBeInstanceOf(SystemMessage);
     expect(compressed[1]).toBeInstanceOf(HumanMessage);
     expect(compressed[1].content).toBe('Message 3');
@@ -105,11 +109,15 @@
     const compressed = await compressLangChainHistory(history, {
       strategy: 'summarize',
       llm: mockModel,
-      maxMessages: 4,
+      maxModelTokens: 300,
+      thresholdPercent: 0.5,
+      estimateTokens: () => 100,
+      minRecentMessages: 2,
     });
 
     expect(invokeSpy).toHaveBeenCalled();
-    expect(compressed).toHaveLength(4); // System + summary + 2 kept
+    // Should be: system + summary + last 2 messages
+    expect(compressed).toHaveLength(4);
     expect(compressed[0]).toBeInstanceOf(SystemMessage);
     expect(compressed[1]).toBeInstanceOf(SystemMessage); // Summary is a System Message
     expect(compressed[1].content).toContain('This is a summary of messages 1 and 2.');
diff --git a/tests/summarize.test.ts b/tests/summarize.test.ts
index 26e9779..0af9abb 100644
--- a/tests/summarize.test.ts
+++ b/tests/summarize.test.ts
@@ -8,14 +8,20 @@ import {
 } from '../src';
 
 describe('SummarizeCompressor', () => {
-  it('inserts a summary and respects maxMessages', async () => {
+  it('inserts a summary before recent messages when over token threshold', async () => {
     const fakeModel: SlimContextChatModel = {
       async invoke(_msgs: SlimContextMessage[]): Promise<SlimContextModelResponse> {
         return { content: 'fake summary' };
       },
     };
 
-    const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 6 });
+    const summarize = new SummarizeCompressor({
+      model: fakeModel,
+      maxModelTokens: 400,
+      thresholdPercent: 0.5, // 200 tokens
+      estimateTokens: () => 50, // each message 50 tokens
+      minRecentMessages: 2,
+    });
 
     const history: SlimContextMessage[] = [
       { role: 'system', content: 'sys' },
@@ -23,19 +29,27 @@
     ];
 
     const result = await summarize.compress(history);
-    expect(result.length).toBeLessThanOrEqual(6);
+    // Should be: system, summary, last 2 user
messages
     expect(result[0].content).toBe('sys');
     expect(result[1].content).toContain('fake summary');
+    expect(result.at(-2)?.content).toBe('u8');
+    expect(result.at(-1)?.content).toBe('u9');
   });
 
-  it("works when first message isn't system; only reserves summary", async () => {
+  it("works when first message isn't system; only adds summary before recent messages", async () => {
     const fakeModel: SlimContextChatModel = {
       async invoke(_msgs: SlimContextMessage[]): Promise<SlimContextModelResponse> {
         return { content: 'fake summary' };
       },
     };
 
-    const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 6 });
+    const summarize = new SummarizeCompressor({
+      model: fakeModel,
+      maxModelTokens: 300,
+      thresholdPercent: 0.5,
+      estimateTokens: () => 50,
+      minRecentMessages: 2,
+    });
 
     // Start with user instead of system, then alternate and end with user
     const history: SlimContextMessage[] = [];
@@ -45,98 +59,9 @@
     }
 
     const result = await summarize.compress(history);
-    expect(result.length).toBeLessThanOrEqual(6);
     expect(result[0].role).toBe('system'); // summary system (no original system preserved)
     expect(result[0].content).toContain('fake summary');
-    expect(result[1].role).toBe('user'); // first kept remains aligned to user
-  });
-});
-
-describe('SummarizeCompressor split alignment', () => {
-  const fakeModel: SlimContextChatModel = {
-    async invoke(_msgs: SlimContextMessage[]): Promise<SlimContextModelResponse> {
-      return { content: 'fake summary' };
-    },
-  };
-
-  it('shifts forward by 1 so the first kept message is a user (<= maxMessages)', async () => {
-    // Use a strictly alternating conversation ending with user.
-    // For maxMessages = 6 => baseRecentBudget = 4 => startIdx = len - 4.
-    // With len = 10, startIdx = 6 (assistant), so forward shift to 7 (user).
- const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 6 }); - const history: SlimContextMessage[] = [ - { role: 'system', content: 'sys' }, // 0 - { role: 'user', content: 'u1' }, // 1 - { role: 'assistant', content: 'a1' }, // 2 - { role: 'user', content: 'u2' }, // 3 - { role: 'assistant', content: 'a2' }, // 4 - { role: 'user', content: 'u3' }, // 5 - { role: 'assistant', content: 'a3' }, // 6 <- base startIdx (assistant) - { role: 'user', content: 'u4' }, // 7 <- candidate forward (user) - { role: 'assistant', content: 'a4' }, // 8 - { role: 'user', content: 'u5' }, // 9 (ends with user) - ]; // len = 10 - - const result = await summarize.compress(history); - // After system + summary, the first kept should be a user - expect(result[2].role).toBe('user'); - // Forward shift reduces total by 1 - expect(result.length).toBe(5); // maxMessages - 1 - }); - - it('shifts backward by 1 when forward is not user, allowing maxMessages + 1', async () => { - // maxMessages = 6 => baseRecentBudget = 4 => startIdx = len - 4 - const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 6 }); - const history: SlimContextMessage[] = [ - { role: 'system', content: 'sys' }, // 0 - { role: 'user', content: 'u1' }, // 1 - { role: 'assistant', content: 'a1' }, // 2 - { role: 'user', content: 'u2' }, // 3 - { role: 'user', content: 'u2b' }, // 4 <- candidate backward (user) - { role: 'assistant', content: 'a3' }, // 5 <- base startIdx (assistant) - { role: 'assistant', content: 'a4' }, // 6 <- candidate forward (assistant) - { role: 'user', content: 'u3' }, // 7 - { role: 'assistant', content: 'a5' }, // 8 - ]; // len = 9, startIdx = 5 - - const result = await summarize.compress(history); - // After system + summary, the first kept should be a user (from index 4) - expect(result[2].role).toBe('user'); - // Backward shift increases total by 1 - expect(result.length).toBe(7); // maxMessages + 1 - }); - - it('ensures first kept message is user for 
alternating history (maxMessages=12)', async () => { - const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 12 }); - - // Build an alternating conversation: system, user, assistant, user, ... ending with user - const history: SlimContextMessage[] = [{ role: 'system', content: 'sys' }]; - for (let i = 1; i <= 25; i++) { - const role = i % 2 === 1 ? 'user' : 'assistant'; - history.push({ role: role as 'user' | 'assistant', content: `${role[0]}${i}` }); - } - - const result = await summarize.compress(history); - expect(result[0].role).toBe('system'); // original system - expect(result[1].role).toBe('system'); // summary system - expect(result[2].role).toBe('user'); // first kept must be user - // For alternating history ending with user: total becomes maxMessages - 1 (11) - expect(result.length).toBe(11); - }); - - it('keeps exactly maxMessages when base split already lands on user', async () => { - const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 12 }); - // Construct length so startIdx = len - (12-2) = len - 10 is odd (user at that index) - // Let len = 27 => startIdx = 17 (odd). Build alternating ending with user. - const history: SlimContextMessage[] = [{ role: 'system', content: 'sys' }]; - for (let i = 1; i <= 26; i++) { - const role = i % 2 === 1 ? 
'user' : 'assistant'; - history.push({ role: role as 'user' | 'assistant', content: `${role[0]}${i}` }); - } - const result = await summarize.compress(history); - expect(result[0].role).toBe('system'); - expect(result[1].role).toBe('system'); - expect(result[2].role).toBe('user'); - expect(result.length).toBe(12); + expect(result.at(-2)?.role).toBe('user'); + expect(result.at(-1)?.role).toBe('assistant'); }); }); diff --git a/tests/trim.test.ts b/tests/trim.test.ts index c4a4388..aaa33b1 100644 --- a/tests/trim.test.ts +++ b/tests/trim.test.ts @@ -3,20 +3,29 @@ import { describe, it, expect } from 'vitest'; import { TrimCompressor, SlimContextMessage } from '../src'; describe('TrimCompressor', () => { - it('keeps first system and last N-1 messages', async () => { - const trim = new TrimCompressor({ messagesToKeep: 5 }); + it('drops oldest non-system messages until under threshold (preserves system + recent)', async () => { + const estimate = (_m: SlimContextMessage) => 100; // deterministic + const trim = new TrimCompressor({ + maxModelTokens: 400, + thresholdPercent: 0.5, // threshold = 200 + estimateTokens: estimate, + minRecentMessages: 2, + }); const history: SlimContextMessage[] = [ - { role: 'system', content: 'sys' }, - { role: 'user', content: 'u1' }, - { role: 'assistant', content: 'a1' }, - { role: 'user', content: 'u2' }, - { role: 'assistant', content: 'a2' }, - { role: 'user', content: 'u3' }, - ]; + { role: 'system', content: 'sys' }, // 100 + { role: 'user', content: 'u1' }, // 100 + { role: 'assistant', content: 'a1' }, // 100 + { role: 'user', content: 'u2' }, // 100 + { role: 'assistant', content: 'a2' }, // 100 + { role: 'user', content: 'u3' }, // 100 + ]; // total 600 > threshold 200 const trimmed = await trim.compress(history); - expect(trimmed.length).toBe(5); + // We expect to preserve the system and last 2 messages when possible expect(trimmed[0]).toEqual({ role: 'system', content: 'sys' }); + expect(trimmed.at(-2)).toEqual({ role: 
'assistant', content: 'a2' }); expect(trimmed.at(-1)).toEqual({ role: 'user', content: 'u3' }); + // Older non-system messages should be dropped + expect(trimmed.length).toBe(3); }); }); From 362de7cfdffc505212bae3e94fb485c97c46a4f6 Mon Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 11:49:50 +0200 Subject: [PATCH 2/6] fix: addressed PR issue for prepare script --- package.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/package.json b/package.json index f06d292..d0a1957 100644 --- a/package.json +++ b/package.json @@ -13,7 +13,7 @@ ], "scripts": { "build": "tsc", - "prepare": "pnpm run build", + "prepare": "npm run build", "test": "vitest run", "test:watch": "vitest", "lint": "eslint . --ext .ts,.tsx --max-warnings=0", From f141187e1c1a00580f60a3d070ba752ce250316b Mon Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 11:58:10 +0200 Subject: [PATCH 3/6] updated DEFAULT_MIN_RECENT_MESSAGES default value --- src/strategies/common.ts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/strategies/common.ts b/src/strategies/common.ts index 029e780..854f7e4 100644 --- a/src/strategies/common.ts +++ b/src/strategies/common.ts @@ -3,7 +3,7 @@ import type { SlimContextMessage, TokenBudgetConfig, TokenEstimator } from '../i // Default constants for token budgeting export const DEFAULT_MAX_MODEL_TOKENS = 8192; export const DEFAULT_THRESHOLD_PERCENT = 0.7; // 70% -export const DEFAULT_MIN_RECENT_MESSAGES = 2; // strategy-specific override allowed +export const DEFAULT_MIN_RECENT_MESSAGES = 10; // strategy-specific override allowed export const DEFAULT_ESTIMATOR_TOKEN_BIAS = 2; /** Common token-budget fields shared by strategies. 
*/ From 1750d911a0cb7aab56740168666e6d82ff1541ec Mon Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 12:00:19 +0200 Subject: [PATCH 4/6] updated example model to gpt-5-mini --- README.md | 2 +- examples/LANGCHAIN_COMPRESS_HISTORY.md | 2 +- examples/LANGCHAIN_EXAMPLE.md | 2 +- examples/OPENAI_EXAMPLE.md | 4 ++-- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 845b433..8eb13ad 100644 --- a/README.md +++ b/README.md @@ -152,7 +152,7 @@ import { AIMessage, HumanMessage, SystemMessage } from '@langchain/core/messages import { ChatOpenAI } from '@langchain/openai'; import { langchain } from 'slimcontext'; -const lc = new ChatOpenAI({ model: 'gpt-4o-mini', temperature: 0 }); +const lc = new ChatOpenAI({ model: 'gpt-5-mini', temperature: 0 }); const history = [ new SystemMessage('You are helpful.'), diff --git a/examples/LANGCHAIN_COMPRESS_HISTORY.md b/examples/LANGCHAIN_COMPRESS_HISTORY.md index 69b56a6..cc0b212 100644 --- a/examples/LANGCHAIN_COMPRESS_HISTORY.md +++ b/examples/LANGCHAIN_COMPRESS_HISTORY.md @@ -8,7 +8,7 @@ import { ChatOpenAI } from '@langchain/openai'; import { langchain } from 'slimcontext'; // 1) Create your LangChain chat model (any BaseChatModel works) -const llm = new ChatOpenAI({ model: 'gpt-4o-mini', temperature: 0 }); +const llm = new ChatOpenAI({ model: 'gpt-5-mini', temperature: 0 }); // 2) Build your existing LangChain-compatible history const history = [ diff --git a/examples/LANGCHAIN_EXAMPLE.md b/examples/LANGCHAIN_EXAMPLE.md index b5e9dbf..7d1260f 100644 --- a/examples/LANGCHAIN_EXAMPLE.md +++ b/examples/LANGCHAIN_EXAMPLE.md @@ -12,7 +12,7 @@ import { import { ChatOpenAI } from '@langchain/openai'; // or any LangChain chat model // Create a LangChain model (reads from env, e.g., OPENAI_API_KEY) -const lc = new ChatOpenAI({ model: 'gpt-4o-mini', temperature: 0 }); +const lc = new ChatOpenAI({ model: 'gpt-5-mini', temperature: 0 }); class 
LangChainModel implements SlimContextChatModel {
   async invoke(messages: SlimContextMessage[]): Promise<SlimContextModelResponse> {
diff --git a/examples/OPENAI_EXAMPLE.md b/examples/OPENAI_EXAMPLE.md
index 467d67e..98fdb45 100644
--- a/examples/OPENAI_EXAMPLE.md
+++ b/examples/OPENAI_EXAMPLE.md
@@ -16,7 +16,7 @@ const client = new OpenAI();
 class OpenAIModel implements SlimContextChatModel {
   async invoke(msgs: SlimContextMessage[]): Promise<SlimContextModelResponse> {
     const response = await client.chat.completions.create({
-      model: 'gpt-4o-mini',
+      model: 'gpt-5-mini',
       messages: msgs.map((m) => ({
         role: m.role === 'human' ? 'user' : (m.role as 'system' | 'user' | 'assistant'),
         content: m.content,
@@ -42,7 +42,7 @@ async function main() {
   const compressed = await summarize.compress(history);
 
   const completion = await client.chat.completions.create({
-    model: 'gpt-4o-mini',
+    model: 'gpt-5-mini',
     messages: compressed
       .filter((m) => m.role !== 'tool')
       .map((m) => ({ role: m.role as 'system' | 'user' | 'assistant', content: m.content })),

From 9305f5c4823d9d057e8d28b3f208f28b7326b061 Mon Sep 17 00:00:00 2001
From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com>
Date: Fri, 29 Aug 2025 12:04:55 +0200
Subject: [PATCH 5/6] fix: fixed repo url

---
 package.json | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/package.json b/package.json
index d0a1957..61c47b9 100644
--- a/package.json
+++ b/package.json
@@ -34,12 +34,12 @@
   "license": "MIT",
   "repository": {
     "type": "git",
-    "url": "git+https://github.com/Agentailor/slimcontext.git"
+    "url": "git+https://github.com/agentailor/slimcontext.git"
   },
   "bugs": {
-    "url": "https://github.com/Agentailor/slimcontext/issues"
+    "url": "https://github.com/agentailor/slimcontext/issues"
   },
-  "homepage": "https://github.com/Agentailor/slimcontext#readme",
+  "homepage": "https://github.com/agentailor/slimcontext#readme",
   "packageManager": "pnpm@10.14.0",
   "peerDependencies": {
     "@langchain/core": ">=0.3.71 <1"

From 066130df0149ed6b6c96a299f9012fa1cda0cbd0 Mon
Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 16:24:51 +0200 Subject: [PATCH 6/6] fix: fixed version number --- CHANGELOG.md | 38 ++++++++++++++------------------------ package.json | 2 +- 2 files changed, 15 insertions(+), 25 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 714473d..6db9fae 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,29 +4,7 @@ All notable changes to this project will be documented in this file. The format is based on Keep a Changelog, and this project adheres to Semantic Versioning. -## [2.1.0] - 2025-08-27 - -### Added - -- LangChain adapter under `src/adapters/langchain.ts` with helpers: - - `extractContent`, `roleFromMessageType`, `baseToSlim`, `slimToLangChain` - - `toSlimModel(llm)` wrapper to use LangChain `BaseChatModel` with `SummarizeCompressor`. - - `compressLangChainHistory(history, options)` high-level helper for one-call compression on `BaseMessage[]`. -- Tests for adapter behavior in `tests/langchain.test.ts`. -- Examples: - - `examples/LANGCHAIN_EXAMPLE.md`: adapting a LangChain model to `SlimContextChatModel`. - - `examples/LANGCHAIN_COMPRESS_HISTORY.md`: using `compressLangChainHistory` directly. - -### Changed - -- README updated with a LangChain adapter section and one-call usage sample. - -### Notes - -- The adapter treats LangChain `tool` messages as `assistant` during compression. -- `@langchain/core` is an optional peer dependency; only needed if you use the adapter. - -## [2.2.0] - 2025-08-28 +## [2.1.0] - 2025-08-28 ### Breaking @@ -49,10 +27,18 @@ The format is based on Keep a Changelog, and this project adheres to Semantic Ve ### Added +- LangChain adapter under `src/adapters/langchain.ts` with helpers: + - `extractContent`, `roleFromMessageType`, `baseToSlim`, `slimToLangChain` + - `toSlimModel(llm)` wrapper to use LangChain `BaseChatModel` with `SummarizeCompressor`. 
+ - `compressLangChainHistory(history, options)` high-level helper for one-call compression on `BaseMessage[]`. +- Tests for adapter behavior in `tests/langchain.test.ts`. +- Examples: + - `examples/LANGCHAIN_EXAMPLE.md`: adapting a LangChain model to `SlimContextChatModel`. + - `examples/LANGCHAIN_COMPRESS_HISTORY.md`: using `compressLangChainHistory` directly. - `TokenEstimator` type for custom token estimation. - Docs and examples updated to reflect token-based configuration. -## [2.0.0] - 2025-08-24 +## [2.0.1] - 2025-08-24 ### Breaking @@ -89,3 +75,7 @@ The format is based on Keep a Changelog, and this project adheres to Semantic Ve ### Behavior - SummarizeCompressor alignment: after summarization, the first kept message following the summary is enforced to be a `user` message to maintain dialogue consistency. To achieve this while preserving recent context, the resulting message count may be `maxMessages - 1`, `maxMessages`, or `maxMessages + 1` depending on the split position. + +### Notes + +- `@langchain/core` is an optional peer dependency; only needed if you use the adapter. diff --git a/package.json b/package.json index 61c47b9..ca575a4 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "slimcontext", - "version": "2.2.0", + "version": "2.1.0", "description": "Lightweight, model-agnostic chat history compression (trim + summarize) for AI assistants.", "main": "dist/index.js", "types": "dist/index.d.ts",
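
---
Usage sketch (a reviewer's note, not part of the patches above): the token-threshold trimming introduced in `src/strategies/trim.ts` can be re-derived standalone as below. The names `trim`, `thresholdTokens`, and the `Msg` type are local to this sketch; the threshold math (`maxModelTokens * thresholdPercent`) is assumed from how the diff uses `computeThresholdTokens`, and the flat per-message `estimate` callback stands in for the library's token estimator.

```typescript
type Role = 'system' | 'user' | 'assistant';
interface Msg {
  role: Role;
  content: string;
}

// Threshold math assumed from computeThresholdTokens: a fraction of the model window.
function thresholdTokens(maxModelTokens: number, thresholdPercent: number): number {
  return Math.floor(maxModelTokens * thresholdPercent);
}

// Standalone re-derivation of the TrimCompressor algorithm in this patch:
// drop the oldest non-system messages until the estimated total falls under
// the threshold, never dropping system messages or the last `minRecent` messages.
function trim(
  messages: Msg[],
  opts: {
    maxModelTokens: number;
    thresholdPercent: number;
    minRecent: number;
    estimate: (m: Msg) => number;
  },
): Msg[] {
  const limit = thresholdTokens(opts.maxModelTokens, opts.thresholdPercent);
  const counts = messages.map(opts.estimate);
  let total = counts.reduce((a, b) => a + b, 0);
  if (total <= limit) return messages; // under budget: nothing to do

  const preserveFrom = Math.max(0, messages.length - opts.minRecent);
  const keep = messages.map(() => true);
  for (let i = 0; i < preserveFrom && total > limit; i++) {
    if (messages[i].role === 'system') continue; // system prompts always survive
    keep[i] = false;
    total -= counts[i];
  }
  return messages.filter((_, i) => keep[i]);
}

// Mirrors tests/trim.test.ts: six messages at 100 tokens each, threshold 400 * 0.5 = 200.
const history: Msg[] = [
  { role: 'system', content: 'sys' },
  { role: 'user', content: 'u1' },
  { role: 'assistant', content: 'a1' },
  { role: 'user', content: 'u2' },
  { role: 'assistant', content: 'a2' },
  { role: 'user', content: 'u3' },
];
const kept = trim(history, {
  maxModelTokens: 400,
  thresholdPercent: 0.5,
  minRecent: 2,
  estimate: () => 100,
});
console.log(kept.map((m) => m.content)); // ['sys', 'a2', 'u3']
```

As in the updated test, the result can still sit above the threshold when only system and protected recent messages remain: the strategy refuses to drop those, trading strict budget compliance for conversational integrity.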