From d68344fd78bcbd2cd2a556e528bae9316f2c1bbc Mon Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 11:34:39 +0200 Subject: [PATCH 1/6] feat: token-threshold strategies + shared utilities; docs/examples updated; version 2.2.0 --- .github/copilot-instructions.md | 50 ++++---- CHANGELOG.md | 28 ++++- README.md | 53 +++++--- examples/LANGCHAIN_COMPRESS_HISTORY.md | 10 +- examples/LANGCHAIN_EXAMPLE.md | 7 +- examples/OPENAI_EXAMPLE.md | 7 +- package.json | 4 +- src/adapters/langchain.ts | 15 ++- src/interfaces.ts | 15 +++ src/strategies/common.ts | 48 ++++++++ src/strategies/summarize.ts | 164 +++++++++++-------------- src/strategies/trim.ts | 68 +++++++--- tests/langchain.test.ts | 18 ++- tests/summarize.test.ts | 117 ++++-------------- tests/trim.test.ts | 29 +++-- 15 files changed, 362 insertions(+), 271 deletions(-) create mode 100644 src/strategies/common.ts diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index a8825aa..58d10a6 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -7,8 +7,6 @@ ### High-Level Repository Information - **Type**: TypeScript npm library/package -- **Source Files**: ~24 -- **Installed Size**: ~106MB (including all dependencies in node_modules) - **Languages**: TypeScript (primary), JavaScript (compiled output) - **Target Runtime**: Node.js (CommonJS modules) - **Framework**: Model-agnostic core; optional adapters (LangChain) @@ -21,7 +19,7 @@ ### Prerequisites and Environment Setup -- **Node.js**: Version 20+ (as specified in the `engines` field of package.json) +- **Node.js**: Version 20+ (recommended) - **Package Manager**: pnpm 10.14.0 (preferred) or npm (fallback) ### Critical Build Steps (Always Follow This Order) @@ -41,42 +39,39 @@ 2. **Build the Project** ```bash - npm run build + pnpm run build # Compiles TypeScript to dist/ directory - # Duration: ~5-10 seconds ``` 3. 
**Run Tests** ```bash - npm run test + pnpm run test # Runs all vitest tests - # Duration: ~5-10 seconds - # Should show: "Test Files 3 passed (3), Tests 16 passed (16)" + # All tests should pass ``` 4. **Format Code** ```bash - npm run format # Auto-format code - npm run format:check # Check formatting without changes + pnpm run format # Auto-format code + pnpm run format:check # Check formatting without changes ``` 5. **Lint Code (Known Issue)** ```bash - npm run lint + pnpm run lint ``` - **KNOWN ISSUE**: ESLint currently fails with "parserOptions.tsconfigRootDir must be an absolute path" error. This is a configuration bug but doesn't affect build or tests. The code itself is properly linted in CI environment. ### Complete Development Workflow ```bash # Clean start (recommended for agents): rm -rf node_modules dist -npm install # Always use npm for reliability -npm run test # Verify tests pass -npm run format:check # Verify formatting -npm run build # Final build +pnpm install # Always use pnpm for reliability +pnpm run test # Verify tests pass +pnpm run format:check # Verify formatting +pnpm run build # Final build ``` ### CI/CD Pipeline Validation @@ -103,8 +98,9 @@ The repository uses GitHub Actions CI that runs: │ ├── adapters/ # Integration adapters (optional) │ │ └── langchain.ts # LangChain adapter + helpers (compressLangChainHistory, toSlimModel) │ └── strategies/ # Compression strategy implementations -│ ├── trim.ts # TrimCompressor: keeps first + last N messages -│ └── summarize.ts # SummarizeCompressor: AI-powered summarization +│ ├── common.ts # Shared token-budget utilities & defaults (thresholds, estimator) +│ ├── trim.ts # TrimCompressor: token-threshold trimming (preserve system + recent) +│ └── summarize.ts # SummarizeCompressor: token-threshold summarization (inject summary) ├── tests/ # vitest test files │ ├── trim.test.ts # Tests for TrimCompressor │ ├── summarize.test.ts # Tests for SummarizeCompressor @@ -132,11 +128,19 @@ The repository 
uses GitHub Actions CI that runs: - `SlimContextMessage`: Standard message format with role ('system'|'user'|'assistant'|'tool'|'human') and content - `SlimContextChatModel`: BYOM interface requiring only `invoke(messages) -> response` - `SlimContextCompressor`: Strategy interface for compression implementations +- `TokenEstimator`: `(message) => number` callback used for model-agnostic token budgeting **Compression Strategies**: -- **TrimCompressor**: Simple strategy keeping first (system) message + last N-1 messages -- **SummarizeCompressor**: AI-powered strategy that summarizes middle conversations when exceeding maxMessages +- Token-threshold based design using the model’s max token window and a configurable threshold (default 70%). +- Shared config shape (TokenBudgetConfig): `{ maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages? }`. +- **TrimCompressor**: Drops the oldest non-system messages until estimated tokens fall below the threshold, while always preserving any system message(s) and at least the most recent `minRecentMessages`. +- **SummarizeCompressor**: When over threshold, summarizes all messages before the recent tail (excluding the leading system message if present) and inserts a synthetic system summary just before the preserved recent messages. + +**Shared Utilities** (src/strategies/common.ts): + +- Defaults: `DEFAULT_MAX_MODEL_TOKENS = 8192`, `DEFAULT_THRESHOLD_PERCENT = 0.7`, `DEFAULT_MIN_RECENT_MESSAGES = 2`. +- Estimator: `DEFAULT_ESTIMATOR` (~`len/4 + 2`) plus `computeThresholdTokens`, `normalizeBudgetConfig` helpers. **Framework Independence**: Core library has no framework dependencies. An optional LangChain adapter is provided for convenience; core remains BYOM. 
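The estimator and threshold defaults described above are easy to sanity-check in isolation. A minimal sketch that re-implements the documented heuristic (illustrative only; the real helpers live in `src/strategies/common.ts`):

```typescript
// Sketch of the documented defaults: ~len/4 + 2 tokens per message, 70% threshold.
type Msg = { role: string; content: string };

const estimate = (m: Msg): number => Math.ceil(m.content.length / 4) + 2;
const thresholdTokens = (maxModelTokens: number, thresholdPercent: number): number =>
  Math.floor(maxModelTokens * thresholdPercent);

const history: Msg[] = [
  { role: 'system', content: 'You are a helpful assistant.' }, // 28 chars -> 9 tokens
  { role: 'user', content: 'Hello!' }, // 6 chars -> 4 tokens
];

const total = history.reduce((sum, m) => sum + estimate(m), 0); // 13
const threshold = thresholdTokens(8192, 0.7); // 5734
const shouldCompress = total > threshold; // false: well under budget
```

With the defaults, compression only triggers once the estimated history exceeds 5734 tokens; a custom `estimateTokens` can replace the heuristic for models with very different tokenizers.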
@@ -153,8 +157,8 @@ The repository uses GitHub Actions CI that runs: ### Running Tests ```bash -npm run test -# Expects: ~16 tests across 3 files, all passing +pnpm run test +# Expects: All tests to pass # Tests cover TrimCompressor, SummarizeCompressor, and the LangChain adapter/helper ``` @@ -196,7 +200,7 @@ npm run test - CommonJS: `const { langchain } = require('slimcontext')` - ESM/TypeScript: `import * as slim from 'slimcontext'; const { langchain } = slim;` - Note: `import { langchain } from 'slimcontext'` may not work in all environments due to CJS/ESM interop. Prefer one of the patterns above. - - Includes a one-call history helper: `compressLangChainHistory(history, options)` + - Includes a one-call history helper: `compressLangChainHistory(history, options)` where `options` accepts the token-threshold fields (`maxModelTokens`, `thresholdPercent`, `estimateTokens`, `minRecentMessages`) and either `strategy: 'trim'` or `strategy: 'summarize'` with `llm` for the latter. --- diff --git a/CHANGELOG.md b/CHANGELOG.md index e1cbfec..714473d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -26,6 +26,32 @@ The format is based on Keep a Changelog, and this project adheres to Semantic Ve - The adapter treats LangChain `tool` messages as `assistant` during compression. - `@langchain/core` is an optional peer dependency; only needed if you use the adapter. +## [2.2.0] - 2025-08-28 + +### Breaking + +- Strategies are now token-threshold based instead of message-count based. + - `TrimCompressor({ messagesToKeep })` replaced by `TrimCompressor({ maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages? })`. + - `SummarizeCompressor({ model, maxMessages, ... })` replaced by `SummarizeCompressor({ model, maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages?, prompt? })`. + +### Migration + +- Provide your model’s context window via `maxModelTokens` (optional; defaults to 8192). 
+- Choose a `thresholdPercent` (0–1) at which to trigger compression (default 0.7; recommended 0.8–0.9 for aggressive usage). +- Optional: pass a custom `estimateTokens` to better approximate token usage. +- Optional: tune `minRecentMessages` (trim: default 2, summarize: default 4). +- Update adapter/example usages accordingly (README and examples have been updated). + +### Changed + +- Trim: when total estimated tokens exceed threshold, drop oldest non-system messages until under threshold, preserving system messages and the most recent messages. +- Summarize: when over threshold, summarize the oldest portion (excluding a leading system message) and insert a synthetic system summary before recent messages. + +### Added + +- `TokenEstimator` type for custom token estimation. +- Docs and examples updated to reflect token-based configuration. + ## [2.0.0] - 2025-08-24 ### Breaking @@ -41,7 +67,7 @@ The format is based on Keep a Changelog, and this project adheres to Semantic Ve - `IChatModel` -> `SlimContextChatModel` - `ICompressor` -> `SlimContextCompressor` -Migration notes: +### Migration - Import and use `SlimContextMessage` everywhere you previously used `Message` or `BaseMessage`. - Update any custom `IChatModel` implementations to accept `SlimContextMessage[]`. diff --git a/README.md b/README.md index 735ac96..845b433 100644 --- a/README.md +++ b/README.md @@ -11,8 +11,8 @@ Lightweight, model-agnostic chat history compression utilities for AI assistants ## Features -- Trim strategy: keep the first (system) message and last N messages. -- Summarize strategy: summarize the middle portion using your own chat model. +- Trim strategy: token-aware trimming based on your model's max tokens and a threshold. +- Summarize strategy: token-aware summarization of older messages using your own chat model. - Framework agnostic: plug in any model wrapper implementing a minimal `invoke()` interface. 
- Optional LangChain adapter with a one-call helper for compressing histories. @@ -22,6 +22,12 @@ Lightweight, model-agnostic chat history compression utilities for AI assistants npm install slimcontext ``` +## Migration + +Upgrading from an earlier version? See the Migration notes in the changelog: + +- CHANGELOG: ./CHANGELOG.md#migration + ## Core Concepts Provide a model that implements: @@ -55,7 +61,15 @@ interface SlimContextMessage { ```ts import { TrimCompressor, SlimContextMessage } from 'slimcontext'; -const compressor = new TrimCompressor({ messagesToKeep: 8 }); +// Configure token-aware trimming +const compressor = new TrimCompressor({ + // Optional: defaults shown + maxModelTokens: 8192, // your model's context window + thresholdPercent: 0.7, // start trimming after 70% of maxModelTokens + minRecentMessages: 2, // always keep at least last 2 messages + // Optional estimator; default is a len/4 heuristic + // estimateTokens: (m) => yourCustomTokenCounter(m), +}); let history: SlimContextMessage[] = [ { role: 'system', content: 'You are a helpful assistant.' }, @@ -84,7 +98,15 @@ class MyModel implements SlimContextChatModel { } const model = new MyModel(); -const compressor = new SummarizeCompressor({ model, maxMessages: 12 }); +const compressor = new SummarizeCompressor({ + model, + // Optional: defaults shown + maxModelTokens: 8192, + thresholdPercent: 0.7, // summarize once total tokens exceed 70% + minRecentMessages: 4, // keep at least last 4 messages verbatim + // estimateTokens: (m) => yourCustomTokenCounter(m), + // prompt: '...custom summarization instructions...' +}); let history: SlimContextMessage[] = [ { role: 'system', content: 'You are a helpful assistant.' }, @@ -96,21 +118,12 @@ history = await compressor.compress(history); Notes about summarization behavior -- Alignment: after compression, messages will start with `[system, summary, ...]`, and the first kept message after the summary is always a `user` turn. 
This preserves dialogue consistency. -- Size: to keep this alignment and preserve recency, the output length can be `maxMessages - 1`, `maxMessages`, or `maxMessages + 1`. - - Preference: if the default split lands on an assistant, we first try shifting forward by 1 (staying within `maxMessages`). If that still isn’t a user, we shift backward by 1 (allowing `maxMessages + 1`). +- When the estimated total tokens exceed the threshold, the oldest portion (excluding a leading system message) is summarized into a single system message inserted before the recent tail. +- The most recent `minRecentMessages` are always preserved verbatim. ### Strategy Combination Example -You can chain strategies depending on size thresholds: - -```ts -if (history.length > 50) { - history = await summarizeCompressor.compress(history); -} else if (history.length > 25) { - history = await trimCompressor.compress(history); -} -``` +You can chain strategies depending on token thresholds or other heuristics. ## Example Integration @@ -151,7 +164,9 @@ const history = [ const compact = await langchain.compressLangChainHistory(history, { strategy: 'summarize', llm: lc, // BaseChatModel - maxMessages: 12, + maxModelTokens: 8192, + thresholdPercent: 0.8, // summarize beyond 80% of context window + minRecentMessages: 4, }); ``` @@ -161,8 +176,8 @@ See `examples/LANGCHAIN_COMPRESS_HISTORY.md` for a fuller copy-paste example. ### Classes -- `TrimCompressor({ messagesToKeep })` -- `SummarizeCompressor({ model, maxMessages, prompt? })` +- `TrimCompressor({ maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages? })` +- `SummarizeCompressor({ model, maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages?, prompt? 
})` ### Interfaces diff --git a/examples/LANGCHAIN_COMPRESS_HISTORY.md b/examples/LANGCHAIN_COMPRESS_HISTORY.md index ab5482b..69b56a6 100644 --- a/examples/LANGCHAIN_COMPRESS_HISTORY.md +++ b/examples/LANGCHAIN_COMPRESS_HISTORY.md @@ -22,13 +22,17 @@ const history = [ const compact = await langchain.compressLangChainHistory(history, { strategy: 'summarize', llm, // pass your BaseChatModel - maxMessages: 12, // target total messages after compression (system + summary + recent) + maxModelTokens: 8192, + thresholdPercent: 0.8, + minRecentMessages: 4, }); // Alternatively, use trimming without an LLM: const trimmed = await langchain.compressLangChainHistory(history, { strategy: 'trim', - messagesToKeep: 8, + maxModelTokens: 8192, + thresholdPercent: 0.8, + minRecentMessages: 4, }); console.log('Original size:', history.length); @@ -39,4 +43,4 @@ console.log('Trimmed size:', trimmed.length); Notes - `@langchain/core` is an optional peer dependency. Install it only if you use the adapter. -- `maxMessages` must be at least 4 for summarize (system + summary + 2 recent). +- Summarize strategy summarizes older content when total tokens exceed `thresholdPercent * maxModelTokens`. 
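For intuition, the trim behavior these notes describe can be sketched without the library. This is a simplified re-implementation under stated assumptions (flat per-message token estimate, single dropping pass), not slimcontext's actual code:

```typescript
type Msg = { role: 'system' | 'user' | 'assistant'; content: string };

// Drop the oldest non-system messages until the estimated total falls below
// the threshold, always keeping system messages and the last `minRecent` messages.
function trimSketch(
  messages: Msg[],
  estimate: (m: Msg) => number,
  thresholdTokens: number,
  minRecent: number,
): Msg[] {
  const counts = messages.map(estimate);
  let total = counts.reduce((a, b) => a + b, 0);
  const protectedFrom = Math.max(0, messages.length - minRecent);
  const keep = messages.map(() => true);
  for (let i = 0; i < protectedFrom && total > thresholdTokens; i++) {
    if (messages[i].role === 'system') continue; // never drop system prompts
    keep[i] = false;
    total -= counts[i];
  }
  return messages.filter((_, i) => keep[i]);
}

const history: Msg[] = [
  { role: 'system', content: 'sys' },
  { role: 'user', content: 'old question' },
  { role: 'assistant', content: 'old answer' },
  { role: 'user', content: 'recent question' },
  { role: 'assistant', content: 'recent answer' },
];

// Flat 100-token estimate; threshold 250 forces dropping the two oldest turns.
const trimmed = trimSketch(history, () => 100, 250, 2);
// trimmed: [system, 'recent question', 'recent answer']
```

Note that the total can remain above the threshold when the protected system and recent messages are themselves large; the strategy never drops those.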
diff --git a/examples/LANGCHAIN_EXAMPLE.md b/examples/LANGCHAIN_EXAMPLE.md index a47e277..b5e9dbf 100644 --- a/examples/LANGCHAIN_EXAMPLE.md +++ b/examples/LANGCHAIN_EXAMPLE.md @@ -37,7 +37,12 @@ class LangChainModel implements SlimContextChatModel { } async function compress(history: SlimContextMessage[]) { - const summarize = new SummarizeCompressor({ model: new LangChainModel(), maxMessages: 12 }); + const summarize = new SummarizeCompressor({ + model: new LangChainModel(), + maxModelTokens: 8192, + thresholdPercent: 0.75, + minRecentMessages: 4, + }); return summarize.compress(history); } diff --git a/examples/OPENAI_EXAMPLE.md b/examples/OPENAI_EXAMPLE.md index 043f68c..467d67e 100644 --- a/examples/OPENAI_EXAMPLE.md +++ b/examples/OPENAI_EXAMPLE.md @@ -33,7 +33,12 @@ async function main() { // ... conversation grows ]; - const summarize = new SummarizeCompressor({ model: new OpenAIModel(), maxMessages: 10 }); + const summarize = new SummarizeCompressor({ + model: new OpenAIModel(), + maxModelTokens: 128000, + thresholdPercent: 0.8, + minRecentMessages: 4, + }); const compressed = await summarize.compress(history); const completion = await client.chat.completions.create({ diff --git a/package.json b/package.json index 000928b..f06d292 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "slimcontext", - "version": "2.1.0", + "version": "2.2.0", "description": "Lightweight, model-agnostic chat history compression (trim + summarize) for AI assistants.", "main": "dist/index.js", "types": "dist/index.d.ts", @@ -13,7 +13,7 @@ ], "scripts": { "build": "tsc", - "prepare": "npm run build", + "prepare": "pnpm run build", "test": "vitest run", "test:watch": "vitest", "lint": "eslint . 
--ext .ts,.tsx --max-warnings=0", diff --git a/src/adapters/langchain.ts b/src/adapters/langchain.ts index 0d3c4c6..ff97503 100644 --- a/src/adapters/langchain.ts +++ b/src/adapters/langchain.ts @@ -133,7 +133,7 @@ export function toSlimModel(llm: BaseChatModel): SlimContextChatModel { return new LangChainSlimModel(llm); } -/** Convenience: build a SummarizeCompressor for LangChain models. */ +/** Convenience: build a SummarizeCompressor for LangChain models (token-threshold based). */ export function createSummarizeCompressorForLangChain( llm: BaseChatModel, config: Omit, @@ -141,11 +141,19 @@ export function createSummarizeCompressorForLangChain( return new SummarizeCompressor({ model: toSlimModel(llm), ...config }); } -/** Convenience: build a TrimCompressor. */ +/** Convenience: build a TrimCompressor (token-threshold based). */ export function createTrimCompressor(config: TrimConfig): TrimCompressor { return new TrimCompressor(config); } +/** + * Options for compressLangChainHistory (token-threshold based). + * + * Provide one of: + * - { compressor }: a pre-built SlimContextCompressor instance + * - summarize: { strategy?: 'summarize', llm, maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages?, prompt? } + * - trim: { strategy: 'trim', maxModelTokens?, thresholdPercent?, estimateTokens?, minRecentMessages? } + */ export type CompressLangChainOptions = | { compressor: SlimContextCompressor } | ({ @@ -157,6 +165,9 @@ export type CompressLangChainOptions = /** * High-level helper: compress a LangChain message history in one call. * - Converts LC -> SlimContext, runs a compressor, and converts the result back. + * - Strategies trigger when estimated total tokens exceed `thresholdPercent * maxModelTokens`. + * - For summarize, older content is summarized and a system summary is inserted before recent messages. + * - For trim, oldest non-system messages are dropped until under threshold, preserving system + recent. 
*/ export async function compressLangChainHistory( history: BaseMessage[], diff --git a/src/interfaces.ts b/src/interfaces.ts index 66ad539..1299397 100644 --- a/src/interfaces.ts +++ b/src/interfaces.ts @@ -18,3 +18,18 @@ export interface SlimContextChatModel { export interface SlimContextCompressor { compress(messages: SlimContextMessage[]): Promise; } + +export interface TokenBudgetConfig { + /** Model's maximum token context window. Default: 8192. */ + maxModelTokens?: number; + /** Percentage threshold to trigger compression (0-1). Default: 0.7. */ + thresholdPercent?: number; + /** Custom token estimator for messages. Default: len/4 heuristic. */ + estimateTokens?: TokenEstimator; + /** Minimum recent messages to always preserve. Strategy-specific default. */ + minRecentMessages?: number; +} + +// Token estimation callback for model-agnostic budgeting. +// Return an estimated token count for a single message. +export type TokenEstimator = (message: SlimContextMessage) => number; diff --git a/src/strategies/common.ts b/src/strategies/common.ts new file mode 100644 index 0000000..029e780 --- /dev/null +++ b/src/strategies/common.ts @@ -0,0 +1,48 @@ +import type { SlimContextMessage, TokenBudgetConfig, TokenEstimator } from '../interfaces'; + +// Default constants for token budgeting +export const DEFAULT_MAX_MODEL_TOKENS = 8192; +export const DEFAULT_THRESHOLD_PERCENT = 0.7; // 70% +export const DEFAULT_MIN_RECENT_MESSAGES = 2; // strategy-specific override allowed +export const DEFAULT_ESTIMATOR_TOKEN_BIAS = 2; + +/** Common token-budget fields shared by strategies. */ +export interface NormalizedBudgetConfig { + maxModelTokens: number; + thresholdPercent: number; + estimateTokens: TokenEstimator; + minRecentMessages: number; +} +/** Default token estimator: rough approximation len/4 + 2. 
*/ +export const DEFAULT_ESTIMATOR: TokenEstimator = (m: SlimContextMessage) => + Math.ceil(m.content.length / 4) + DEFAULT_ESTIMATOR_TOKEN_BIAS; + +/** Normalize token-budget config with strategy-specific defaults. */ +export function normalizeBudgetConfig( + config: TokenBudgetConfig, + options?: { minRecentDefault?: number }, +): NormalizedBudgetConfig { + const minRecentDefault = options?.minRecentDefault ?? DEFAULT_MIN_RECENT_MESSAGES; + return { + maxModelTokens: config.maxModelTokens ?? DEFAULT_MAX_MODEL_TOKENS, + thresholdPercent: config.thresholdPercent ?? DEFAULT_THRESHOLD_PERCENT, + estimateTokens: config.estimateTokens ?? DEFAULT_ESTIMATOR, + minRecentMessages: Math.max(0, config.minRecentMessages ?? minRecentDefault), + }; +} + +/** Compute threshold token budget. */ +export function computeThresholdTokens(maxModelTokens: number, thresholdPercent: number): number { + return Math.floor(maxModelTokens * thresholdPercent); +} + +/** Estimate the total tokens for an array of messages. */ +export function estimateTotalTokens( + messages: SlimContextMessage[], + estimateTokens: TokenEstimator, +): number { + if (messages.length === 0) return 0; + let total = 0; + for (const m of messages) total += estimateTokens(m); + return total; +} diff --git a/src/strategies/summarize.ts b/src/strategies/summarize.ts index 987e6fa..869e878 100644 --- a/src/strategies/summarize.ts +++ b/src/strategies/summarize.ts @@ -1,58 +1,71 @@ -import { SlimContextCompressor, SlimContextChatModel, SlimContextMessage } from '../interfaces'; +import { + SlimContextCompressor, + SlimContextChatModel, + SlimContextMessage, + type TokenBudgetConfig, +} from '../interfaces'; +import { normalizeBudgetConfig, computeThresholdTokens } from './common'; const DEFAULT_SUMMARY_PROMPT = ` -You are an expert conversation summarizer. You'll receive an excerpt of a chat transcript to condense. 
- -Goals: -- Be concise while retaining key facts, entities, user intent, decisions, follow-ups, and resolutions. -- Preserve important numbers, dates, IDs (truncate if long), and constraints. - -When tool messages are present (role: tool or similar): -- Briefly note which tool(s) were called, why (the user/assistant intent), and the high-level outcome. -- Do NOT copy raw JSON, logs, or code. Extract only salient fields (e.g., status, count/total, top IDs, amounts, dates, error messages). -- If outputs are very long, compress to 1–2 sentences. Truncate long IDs (e.g., abc…123) and omit secrets. -- If multiple tools were called for the same purpose, summarize them together. -- If a tool failed or contradicted prior assumptions, note the discrepancy. - -Output format: -- Output only the summary as a single concise paragraph (2–5 sentences). No preface, no headings. - -Examples: -Input (excerpt): -user: Please find docs about OAuth token errors in our KB -assistant: I will search the knowledge base -assistant: calling search_kb with query "OAuth token expired" -tool: { "results": [ { "title": "Token expired", "fix": "Refresh or sync clock" }, { "title": "Clock skew", "fix": "NTP sync" } ] } -assistant: The docs suggest refreshing tokens and checking clock skew - -Summary: -User requested guidance on OAuth token errors. Assistant searched the KB; the tool returned articles about token expiration and clock skew. Assistant advised refreshing tokens and ensuring time sync. +You are a conversation summarizer. +You will receive a transcript of a conversation in the following format: + + +user : user message +assistant : assistant message +... + + +Your task is to produce a concise summary of the conversation that can be used as a system message for continuing the dialogue. + +Guidelines: + +- Capture all important facts, decisions, user goals, and assistant outputs. + +- Preserve any constraints, preferences, or instructions given by the user. 
+ +- Omit small talk, filler, or irrelevant details. + +- Be concise, but include enough information so the assistant can seamlessly continue the conversation without the full transcript. + +- Write the summary in neutral, factual style (not conversational). + +Output format (only the summary, no additional text): + +Example: +Input transcript: +user : I want to build an AI agent in TypeScript that can search Google and store notes in Notion. +assistant : You could use LangGraph.js with a Google Search tool and a Notion connector. Do you want me to scaffold an example? +user : Yes, but make it simple first without authentication. +assistant : Sure, I’ll prepare a minimal scaffold with those two tools integrated. + +Output: +The user wants an AI agent in TypeScript using LangGraph.js with Google Search and Notion integration. +They prefer a simple scaffold without authentication. +The assistant suggested creating an example, and the user agreed. + `; -export interface SummarizeConfig { +export interface SummarizeConfig extends TokenBudgetConfig { model: SlimContextChatModel; - maxMessages: number; // total messages desired after compression (including system + summary + retained recent messages) + /** Prompt used to produce the summary */ prompt?: string; } /** - * SummarizeCompressor summarizes the middle portion of the conversation when it grows beyond maxMessages. - * It keeps the original first system message, injects a synthetic summary system message, and retains - * the most recent messages up to the maxMessages budget. + * SummarizeCompressor summarizes older messages when the estimated total tokens exceed + * a configurable threshold of the model's max context window. It preserves the leading + * system message (if present), injects a synthetic system summary, and retains the most + * recent `minRecentMessages` verbatim. 
*/ export class SummarizeCompressor implements SlimContextCompressor { private readonly model: SlimContextChatModel; - private readonly maxMessages: number; private readonly summaryPrompt: string; + private cfg: ReturnType; constructor(config: SummarizeConfig) { - if (config.maxMessages < 4) { - throw new Error( - 'maxMessages should be at least 4 to allow system + summary + 2 recent messages', - ); - } this.model = config.model; - this.maxMessages = config.maxMessages; + this.cfg = normalizeBudgetConfig(config, { minRecentDefault: 4 }); this.summaryPrompt = config.prompt || DEFAULT_SUMMARY_PROMPT; } @@ -60,20 +73,26 @@ export class SummarizeCompressor implements SlimContextCompressor { * Compress the conversation history by summarizing the middle portion. */ async compress(messages: SlimContextMessage[]): Promise { - if (messages.length <= this.maxMessages) { - return messages; - } + const thresholdTokens = computeThresholdTokens( + this.cfg.maxModelTokens, + this.cfg.thresholdPercent, + ); + const tokenCounts = messages.map((m) => this.cfg.estimateTokens(m)); + const total = tokenCounts.reduce((a, b) => a + b, 0); + + if (total <= thresholdTokens) return messages; + // We'll keep the last `minRecentMessages` untouched, and summarize everything before them + const keepTailStart = Math.max(0, messages.length - this.cfg.minRecentMessages); const hasSystemFirst = messages[0]?.role === 'system'; const systemMessage = hasSystemFirst ? messages[0] : undefined; - // Decide where the kept tail should start (ensuring it starts with a user message when possible) - const startIdx = this.computeKeepStartIndex(messages, hasSystemFirst); - const messagesToKeep = messages.slice(startIdx); - // Everything between the first system message and the slice we keep is summarized - const endOfSummarizedIndex = startIdx; // non-inclusive + // Exclude leading system from summary input; we re-insert it unchanged const summarizeStart = hasSystemFirst ? 
1 : 0; - const messagesToSummarize = messages.slice(summarizeStart, endOfSummarizedIndex); + const messagesToSummarize = messages.slice(summarizeStart, keepTailStart); + + // If there is barely anything to summarize, just return messages + if (messagesToSummarize.length === 0) return messages; const conversationText = messagesToSummarize .map((msg) => `${msg.role}: ${msg.content}`) @@ -81,7 +100,7 @@ export class SummarizeCompressor implements SlimContextCompressor { const promptMessages: SlimContextMessage[] = [ { role: 'system', content: this.summaryPrompt }, - { role: 'user', content: conversationText }, + { role: 'user', content: `Input transcript: \n ${conversationText}` }, ]; const response = await this.model.invoke(promptMessages); @@ -89,49 +108,14 @@ export class SummarizeCompressor implements SlimContextCompressor { const summaryMessage: SlimContextMessage = { role: 'system', - content: `[Context from a summarized portion of the conversation between you and the user]: ${summaryText}`, + content: `${summaryText}`, }; - if (hasSystemFirst && systemMessage) { - return [systemMessage, summaryMessage, ...messagesToKeep]; - } - return [summaryMessage, ...messagesToKeep]; - } + const keptTail = messages.slice(keepTailStart); + const result: SlimContextMessage[] = []; + if (systemMessage) result.push(systemMessage); + result.push(summaryMessage, ...keptTail); - /** - * Compute the start index of the kept tail after inserting a summary. - * Default budget keeps: system + summary + (maxMessages - 2) recent messages. - * To keep conversation turn consistency, we try to ensure the first kept message is a 'user'. - * If the default split lands on a non-user, we first try shifting forward by 1 (<= maxMessages), - * otherwise we try shifting backward by 1 (allowing maxMessages + 1 total). - */ - private computeKeepStartIndex(messages: SlimContextMessage[], hasSystemFirst: boolean): number { - const reservedSlots = hasSystemFirst ? 2 : 1; // system? 
+ summary - const baseRecentBudget = this.maxMessages - reservedSlots; - let startIdx = messages.length - baseRecentBudget; - - // Guardrails: ensure startIdx within [minStart, messages.length) - const minStart = hasSystemFirst ? 1 : 0; - if (startIdx < minStart) startIdx = minStart; - if (startIdx >= messages.length) startIdx = messages.length - 1; - - const firstKept = messages[startIdx]; - if (firstKept && firstKept.role !== 'user') { - // Try shifting forward by 1 (dropping one more from summarized middle) - if (startIdx + 1 < messages.length) { - const candidate = messages[startIdx + 1]; - if (candidate.role === 'user') { - return startIdx + 1; - } - } - // Otherwise, try shifting backward by 1 (keeping one more, allowing +1 over max) - if (startIdx - 1 >= minStart) { - const candidateBack = messages[startIdx - 1]; - if (candidateBack.role === 'user') { - return startIdx - 1; - } - } - } - return startIdx; + return result; } } diff --git a/src/strategies/trim.ts b/src/strategies/trim.ts index f1462cb..34bc124 100644 --- a/src/strategies/trim.ts +++ b/src/strategies/trim.ts @@ -1,33 +1,65 @@ -import { SlimContextCompressor, SlimContextMessage } from '../interfaces'; +import { SlimContextCompressor, SlimContextMessage, type TokenBudgetConfig } from '../interfaces'; +import { normalizeBudgetConfig, computeThresholdTokens } from './common'; /** - * Trim configuration options for the TrimCompressor (messages to keep). + * Trim configuration options for the TrimCompressor using token thresholding. */ -export interface TrimConfig { - messagesToKeep: number; -} +export type TrimConfig = TokenBudgetConfig; /** - * TrimCompressor keeps the very first message (often a system prompt) and the last N-1 messages. + * TrimCompressor drops the oldest non-system messages until the estimated token + * usage falls below the configured threshold, preserving any system messages and + * the most recent conversation turns. 
*/ export class TrimCompressor implements SlimContextCompressor { - private readonly messagesToKeep: number; + private cfg: ReturnType; constructor(config: TrimConfig) { - if (config.messagesToKeep < 2) { - throw new Error( - 'messagesToKeep must be at least 2 to retain the first system message and one recent message', - ); - } - this.messagesToKeep = config.messagesToKeep; + this.cfg = normalizeBudgetConfig(config, { minRecentDefault: 2 }); } async compress(messages: SlimContextMessage[]): Promise { - if (messages.length <= this.messagesToKeep) { - return messages; + const thresholdTokens = computeThresholdTokens( + this.cfg.maxModelTokens, + this.cfg.thresholdPercent, + ); + + // Compute total tokens + const tokenCounts = messages.map((m) => this.cfg.estimateTokens(m)); + let total = tokenCounts.reduce((a, b) => a + b, 0); + if (total <= thresholdTokens) return messages; + + // Determine the earliest index we are allowed to drop up to, preserving recent messages + const preserveFromIndex = Math.max(0, messages.length - this.cfg.minRecentMessages); + + const keepMask = new Array(messages.length).fill(true); + + // Drop from the oldest non-system messages forward until under threshold, + // but never drop system messages or any message within the last `minRecentMessages`. + for (let i = 0; i < messages.length && total > thresholdTokens; i++) { + const msg = messages[i]; + const isRecentProtected = i >= preserveFromIndex; + const isSystem = msg.role === 'system'; + if (isRecentProtected || isSystem) continue; + // Drop it + keepMask[i] = false; + total -= tokenCounts[i]; + } + + // If still over threshold (e.g., many system messages or very long recent messages), + // continue dropping from the left side before the preserved tail, still skipping systems. 
+ for (let i = 0; i < preserveFromIndex && total > thresholdTokens; i++) { + if (!keepMask[i]) continue; + const msg = messages[i]; + if (msg.role === 'system') continue; + keepMask[i] = false; + total -= tokenCounts[i]; + } + + const result: SlimContextMessage[] = []; + for (let i = 0; i < messages.length; i++) { + if (keepMask[i]) result.push(messages[i]); } - const systemMessage = messages[0]; - const recentMessages = messages.slice(-this.messagesToKeep + 1); - return [systemMessage, ...recentMessages]; + return result; } } diff --git a/tests/langchain.test.ts b/tests/langchain.test.ts index 8821d17..28eff3a 100644 --- a/tests/langchain.test.ts +++ b/tests/langchain.test.ts @@ -87,10 +87,14 @@ describe('LangChain Adapter', () => { it('should compress history with the trim strategy', async () => { const compressed = await compressLangChainHistory(history, { strategy: 'trim', - messagesToKeep: 3, + maxModelTokens: 400, + thresholdPercent: 0.5, + minRecentMessages: 2, + estimateTokens: () => 150, // each message ~150 tokens }); - - expect(compressed).toHaveLength(3); // System + 2 kept + // System (150) + last two messages (300) = 450 > threshold 200, but + // our TrimCompressor preserves last two regardless; will drop earlier non-systems. 
+    expect(compressed.length).toBe(3);
     expect(compressed[0]).toBeInstanceOf(SystemMessage);
     expect(compressed[1]).toBeInstanceOf(HumanMessage);
     expect(compressed[1].content).toBe('Message 3');
@@ -105,11 +109,15 @@
     const compressed = await compressLangChainHistory(history, {
       strategy: 'summarize',
       llm: mockModel,
-      maxMessages: 4,
+      maxModelTokens: 300,
+      thresholdPercent: 0.5,
+      estimateTokens: () => 100,
+      minRecentMessages: 2,
     });
 
     expect(invokeSpy).toHaveBeenCalled();
-    expect(compressed).toHaveLength(4); // System + summary + 2 kept
+    // Should be: system + summary + last 2 messages
+    expect(compressed).toHaveLength(4);
     expect(compressed[0]).toBeInstanceOf(SystemMessage);
     expect(compressed[1]).toBeInstanceOf(SystemMessage); // Summary is a System Message
     expect(compressed[1].content).toContain('This is a summary of messages 1 and 2.');
diff --git a/tests/summarize.test.ts b/tests/summarize.test.ts
index 26e9779..0af9abb 100644
--- a/tests/summarize.test.ts
+++ b/tests/summarize.test.ts
@@ -8,14 +8,20 @@ import {
 } from '../src';
 
 describe('SummarizeCompressor', () => {
-  it('inserts a summary and respects maxMessages', async () => {
+  it('inserts a summary before recent messages when over token threshold', async () => {
     const fakeModel: SlimContextChatModel = {
       async invoke(_msgs: SlimContextMessage[]): Promise<SlimContextModelResponse> {
         return { content: 'fake summary' };
       },
     };
 
-    const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 6 });
+    const summarize = new SummarizeCompressor({
+      model: fakeModel,
+      maxModelTokens: 400,
+      thresholdPercent: 0.5, // 200 tokens
+      estimateTokens: () => 50, // each message 50 tokens
+      minRecentMessages: 2,
+    });
 
     const history: SlimContextMessage[] = [
       { role: 'system', content: 'sys' },
@@ -23,19 +29,27 @@
     ];
 
     const result = await summarize.compress(history);
-    expect(result.length).toBeLessThanOrEqual(6);
+    // Should be: system, summary, last 2 user
messages
     expect(result[0].content).toBe('sys');
     expect(result[1].content).toContain('fake summary');
+    expect(result.at(-2)?.content).toBe('u8');
+    expect(result.at(-1)?.content).toBe('u9');
   });
 
-  it("works when first message isn't system; only reserves summary", async () => {
+  it("works when first message isn't system; only adds summary before recent messages", async () => {
     const fakeModel: SlimContextChatModel = {
       async invoke(_msgs: SlimContextMessage[]): Promise<SlimContextModelResponse> {
         return { content: 'fake summary' };
       },
     };
 
-    const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 6 });
+    const summarize = new SummarizeCompressor({
+      model: fakeModel,
+      maxModelTokens: 300,
+      thresholdPercent: 0.5,
+      estimateTokens: () => 50,
+      minRecentMessages: 2,
+    });
 
     // Start with user instead of system, then alternate and end with user
     const history: SlimContextMessage[] = [];
@@ -45,98 +59,9 @@
     }
 
     const result = await summarize.compress(history);
-    expect(result.length).toBeLessThanOrEqual(6);
     expect(result[0].role).toBe('system'); // summary system (no original system preserved)
     expect(result[0].content).toContain('fake summary');
-    expect(result[1].role).toBe('user'); // first kept remains aligned to user
-  });
-});
-
-describe('SummarizeCompressor split alignment', () => {
-  const fakeModel: SlimContextChatModel = {
-    async invoke(_msgs: SlimContextMessage[]): Promise<SlimContextModelResponse> {
-      return { content: 'fake summary' };
-    },
-  };
-
-  it('shifts forward by 1 so the first kept message is a user (<= maxMessages)', async () => {
-    // Use a strictly alternating conversation ending with user.
-    // For maxMessages = 6 => baseRecentBudget = 4 => startIdx = len - 4.
-    // With len = 10, startIdx = 6 (assistant), so forward shift to 7 (user).
- const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 6 }); - const history: SlimContextMessage[] = [ - { role: 'system', content: 'sys' }, // 0 - { role: 'user', content: 'u1' }, // 1 - { role: 'assistant', content: 'a1' }, // 2 - { role: 'user', content: 'u2' }, // 3 - { role: 'assistant', content: 'a2' }, // 4 - { role: 'user', content: 'u3' }, // 5 - { role: 'assistant', content: 'a3' }, // 6 <- base startIdx (assistant) - { role: 'user', content: 'u4' }, // 7 <- candidate forward (user) - { role: 'assistant', content: 'a4' }, // 8 - { role: 'user', content: 'u5' }, // 9 (ends with user) - ]; // len = 10 - - const result = await summarize.compress(history); - // After system + summary, the first kept should be a user - expect(result[2].role).toBe('user'); - // Forward shift reduces total by 1 - expect(result.length).toBe(5); // maxMessages - 1 - }); - - it('shifts backward by 1 when forward is not user, allowing maxMessages + 1', async () => { - // maxMessages = 6 => baseRecentBudget = 4 => startIdx = len - 4 - const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 6 }); - const history: SlimContextMessage[] = [ - { role: 'system', content: 'sys' }, // 0 - { role: 'user', content: 'u1' }, // 1 - { role: 'assistant', content: 'a1' }, // 2 - { role: 'user', content: 'u2' }, // 3 - { role: 'user', content: 'u2b' }, // 4 <- candidate backward (user) - { role: 'assistant', content: 'a3' }, // 5 <- base startIdx (assistant) - { role: 'assistant', content: 'a4' }, // 6 <- candidate forward (assistant) - { role: 'user', content: 'u3' }, // 7 - { role: 'assistant', content: 'a5' }, // 8 - ]; // len = 9, startIdx = 5 - - const result = await summarize.compress(history); - // After system + summary, the first kept should be a user (from index 4) - expect(result[2].role).toBe('user'); - // Backward shift increases total by 1 - expect(result.length).toBe(7); // maxMessages + 1 - }); - - it('ensures first kept message is user for 
alternating history (maxMessages=12)', async () => { - const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 12 }); - - // Build an alternating conversation: system, user, assistant, user, ... ending with user - const history: SlimContextMessage[] = [{ role: 'system', content: 'sys' }]; - for (let i = 1; i <= 25; i++) { - const role = i % 2 === 1 ? 'user' : 'assistant'; - history.push({ role: role as 'user' | 'assistant', content: `${role[0]}${i}` }); - } - - const result = await summarize.compress(history); - expect(result[0].role).toBe('system'); // original system - expect(result[1].role).toBe('system'); // summary system - expect(result[2].role).toBe('user'); // first kept must be user - // For alternating history ending with user: total becomes maxMessages - 1 (11) - expect(result.length).toBe(11); - }); - - it('keeps exactly maxMessages when base split already lands on user', async () => { - const summarize = new SummarizeCompressor({ model: fakeModel, maxMessages: 12 }); - // Construct length so startIdx = len - (12-2) = len - 10 is odd (user at that index) - // Let len = 27 => startIdx = 17 (odd). Build alternating ending with user. - const history: SlimContextMessage[] = [{ role: 'system', content: 'sys' }]; - for (let i = 1; i <= 26; i++) { - const role = i % 2 === 1 ? 
'user' : 'assistant'; - history.push({ role: role as 'user' | 'assistant', content: `${role[0]}${i}` }); - } - const result = await summarize.compress(history); - expect(result[0].role).toBe('system'); - expect(result[1].role).toBe('system'); - expect(result[2].role).toBe('user'); - expect(result.length).toBe(12); + expect(result.at(-2)?.role).toBe('user'); + expect(result.at(-1)?.role).toBe('assistant'); }); }); diff --git a/tests/trim.test.ts b/tests/trim.test.ts index c4a4388..aaa33b1 100644 --- a/tests/trim.test.ts +++ b/tests/trim.test.ts @@ -3,20 +3,29 @@ import { describe, it, expect } from 'vitest'; import { TrimCompressor, SlimContextMessage } from '../src'; describe('TrimCompressor', () => { - it('keeps first system and last N-1 messages', async () => { - const trim = new TrimCompressor({ messagesToKeep: 5 }); + it('drops oldest non-system messages until under threshold (preserves system + recent)', async () => { + const estimate = (_m: SlimContextMessage) => 100; // deterministic + const trim = new TrimCompressor({ + maxModelTokens: 400, + thresholdPercent: 0.5, // threshold = 200 + estimateTokens: estimate, + minRecentMessages: 2, + }); const history: SlimContextMessage[] = [ - { role: 'system', content: 'sys' }, - { role: 'user', content: 'u1' }, - { role: 'assistant', content: 'a1' }, - { role: 'user', content: 'u2' }, - { role: 'assistant', content: 'a2' }, - { role: 'user', content: 'u3' }, - ]; + { role: 'system', content: 'sys' }, // 100 + { role: 'user', content: 'u1' }, // 100 + { role: 'assistant', content: 'a1' }, // 100 + { role: 'user', content: 'u2' }, // 100 + { role: 'assistant', content: 'a2' }, // 100 + { role: 'user', content: 'u3' }, // 100 + ]; // total 600 > threshold 200 const trimmed = await trim.compress(history); - expect(trimmed.length).toBe(5); + // We expect to preserve the system and last 2 messages when possible expect(trimmed[0]).toEqual({ role: 'system', content: 'sys' }); + expect(trimmed.at(-2)).toEqual({ role: 
'assistant', content: 'a2' }); expect(trimmed.at(-1)).toEqual({ role: 'user', content: 'u3' }); + // Older non-system messages should be dropped + expect(trimmed.length).toBe(3); }); }); From 362de7cfdffc505212bae3e94fb485c97c46a4f6 Mon Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 11:49:50 +0200 Subject: [PATCH 2/6] fix: addressed PR issue for prepare script --- package.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/package.json b/package.json index f06d292..d0a1957 100644 --- a/package.json +++ b/package.json @@ -13,7 +13,7 @@ ], "scripts": { "build": "tsc", - "prepare": "pnpm run build", + "prepare": "npm run build", "test": "vitest run", "test:watch": "vitest", "lint": "eslint . --ext .ts,.tsx --max-warnings=0", From f141187e1c1a00580f60a3d070ba752ce250316b Mon Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 11:58:10 +0200 Subject: [PATCH 3/6] updated DEFAULT_MIN_RECENT_MESSAGES default value --- src/strategies/common.ts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/strategies/common.ts b/src/strategies/common.ts index 029e780..854f7e4 100644 --- a/src/strategies/common.ts +++ b/src/strategies/common.ts @@ -3,7 +3,7 @@ import type { SlimContextMessage, TokenBudgetConfig, TokenEstimator } from '../i // Default constants for token budgeting export const DEFAULT_MAX_MODEL_TOKENS = 8192; export const DEFAULT_THRESHOLD_PERCENT = 0.7; // 70% -export const DEFAULT_MIN_RECENT_MESSAGES = 2; // strategy-specific override allowed +export const DEFAULT_MIN_RECENT_MESSAGES = 10; // strategy-specific override allowed export const DEFAULT_ESTIMATOR_TOKEN_BIAS = 2; /** Common token-budget fields shared by strategies. 
*/ From 1750d911a0cb7aab56740168666e6d82ff1541ec Mon Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 12:00:19 +0200 Subject: [PATCH 4/6] updated example model to gpt-5-mini --- README.md | 2 +- examples/LANGCHAIN_COMPRESS_HISTORY.md | 2 +- examples/LANGCHAIN_EXAMPLE.md | 2 +- examples/OPENAI_EXAMPLE.md | 4 ++-- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 845b433..8eb13ad 100644 --- a/README.md +++ b/README.md @@ -152,7 +152,7 @@ import { AIMessage, HumanMessage, SystemMessage } from '@langchain/core/messages import { ChatOpenAI } from '@langchain/openai'; import { langchain } from 'slimcontext'; -const lc = new ChatOpenAI({ model: 'gpt-4o-mini', temperature: 0 }); +const lc = new ChatOpenAI({ model: 'gpt-5-mini', temperature: 0 }); const history = [ new SystemMessage('You are helpful.'), diff --git a/examples/LANGCHAIN_COMPRESS_HISTORY.md b/examples/LANGCHAIN_COMPRESS_HISTORY.md index 69b56a6..cc0b212 100644 --- a/examples/LANGCHAIN_COMPRESS_HISTORY.md +++ b/examples/LANGCHAIN_COMPRESS_HISTORY.md @@ -8,7 +8,7 @@ import { ChatOpenAI } from '@langchain/openai'; import { langchain } from 'slimcontext'; // 1) Create your LangChain chat model (any BaseChatModel works) -const llm = new ChatOpenAI({ model: 'gpt-4o-mini', temperature: 0 }); +const llm = new ChatOpenAI({ model: 'gpt-5-mini', temperature: 0 }); // 2) Build your existing LangChain-compatible history const history = [ diff --git a/examples/LANGCHAIN_EXAMPLE.md b/examples/LANGCHAIN_EXAMPLE.md index b5e9dbf..7d1260f 100644 --- a/examples/LANGCHAIN_EXAMPLE.md +++ b/examples/LANGCHAIN_EXAMPLE.md @@ -12,7 +12,7 @@ import { import { ChatOpenAI } from '@langchain/openai'; // or any LangChain chat model // Create a LangChain model (reads from env, e.g., OPENAI_API_KEY) -const lc = new ChatOpenAI({ model: 'gpt-4o-mini', temperature: 0 }); +const lc = new ChatOpenAI({ model: 'gpt-5-mini', temperature: 0 }); class 
LangChainModel implements SlimContextChatModel {
   async invoke(messages: SlimContextMessage[]): Promise<SlimContextModelResponse> {
diff --git a/examples/OPENAI_EXAMPLE.md b/examples/OPENAI_EXAMPLE.md
index 467d67e..98fdb45 100644
--- a/examples/OPENAI_EXAMPLE.md
+++ b/examples/OPENAI_EXAMPLE.md
@@ -16,7 +16,7 @@ const client = new OpenAI();
 class OpenAIModel implements SlimContextChatModel {
   async invoke(msgs: SlimContextMessage[]): Promise<SlimContextModelResponse> {
     const response = await client.chat.completions.create({
-      model: 'gpt-4o-mini',
+      model: 'gpt-5-mini',
       messages: msgs.map((m) => ({
         role: m.role === 'human' ? 'user' : (m.role as 'system' | 'user' | 'assistant'),
         content: m.content,
@@ -42,7 +42,7 @@ async function main() {
   const compressed = await summarize.compress(history);
 
   const completion = await client.chat.completions.create({
-    model: 'gpt-4o-mini',
+    model: 'gpt-5-mini',
     messages: compressed
       .filter((m) => m.role !== 'tool')
       .map((m) => ({ role: m.role as 'system' | 'user' | 'assistant', content: m.content })),

From 9305f5c4823d9d057e8d28b3f208f28b7326b061 Mon Sep 17 00:00:00 2001
From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com>
Date: Fri, 29 Aug 2025 12:04:55 +0200
Subject: [PATCH 5/6] fix: fixed repo url

---
 package.json | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/package.json b/package.json
index d0a1957..61c47b9 100644
--- a/package.json
+++ b/package.json
@@ -34,12 +34,12 @@
   "license": "MIT",
   "repository": {
     "type": "git",
-    "url": "git+https://github.com/Agentailor/slimcontext.git"
+    "url": "git+https://github.com/agentailor/slimcontext.git"
   },
   "bugs": {
-    "url": "https://github.com/Agentailor/slimcontext/issues"
+    "url": "https://github.com/agentailor/slimcontext/issues"
   },
-  "homepage": "https://github.com/Agentailor/slimcontext#readme",
+  "homepage": "https://github.com/agentailor/slimcontext#readme",
   "packageManager": "pnpm@10.14.0",
   "peerDependencies": {
     "@langchain/core": ">=0.3.71 <1"

From 066130df0149ed6b6c96a299f9012fa1cda0cbd0 Mon
Sep 17 00:00:00 2001 From: Ali Ibrahim Jr <48456829+IBJunior@users.noreply.github.com> Date: Fri, 29 Aug 2025 16:24:51 +0200 Subject: [PATCH 6/6] fix: fixed version number --- CHANGELOG.md | 38 ++++++++++++++------------------------ package.json | 2 +- 2 files changed, 15 insertions(+), 25 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 714473d..6db9fae 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,29 +4,7 @@ All notable changes to this project will be documented in this file. The format is based on Keep a Changelog, and this project adheres to Semantic Versioning. -## [2.1.0] - 2025-08-27 - -### Added - -- LangChain adapter under `src/adapters/langchain.ts` with helpers: - - `extractContent`, `roleFromMessageType`, `baseToSlim`, `slimToLangChain` - - `toSlimModel(llm)` wrapper to use LangChain `BaseChatModel` with `SummarizeCompressor`. - - `compressLangChainHistory(history, options)` high-level helper for one-call compression on `BaseMessage[]`. -- Tests for adapter behavior in `tests/langchain.test.ts`. -- Examples: - - `examples/LANGCHAIN_EXAMPLE.md`: adapting a LangChain model to `SlimContextChatModel`. - - `examples/LANGCHAIN_COMPRESS_HISTORY.md`: using `compressLangChainHistory` directly. - -### Changed - -- README updated with a LangChain adapter section and one-call usage sample. - -### Notes - -- The adapter treats LangChain `tool` messages as `assistant` during compression. -- `@langchain/core` is an optional peer dependency; only needed if you use the adapter. - -## [2.2.0] - 2025-08-28 +## [2.1.0] - 2025-08-28 ### Breaking @@ -49,10 +27,18 @@ The format is based on Keep a Changelog, and this project adheres to Semantic Ve ### Added +- LangChain adapter under `src/adapters/langchain.ts` with helpers: + - `extractContent`, `roleFromMessageType`, `baseToSlim`, `slimToLangChain` + - `toSlimModel(llm)` wrapper to use LangChain `BaseChatModel` with `SummarizeCompressor`. 
+ - `compressLangChainHistory(history, options)` high-level helper for one-call compression on `BaseMessage[]`. +- Tests for adapter behavior in `tests/langchain.test.ts`. +- Examples: + - `examples/LANGCHAIN_EXAMPLE.md`: adapting a LangChain model to `SlimContextChatModel`. + - `examples/LANGCHAIN_COMPRESS_HISTORY.md`: using `compressLangChainHistory` directly. - `TokenEstimator` type for custom token estimation. - Docs and examples updated to reflect token-based configuration. -## [2.0.0] - 2025-08-24 +## [2.0.1] - 2025-08-24 ### Breaking @@ -89,3 +75,7 @@ The format is based on Keep a Changelog, and this project adheres to Semantic Ve ### Behavior - SummarizeCompressor alignment: after summarization, the first kept message following the summary is enforced to be a `user` message to maintain dialogue consistency. To achieve this while preserving recent context, the resulting message count may be `maxMessages - 1`, `maxMessages`, or `maxMessages + 1` depending on the split position. + +### Notes + +- `@langchain/core` is an optional peer dependency; only needed if you use the adapter. diff --git a/package.json b/package.json index 61c47b9..ca575a4 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "slimcontext", - "version": "2.2.0", + "version": "2.1.0", "description": "Lightweight, model-agnostic chat history compression (trim + summarize) for AI assistants.", "main": "dist/index.js", "types": "dist/index.d.ts",
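
---
Usage sketch (a reviewer's note, not part of the patches above): the token-threshold trimming introduced in `src/strategies/trim.ts` can be re-derived standalone as below. The names `trim`, `thresholdTokens`, and the `Msg` type are local to this sketch; the threshold math (`maxModelTokens * thresholdPercent`) is assumed from how the diff uses `computeThresholdTokens`, and the flat per-message `estimate` callback stands in for the library's token estimator.

```typescript
type Role = 'system' | 'user' | 'assistant';
interface Msg {
  role: Role;
  content: string;
}

// Threshold math assumed from computeThresholdTokens: a fraction of the model window.
function thresholdTokens(maxModelTokens: number, thresholdPercent: number): number {
  return Math.floor(maxModelTokens * thresholdPercent);
}

// Standalone re-derivation of the TrimCompressor algorithm in this patch:
// drop the oldest non-system messages until the estimated total falls under
// the threshold, never dropping system messages or the last `minRecent` messages.
function trim(
  messages: Msg[],
  opts: {
    maxModelTokens: number;
    thresholdPercent: number;
    minRecent: number;
    estimate: (m: Msg) => number;
  },
): Msg[] {
  const limit = thresholdTokens(opts.maxModelTokens, opts.thresholdPercent);
  const counts = messages.map(opts.estimate);
  let total = counts.reduce((a, b) => a + b, 0);
  if (total <= limit) return messages; // under budget: nothing to do

  const preserveFrom = Math.max(0, messages.length - opts.minRecent);
  const keep = messages.map(() => true);
  for (let i = 0; i < preserveFrom && total > limit; i++) {
    if (messages[i].role === 'system') continue; // system prompts always survive
    keep[i] = false;
    total -= counts[i];
  }
  return messages.filter((_, i) => keep[i]);
}

// Mirrors tests/trim.test.ts: six messages at 100 tokens each, threshold 400 * 0.5 = 200.
const history: Msg[] = [
  { role: 'system', content: 'sys' },
  { role: 'user', content: 'u1' },
  { role: 'assistant', content: 'a1' },
  { role: 'user', content: 'u2' },
  { role: 'assistant', content: 'a2' },
  { role: 'user', content: 'u3' },
];
const kept = trim(history, {
  maxModelTokens: 400,
  thresholdPercent: 0.5,
  minRecent: 2,
  estimate: () => 100,
});
console.log(kept.map((m) => m.content)); // ['sys', 'a2', 'u3']
```

As in the updated test, the result can still sit above the threshold when only system and protected recent messages remain: the strategy refuses to drop those, trading strict budget compliance for conversational integrity.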