Memory System
How the bot persists and recalls conversation context across sessions.
See also: RAG Integration for long-term semantic memory via knowledge graphs, Configuration for environment variables, Model Routing for context overflow and compaction.
The bot has a multi-layered memory system that ensures continuity across conversations:
```
+----------------------------------------------------------+
| System Prompt                                            |
|                                                          |
|   # Memory                 <-- MemoryService             |
|   ## Long-term Memory      <-- MEMORY.md (persistent)    |
|   ## Today's Notes         <-- 2026-02-07.md (daily log) |
|   ## Recent Context        <-- last 7 days of notes      |
|                                                          |
|   # Relevant Memory        <-- RAG (see RAG Integration) |
|                                                          |
|   [conversation messages]  <-- SessionService (in-memory)|
+----------------------------------------------------------+
```
| Layer | Storage | Scope | Survives /new? | Written By |
|---|---|---|---|---|
| Session messages | `sessions/{id}.json` | Current conversation | No (cleared) | SessionService |
| Daily notes | `memory/YYYY-MM-DD.md` | Today + recent N days | Yes | MemoryPersistSystem |
| Long-term memory | `memory/MEMORY.md` | Permanent | Yes | LLM (via tools) or user |
| RAG | LightRAG knowledge graph | All past conversations | Yes | RagIndexingSystem |
This guide covers the first three layers. For RAG, see RAG Integration.
Session messages are the most immediate form of memory: the full conversation history within the current session.
| Field | Type | Description |
|---|---|---|
| `id` | string | Format: `{channelType}:{chatId}` (e.g., `telegram:12345`) |
| `channelType` | string | Channel identifier (e.g., `telegram`) |
| `chatId` | string | Chat identifier within channel |
| `messages` | `List<Message>` | Full conversation history |
| `metadata` | `Map` | Session metadata |
| `state` | enum | `ACTIVE`, `PAUSED`, `TERMINATED` |
| `createdAt` | `Instant` | Session creation time |
| `updatedAt` | `Instant` | Last message time |
Source: `AgentSession.java`
Manages session lifecycle with in-memory caching and filesystem persistence.
Storage: sessions/{channelType}:{chatId}.json
Key operations:
| Method | Description |
|---|---|
| `getOrCreate(channelType, chatId)` | Lazy load from disk or create new session |
| `save(session)` | Persist to disk + update cache |
| `delete(sessionId)` | Remove from cache and disk |
| `clearMessages(sessionId)` | Wipe message history (used by the `/new` command) |
| `getMessageCount(sessionId)` | Count messages in session |
Sessions are cached in a ConcurrentHashMap for fast access. Disk persistence happens on every save() call.
Source: `SessionService.java`
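The cache-then-disk lookup behind `getOrCreate` can be sketched as follows. This is a minimal illustration under stated assumptions: the session type is reduced to a `String`, and `loadFromDiskOrCreate` is a hypothetical stand-in for the real JSON deserialization from `sessions/{id}.json`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of SessionService's cache-then-disk lookup (names follow the
// table above; the real Session type is simplified to a String here).
public class SessionCacheSketch {

    // In-memory cache keyed by "{channelType}:{chatId}"
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public static String sessionId(String channelType, String chatId) {
        return channelType + ":" + chatId; // e.g. "telegram:12345"
    }

    public String getOrCreate(String channelType, String chatId) {
        // computeIfAbsent = lazy load: hit the cache first, fall back
        // to disk or create a fresh session on a miss
        return cache.computeIfAbsent(sessionId(channelType, chatId),
                this::loadFromDiskOrCreate);
    }

    private String loadFromDiskOrCreate(String id) {
        // Hypothetical: real code would deserialize sessions/{id}.json
        return "new-session:" + id;
    }
}
```

The `ConcurrentHashMap` gives thread-safe reads without locking, matching the "cached for fast access" note above.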
When conversation history grows too large for the model's context window, it can be compacted:
Manual compaction -- `/compact [N]` command:
- `CompactionService.summarize()` sends old messages to the default (balanced) LLM model
- The LLM produces a concise summary
- `SessionService.compactWithSummary()` replaces the old messages with a summary message + the last N messages
Automatic compaction -- `AutoCompactionSystem` (order=18):
- Estimates token count: `sum(message.length) / 3.5 + 8000` (system prompt overhead)
- Compares against the model's `maxInputTokens * 0.8` from `models.json`
- If exceeded, triggers the same compaction flow
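The estimate and threshold check can be expressed directly. This is a sketch using the constants quoted above; the class and method names are illustrative, not the real `AutoCompactionSystem`:

```java
import java.util.List;

// Sketch of the overflow check: chars / 3.5 + fixed system-prompt
// overhead, compared against 80% of the model's input limit.
public class AutoCompactionMath {

    static final double CHARS_PER_TOKEN = 3.5;
    static final int SYSTEM_PROMPT_OVERHEAD = 8000;
    static final double SAFETY_FACTOR = 0.8;

    // Rough token estimate: total message characters / 3.5 + overhead
    public static long estimateTokens(List<String> messages) {
        long chars = messages.stream().mapToLong(String::length).sum();
        return Math.round(chars / CHARS_PER_TOKEN) + SYSTEM_PROMPT_OVERHEAD;
    }

    // Compaction triggers when the estimate exceeds 80% of the limit
    public static boolean needsCompaction(List<String> messages, int maxInputTokens) {
        return estimateTokens(messages) > maxInputTokens * SAFETY_FACTOR;
    }
}
```

For example, 3,500 characters of messages estimate to 1,000 tokens plus the 8,000-token overhead, which would already exceed a 10,000-token model at the 0.8 safety factor.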
Summary message format:
```
[Conversation summary]
User discussed deploying a Spring Boot app with Docker. Key decisions:
- Using Jib for image builds (multi-stage, ~180MB)
- Docker Compose for orchestration with health checks
- LightRAG container alongside the main bot
```
The summary is stored as a system role message at the beginning of the conversation history.
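The replacement step can be sketched as follows, under stated assumptions: `Msg` is a stand-in for the real message type, and the method shape mirrors the `compactWithSummary` behavior described above (summary as a system message up front, last N messages kept).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of compactWithSummary: replace everything but the last N
// messages with one system-role "[Conversation summary]" message.
public class CompactionSketch {

    public record Msg(String role, String text) {}

    public static List<Msg> compactWithSummary(List<Msg> history,
                                               String summary, int keepLast) {
        int keepFrom = Math.max(0, history.size() - keepLast);
        List<Msg> result = new ArrayList<>();
        // Summary goes first, as a system message
        result.add(new Msg("system", "[Conversation summary]\n" + summary));
        // Then the most recent N messages, untouched
        result.addAll(history.subList(keepFrom, history.size()));
        return result;
    }
}
```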
CompactionService details:
- Model: balanced tier (low reasoning, temperature 0.3)
- Max output: 500 tokens
- Timeout: 15 seconds
- Filters out tool result messages for cleaner summaries
- Truncates individual messages to 300 chars before summarization
- Falls back to simple truncation (drop oldest, keep last N) if LLM unavailable
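The truncation fallback amounts to keeping the tail of the history. A sketch, with `keepLast` as a hypothetical helper name:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the fallback path: if the LLM summarizer is unavailable,
// drop the oldest messages and keep only the last N.
public class TruncationFallback {

    public static List<String> keepLast(List<String> messages, int n) {
        if (messages.size() <= n) {
            return new ArrayList<>(messages); // nothing to drop
        }
        // Drop oldest; keep the tail of n messages
        return new ArrayList<>(
                messages.subList(messages.size() - n, messages.size()));
    }
}
```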
Source: `CompactionService.java`, `SessionService.java`, `AutoCompactionSystem.java`
See: Model Routing for the full 3-layer context overflow protection.
MemoryPersistSystem (order=50) automatically logs each conversation exchange as a timestamped note in a daily file.
After every LLM response, the system:
- Extracts the last user message and the LLM response
- Truncates: user to 200 chars, assistant to 300 chars
- Replaces newlines with spaces
- Formats as: `[HH:mm] User: {text} | Assistant: {text}`
- Appends to today's file: `memory/YYYY-MM-DD.md`
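The steps above can be sketched as follows. The 200/300-character limits come from the list above; `clean` and `formatEntry` are hypothetical helper names, not the real `MemoryPersistSystem` internals:

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Sketch of the daily-note formatting: truncate, flatten newlines,
// prefix with a [HH:mm] timestamp.
public class DailyNoteFormat {

    // Replace newlines with spaces, then truncate to maxLen chars
    static String clean(String text, int maxLen) {
        String flat = text.replace('\n', ' ').replace('\r', ' ');
        return flat.length() <= maxLen ? flat : flat.substring(0, maxLen);
    }

    public static String formatEntry(LocalTime time, String user, String assistant) {
        String stamp = time.format(DateTimeFormatter.ofPattern("HH:mm"));
        return "[" + stamp + "] User: " + clean(user, 200)
                + " | Assistant: " + clean(assistant, 300);
    }
}
```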
Example daily file (memory/2026-02-07.md):
```
[14:15] User: How do I configure Docker health checks? | Assistant: Add a healthcheck section to your docker-compose.yml with test, interval, and timeout...
[14:30] User: What about restart policies? | Assistant: Use restart: unless-stopped for production. This restarts the container unless explicitly stopped...
[14:45] User: Show me the full compose file | Assistant: Here's a complete docker-compose.yml with health checks, restart policies, and volume mounts...
```
Source:
MemoryPersistSystem.java
If persistence fails (e.g., disk full, permission denied), the error is logged as a warning and the response pipeline continues. Memory persistence is fail-safe -- it never blocks the user from getting a response.
MEMORY.md is a persistent file that survives conversation resets and contains curated knowledge the LLM should always remember.
Path: memory/MEMORY.md (within workspace: ~/.golemcore/workspace/memory/MEMORY.md)
On every request, ContextBuildingSystem loads MEMORY.md content and includes it under ## Long-term Memory in the system prompt. This gives the LLM persistent knowledge across all sessions.
MEMORY.md can be updated in two ways:
- By the LLM -- using the `filesystem` tool to write to the memory directory (if the skill or prompt instructs it to maintain persistent notes)
- By the user -- manually editing the file at `~/.golemcore/workspace/memory/MEMORY.md`
The MemoryComponent interface provides writeLongTerm(content) and readLongTerm() methods.
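A sketch of that contract, with a toy in-memory implementation for illustration. The signatures are assumptions based on the method names listed here and in the class overview table, not the real API:

```java
// Hedged sketch of the MemoryComponent contract; parameter and
// return types are assumptions, the real interface may differ.
public interface MemoryComponentSketch {
    String readLongTerm();              // contents of memory/MEMORY.md
    void writeLongTerm(String content); // overwrite MEMORY.md
    String readToday();                 // today's memory/YYYY-MM-DD.md
    void appendToday(String entry);     // append one [HH:mm] note line
}

// Toy implementation backed by strings instead of files
class InMemoryMemory implements MemoryComponentSketch {
    private String longTerm = "";
    private final StringBuilder today = new StringBuilder();

    public String readLongTerm() { return longTerm; }
    public void writeLongTerm(String content) { longTerm = content; }
    public String readToday() { return today.toString(); }
    public void appendToday(String entry) { today.append(entry).append('\n'); }
}
```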
MemoryService.getMemoryContext() calls Memory.toContext() which formats all memory layers into markdown for the system prompt:
```
## Long-term Memory
User prefers concise responses.
Project uses Spring Boot 4.0.2 with Java 17.
Database is PostgreSQL on port 5432.

## Today's Notes
[14:15] User: How do I configure health checks? | Assistant: Add healthcheck section...
[14:30] User: What about restart? | Assistant: Use restart: unless-stopped...

## Recent Context
### 2026-02-06
[10:00] User: Set up CI/CD pipeline | Assistant: Created GitHub Actions workflow...
[10:30] User: Add test stage | Assistant: Added test job with mvn test...

### 2026-02-05
[16:00] User: Initialize project | Assistant: Created Spring Boot project with...
```

Sections included:
- `## Long-term Memory` -- only if `MEMORY.md` has content
- `## Today's Notes` -- only if today's file exists and has content
- `## Recent Context` -- includes the last N days (default 7), with each day as a `###` subsection. Only days with content are shown.
If all sections are empty, toContext() returns an empty string and the # Memory header is not included in the system prompt.
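The conditional assembly can be sketched as follows. Names are illustrative; the real `Memory.toContext()` may differ in details:

```java
import java.util.Map;

// Sketch of toContext(): each section is emitted only when non-empty;
// an all-empty result returns "" so the # Memory header is omitted.
public class MemoryContextSketch {

    public static String toContext(String longTerm, String todayNotes,
                                   Map<String, String> recentDays) {
        StringBuilder sb = new StringBuilder();
        if (!longTerm.isBlank()) {
            sb.append("## Long-term Memory\n").append(longTerm.strip()).append("\n\n");
        }
        if (!todayNotes.isBlank()) {
            sb.append("## Today's Notes\n").append(todayNotes.strip()).append("\n\n");
        }
        // recentDays maps "YYYY-MM-DD" -> that day's notes
        StringBuilder recent = new StringBuilder();
        for (Map.Entry<String, String> day : recentDays.entrySet()) {
            if (!day.getValue().isBlank()) {
                recent.append("### ").append(day.getKey()).append('\n')
                      .append(day.getValue().strip()).append('\n');
            }
        }
        if (recent.length() > 0) {
            sb.append("## Recent Context\n").append(recent);
        }
        return sb.toString().strip(); // "" when every layer is empty
    }
}
```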
Source: `Memory.java`, `MemoryService.java`
```
~/.golemcore/workspace/
+-- memory/
|   +-- MEMORY.md         # Long-term persistent memory
|   +-- 2026-02-07.md     # Today's notes
|   +-- 2026-02-06.md     # Yesterday's notes
|   +-- 2026-02-05.md     # ...
|   +-- ...               # Up to recent-days files loaded
|
+-- sessions/
    +-- telegram:12345.json   # Session with messages, metadata
    +-- ...
```
Runtime config (stored in preferences/runtime-config.json):
```json
{
  "memory": {
    "enabled": true,
    "recentDays": 7
  },
  "compaction": {
    "enabled": true,
    "maxContextTokens": 50000,
    "keepLastMessages": 20
  }
}
```

Storage paths/directories are Spring properties (see Configuration).
| Order | System | Memory Behavior |
|---|---|---|
| 18 | `AutoCompactionSystem` | Compacts session messages if context too large |
| 20 | `ContextBuildingSystem` | Reads memory context (MEMORY.md + daily notes + recent days) and injects into system prompt |
| 30 | `ToolLoopExecutionSystem` | LLM call + tool execution; system prompt includes `# Memory` section; LLM can write to MEMORY.md via filesystem tool |
| 50 | `MemoryPersistSystem` | Writes today's notes -- appends `[HH:mm] User: ... \| Assistant: ...` entries |
| 55 | `RagIndexingSystem` | Indexes exchange to LightRAG (separate from this memory system) |
The read path (order=20) runs before the write path (order=50), so the LLM sees the memory state before the current exchange is recorded.
| Class | Package | Purpose |
|---|---|---|
| `Memory` | `domain.model` | Data model: longTermContent, todayNotes, recentDays; `toContext()` formatting |
| `MemoryComponent` | `domain.component` | Interface: getMemory, readLongTerm, writeLongTerm, readToday, appendToday |
| `MemoryService` | `domain.service` | Implementation: loads from storage, builds Memory, formats context |
| `MemoryPersistSystem` | `domain.system` | Order=50: appends conversation exchanges to daily notes |
| `SessionService` | `domain.service` | Session CRUD, message history, compaction operations |
| `CompactionService` | `domain.service` | LLM-powered summarization for context overflow |
| `AutoCompactionSystem` | `domain.system` | Order=18: automatic compaction when context exceeds threshold |
| `AgentSession` | `domain.model` | Session model: id, messages, state, timestamps |
Example log output:

```
[Context] Memory context: 1250 chars
[MemoryPersist] Appended memory entry (85 chars)
[AutoCompact] Context too large: ~150000 tokens (threshold 102400), 45 messages. Compacting...
[AutoCompact] Compacted with LLM summary: removed 35 messages, kept 10
```
| Command | Description |
|---|---|
| `/status` | Shows session message count and memory state |
| `/compact [N]` | Manually compacts the conversation, keeping the last N messages |
| `/stop` | Interrupts the current run (messages are queued until your next message) |
| `/new` | Clears session messages (memory files preserved) |
See also: RAG Integration, Configuration, Model Routing
GolemCore Bot -- Apache License 2.0