Add long-term memory system with hybrid search#8

Merged
initializ-mk merged 1 commit into main from core/memory on Feb 26, 2026

@initializ-mk (Contributor)

Summary

  • Embedder interface + providers: Embedder abstraction with OpenAI (text-embedding-3-small), Gemini, and Ollama (nomic-embed-text) implementations, plus factory with auto-detection and Anthropic fallback handling
  • Long-term memory package (forge-core/memory/): file-based storage (daily logs + curated MEMORY.md), text chunker with paragraph/sentence-aware overlap, FileVectorStore with pluggable VectorStore interface, and hybrid search engine combining vector cosine similarity, keyword overlap, and temporal decay (7-day half-life, MEMORY.md evergreen)
  • Memory tools: memory_search and memory_get builtin tools enabling agents to query their own long-term memory
  • Compactor integration: MemoryFlusher hook captures tool results and decisions before compaction discards old messages, writing them to the daily observation log
  • Runner wiring: embedder auto-resolution from LLM provider config, memory.Manager lifecycle, conditional tool registration, and background indexing at startup
  • Graceful degradation: no embedder → keyword-only search, no memory dir → skip without crash, corrupted index → rebuild from source files

Enable via memory.long_term: true in forge.yaml or FORGE_MEMORY_LONG_TERM=true env var.

Test plan

  • Embedder factory tests (provider routing, defaults)
  • FileStore tests (read/write, path traversal prevention, daily logs, MEMORY.md template)
  • Chunker tests (paragraph splitting, overlap, empty/large text, unique IDs)
  • VectorStore tests (index, search, delete, persistence round-trip, cosine similarity)
  • Hybrid search tests (vector+keyword, keyword-only, temporal decay, tokenization)
  • Manager tests (full pipeline, mock embedder, graceful degradation on empty index)
  • Memory tools tests (memory_search, memory_get with valid/invalid inputs)
  • Compactor MemoryFlusher tests (observation capture, SetMemoryFlusher wiring)
  • All existing tests pass across forge-core, forge-cli, forge-plugins
  • Zero golangci-lint issues across all modules
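The FileStore tests mention path traversal prevention. The PR does not show how the check is done; one common approach in Go is to join the requested name onto the memory directory and confirm the result is still inside it. The sketch below assumes a hypothetical `resolveInside` helper and does not account for symlinks.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errTraversal is returned when a requested name would leave the base directory.
var errTraversal = errors.New("path escapes memory directory")

// resolveInside joins name onto baseDir and rejects absolute names and any
// result that falls outside baseDir after cleaning. (Hypothetical helper.)
func resolveInside(baseDir, name string) (string, error) {
	if filepath.IsAbs(name) {
		return "", errTraversal
	}
	full := filepath.Join(baseDir, name) // Join also cleans ".." segments
	rel, err := filepath.Rel(baseDir, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errTraversal
	}
	return full, nil
}

func main() {
	for _, name := range []string{"MEMORY.md", "../secrets.txt", "logs/../../etc/passwd"} {
		p, err := resolveInside("/data/memory", name)
		fmt.Printf("%q -> %q, err=%v\n", name, p, err)
	}
}
```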

Implement persistent cross-session agent memory inspired by OpenClaw's
memory architecture. Agents can now accumulate knowledge across tasks
via daily observation logs and a curated MEMORY.md file.

Key components:
- Embedder interface with OpenAI, Gemini, and Ollama providers
- File-based memory store with daily logs and evergreen MEMORY.md
- Text chunker with paragraph/sentence-aware overlap splitting
- FileVectorStore (JSON-backed, pluggable VectorStore interface)
- Hybrid search: vector cosine similarity + keyword overlap + temporal
  decay (7-day half-life, MEMORY.md exempt)
- memory_search and memory_get builtin tools for agent self-service
- MemoryFlusher hook in compactor to capture observations before discard
- Full runner integration with embedder auto-detection and fallback
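To illustrate the chunker item above, here is a minimal sketch of paragraph-aware splitting with overlap. It is not the PR's chunker: the `chunkParagraphs` name, the byte-based size limit, and the choice of one overlapping paragraph are all assumptions, and sentence-level splitting of oversized paragraphs is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkParagraphs splits text on blank lines and packs paragraphs into chunks
// of roughly maxBytes. Every chunk after the first starts with the last
// paragraph of the previous chunk, so context carries across the boundary.
// Overlap and oversized paragraphs can push a chunk past maxBytes.
func chunkParagraphs(text string, maxBytes int) []string {
	var paras []string
	for _, p := range strings.Split(text, "\n\n") {
		if p = strings.TrimSpace(p); p != "" {
			paras = append(paras, p)
		}
	}

	var chunks, cur []string
	size := 0
	for _, p := range paras {
		if len(cur) > 0 && size+len(p) > maxBytes {
			chunks = append(chunks, strings.Join(cur, "\n\n"))
			last := cur[len(cur)-1]
			cur = []string{last} // seed the next chunk with the overlap
			size = len(last)
		}
		cur = append(cur, p)
		size += len(p)
	}
	if len(cur) > 0 {
		chunks = append(chunks, strings.Join(cur, "\n\n"))
	}
	return chunks
}

func main() {
	text := "First observation.\n\nSecond observation.\n\nThird observation."
	for i, c := range chunkParagraphs(text, 40) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```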

Graceful degradation: no embedder → keyword-only search, no memory dir
→ skip without crash, corrupted index → rebuild from source files.
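Two of the degradation paths above can be sketched in a few lines of Go. The `Embedder` interface shape, the `chooseMode` and `loadIndex` helpers, and the `entry` type are hypothetical stand-ins chosen for illustration; the PR only states the behavior (nil embedder means keyword-only search, an unreadable index is rebuilt from source files).

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
)

// Embedder is a hypothetical shape for the embedder abstraction.
type Embedder interface {
	Embed(ctx context.Context, texts []string) ([][]float32, error)
}

// stubEmbedder is a placeholder implementation used only in this sketch.
type stubEmbedder struct{}

func (stubEmbedder) Embed(_ context.Context, texts []string) ([][]float32, error) {
	return make([][]float32, len(texts)), nil
}

type Mode string

const (
	ModeHybrid      Mode = "hybrid"
	ModeKeywordOnly Mode = "keyword-only"
)

// chooseMode falls back to keyword-only search when no embedder is configured.
func chooseMode(e Embedder) Mode {
	if e == nil {
		return ModeKeywordOnly
	}
	return ModeHybrid
}

// entry is a hypothetical record in a JSON-backed index.
type entry struct {
	ID   string `json:"id"`
	Text string `json:"text"`
}

// loadIndex decodes the persisted index. If the bytes are not valid JSON,
// it discards them and calls rebuild to regenerate entries from source files.
func loadIndex(raw []byte, rebuild func() []entry) (entries []entry, rebuilt bool) {
	if err := json.Unmarshal(raw, &entries); err != nil {
		return rebuild(), true
	}
	return entries, false
}

func main() {
	fmt.Println(chooseMode(nil), chooseMode(stubEmbedder{}))

	rebuild := func() []entry { return []entry{{ID: "rebuilt", Text: "from MEMORY.md"}} }
	entries, rebuilt := loadIndex([]byte("{truncated"), rebuild)
	fmt.Println(len(entries), rebuilt)
}
```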
initializ-mk merged commit b9d9d07 into main on Feb 26, 2026
9 checks passed