Rick Hightower edited this page Feb 2, 2026 · 5 revisions

Agent Brain

A production-grade RAG (Retrieval-Augmented Generation) document indexing and semantic search system designed for AI agents and applications.

Overview

Agent Brain is a monorepo containing four packages:

| Package | Description |
| --- | --- |
| agent-brain-server | FastAPI REST API for document indexing and semantic search |
| agent-brain-cli | Command-line interface for managing the server |
| agent-brain-skill | Claude Code skill for AI-powered documentation queries |
| agent-brain-plugin | Claude Code plugin with 24 commands, 3 agents, and 2 skills |

Key Features

Search Capabilities

  • Hybrid Search: Combines semantic (vector) and keyword (BM25) retrieval with a tunable alpha weight
  • GraphRAG (NEW): Knowledge graph-based retrieval with entity relationships
  • Multi-Mode Search: VECTOR, BM25, HYBRID, GRAPH, and MULTI (reciprocal rank fusion, RRF)
  • AST-Aware Code Indexing: 9+ languages with tree-sitter parsing
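
The alpha weighting in hybrid search can be read as a weighted blend of the two score sets. The sketch below is illustrative only: the function names, the min-max normalization step, and the convention that alpha = 1.0 means "vector only" are assumptions, not Agent Brain's actual implementation.

```python
from typing import Dict


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so vector and BM25 scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores identical: treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_blend(
    vector_scores: Dict[str, float],
    bm25_scores: Dict[str, float],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Blend normalized scores: alpha * vector + (1 - alpha) * BM25.

    Assumed convention: alpha = 1.0 is pure vector, alpha = 0.0 is pure BM25.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    v = min_max_normalize(vector_scores)
    b = min_max_normalize(bm25_scores)
    # A document missing from one result set contributes 0 for that side.
    return {
        doc: alpha * v.get(doc, 0.0) + (1.0 - alpha) * b.get(doc, 0.0)
        for doc in set(v) | set(b)
    }


blended = hybrid_blend({"a": 0.9, "b": 0.1}, {"a": 2.0, "b": 8.0}, alpha=0.5)
```

Normalizing first matters because raw BM25 scores are unbounded while cosine similarities are not, so an unnormalized blend would let BM25 dominate.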

Pluggable Providers (NEW)

  • Embedding Providers: OpenAI, Ollama, Cohere
  • Summarization Providers: Anthropic, OpenAI, Gemini, Grok, Ollama
  • Fully Local Mode: Run completely offline with Ollama

Architecture

  • Multi-Instance Architecture: Per-project isolation with automatic port allocation
  • Singleton Services: Shared service instances for efficiency
  • Async Throughout: All I/O operations are async for performance
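
Per-project isolation with automatic port allocation can be pictured as a small registry that maps each project root to its own port. This is a minimal sketch under stated assumptions: the `allocate_port` helper, the in-memory registry, and the 8000-8099 port range are hypothetical, and how Agent Brain actually assigns and records ports is not described on this page.

```python
from typing import Dict


def allocate_port(
    project_root: str,
    registry: Dict[str, int],
    start: int = 8000,  # assumed range, chosen only for illustration
    end: int = 8100,
) -> int:
    """Return the port already assigned to a project, or claim the lowest free one."""
    if project_root in registry:
        # A project keeps its port, so repeated starts stay stable.
        return registry[project_root]
    used = set(registry.values())
    for port in range(start, end):
        if port not in used:
            registry[project_root] = port
            return port
    raise RuntimeError(f"no free port in range {start}-{end - 1}")


registry: Dict[str, int] = {}
port_a = allocate_port("/work/project-a", registry)
port_b = allocate_port("/work/project-b", registry)
```

A real implementation would also need to check that the chosen port is free at the operating-system level, which this sketch leaves out.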

Integration

  • REST API: Full OpenAPI-documented REST interface
  • Claude Code Plugin: 24 commands for complete workflow integration
  • Knowledge Agents: Specialized agents for research tasks

Quick Start

```bash
# Install the CLI
pip install agent-brain-cli

# Initialize project
agent-brain init

# Start server
agent-brain start

# Index your codebase
agent-brain index .

# Query with semantic search
agent-brain query "how does authentication work"

# Query with GraphRAG (requires ENABLE_GRAPH_INDEX=true)
agent-brain query "what uses UserService" --mode graph
```
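
When queries are issued from a script rather than typed by hand, it can help to build the command line programmatically. The sketch below only assembles the argument list and does not run anything. The `build_query_command` helper is hypothetical, and of the mode spellings only `graph` appears in the commands above; the other lowercase names are assumed from the documented mode list.

```python
import shlex
from typing import List, Optional

# Only "graph" is confirmed by the Quick Start; the rest are assumed spellings.
KNOWN_MODES = {"vector", "bm25", "hybrid", "graph", "multi"}


def build_query_command(query: str, mode: Optional[str] = None) -> List[str]:
    """Build the argv for `agent-brain query`, optionally with --mode."""
    argv = ["agent-brain", "query", query]
    if mode is not None:
        if mode not in KNOWN_MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        argv += ["--mode", mode]
    return argv


# shlex.join quotes the query so the printed line is safe to paste into a shell.
print(shlex.join(build_query_command("what uses UserService", mode="graph")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with queries that contain spaces or quotes.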

Documentation

Getting Started

Architecture & Design

Core Architecture

Query System

Indexing System

Deployment

Server & CLI

API Reference

Search Guides

Provider Configuration

Plugin & Skill


Plugin Commands

Search Commands

| Command | Description |
| --- | --- |
| Command-Search | Multi-mode search with automatic mode selection |
| Command-Semantic | Pure vector/semantic search |
| Command-Keyword | Alias for BM25 keyword search |
| Command-BM25 | BM25 keyword search |
| Command-Vector | Vector (semantic) search |
| Command-Hybrid | Hybrid search (BM25 + Vector) |
| Command-Graph | GraphRAG knowledge graph search |
| Command-Multi | Multi-mode RRF fusion search |

Server Management Commands

| Command | Description |
| --- | --- |
| Command-Start | Start the Agent Brain server |
| Command-Stop | Stop the running server |
| Command-Status | Check server status |
| Command-List | List running Agent Brain instances |
| Command-Index | Index documents or code |
| Command-Reset | Reset/clear the index |

Setup & Configuration Commands

| Command | Description |
| --- | --- |
| Command-Init | Initialize project for Agent Brain |
| Command-Install | Install Agent Brain CLI |
| Command-Setup | Interactive setup wizard |
| Command-Config | Configure settings |
| Command-Verify | Verify configuration |
| Command-Help | Show help |
| Command-Version | Show version information |

Provider Configuration Commands

| Command | Description |
| --- | --- |
| Command-Providers | List and configure providers |
| Command-Embeddings | Configure embedding provider |
| Command-Summarizer | Configure summarization provider |

Plugin Agents

| Agent | Description |
| --- | --- |
| Agent-Search-Assistant | Intelligent search assistant for complex queries |
| Agent-Setup-Assistant | Guided setup and configuration assistant |
| Agent-Research-Assistant | Research assistant for deep exploration |

Feature Specifications

Feature 103 - Pluggable Providers (NEW)

Feature 109 - Multi-Instance Architecture

Feature 113 - GraphRAG Integration

Feature 114 - Agent Brain Plugin

Roadmaps & Planning

Migration & Legacy


Search Modes

| Mode | Algorithm | Best For |
| --- | --- | --- |
| VECTOR | Cosine similarity | Conceptual queries, "how does X work" |
| BM25 | TF-IDF + BM25 | Exact terms, error messages, symbols |
| HYBRID | Vector + BM25 (alpha blend) | General search (default) |
| GRAPH | Knowledge graph traversal | Entity relationships |
| MULTI | RRF over all modes | Maximum recall |
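
MULTI mode fuses the ranked lists from the other modes with reciprocal rank fusion (RRF). The sketch below shows the standard RRF formula, where each document scores the sum of 1 / (k + rank) over every list it appears in. The function name and the constant k = 60 (a commonly used default) are assumptions rather than values confirmed for Agent Brain.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


fused = reciprocal_rank_fusion(
    [
        ["a", "b", "c"],  # e.g. vector results
        ["b", "a", "d"],  # e.g. BM25 results
        ["b", "c", "a"],  # e.g. graph results
    ]
)
```

Because RRF works only on ranks, it needs no score normalization, which makes it a natural fit for combining retrievers whose scores live on different scales.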

Supported Providers

Embedding Providers

| Provider | Models | Local |
| --- | --- | --- |
| OpenAI | text-embedding-3-large, text-embedding-3-small | No |
| Ollama | nomic-embed-text, mxbai-embed-large | Yes |
| Cohere | embed-english-v3.0, embed-multilingual-v3.0 | No |

Summarization Providers

| Provider | Models | Local |
| --- | --- | --- |
| Anthropic | claude-haiku-4-5-20251001, claude-sonnet-4-5-20250514 | No |
| OpenAI | gpt-5, gpt-5-mini | No |
| Gemini | gemini-3-flash, gemini-3-pro | No |
| Grok | grok-4, grok-4-fast | No |
| Ollama | llama4:scout, mistral-small3.2, qwen3-coder | Yes |

Technology Stack

  • Server: FastAPI + Uvicorn
  • Vector Store: ChromaDB (HNSW, cosine similarity)
  • BM25 Index: LlamaIndex BM25Retriever
  • Graph Store: SimplePropertyGraphStore / Kuzu
  • Embeddings: OpenAI text-embedding-3-large (3072 dimensions) or Ollama
  • Summarization: Claude Haiku or Ollama
  • AST Parsing: tree-sitter (9+ languages)
  • CLI: Click + Rich
  • Build System: Poetry
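
The vector store listed above ranks chunks by cosine similarity between the query embedding and each stored embedding. As a rough illustration of that metric only (ChromaDB computes it internally over an HNSW index, and the helper below is a hypothetical pure-Python version):

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Similarity is undefined for a zero vector; return 0.0 by convention here.
        return 0.0
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Cosine similarity ignores vector length, so two embeddings pointing the same way score as identical regardless of magnitude.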

Contributing

See the Developer-Guide for setup instructions.

Before pushing changes, always run:

```bash
task before-push
```

License

MIT License - see LICENSE file for details.
