# Command Vector
Rick Hightower edited this page Feb 2, 2026
```yaml
name: agent-brain-vector
description: Search using semantic vector similarity for concepts
parameters:
  - name: query
    description: The conceptual search query
    required: true
  - name: top-k
    description: Number of results to return (1-20)
    required: false
    default: 5
  - name: threshold
    description: Minimum similarity score (0.0-1.0)
    required: false
    default: 0.3
skills:
  - using-agent-brain
```
Performs semantic vector similarity search using embeddings. This mode understands meaning and concepts, finding relevant content even when exact terms don't match.
Vector search is ideal for:
- Conceptual questions ("how does X work")
- Natural language queries
- Finding related content
- Questions about purpose or design
- When exact terms are unknown
```
/agent-brain-vector <query> [--top-k <n>] [--threshold <t>]
```
| Parameter | Required | Default | Description |
|---|---|---|---|
| query | Yes | - | The conceptual search query |
| --top-k | No | 5 | Number of results (1-20) |
| --threshold | No | 0.3 | Minimum similarity (0.0-1.0) |
| Use Vector | Use BM25 Instead |
|---|---|
| "how does authentication work" | "AuthenticationError" |
| "best practices for caching" | "LRUCache" |
| "explain the deployment process" | "deploy.yml" |
| "security considerations" | "CVE-2024-1234" |
| "similar to user validation" | "validate_user" |
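The table above can be approximated with a rough client-side heuristic. This is a hypothetical sketch, not part of agent-brain itself: queries containing exact-looking tokens (CamelCase or snake_case identifiers, file names, CVE IDs) tend to suit BM25, while natural-language questions suit vector search.

```python
import re

def suggest_mode(query: str) -> str:
    # Hypothetical heuristic mirroring the table above: exact-looking
    # tokens suit BM25; natural-language questions suit vector search.
    exact_patterns = [
        r"\b[A-Z][a-z]+[A-Z]\w*\b",   # CamelCase identifiers (AuthenticationError)
        r"\b\w+_\w+\b",               # snake_case identifiers (validate_user)
        r"\b\w+\.\w{2,4}\b",          # file names (deploy.yml)
        r"\bCVE-\d{4}-\d+\b",         # CVE identifiers
    ]
    if any(re.search(p, query) for p in exact_patterns):
        return "bm25"
    return "vector"
```

When in doubt, hybrid mode (below, under related commands) covers both cases at the cost of extra latency.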
```bash
# Verify server is running
agent-brain status
```

If not running:

```bash
agent-brain start --daemon
```

Usage:

```bash
agent-brain query "<query>" --mode vector --top-k <k> --threshold <t>
```

Examples:

```bash
# Conceptual question
agent-brain query "how does the caching system work" --mode vector

# Natural language
agent-brain query "best practices for handling errors" --mode vector

# Find related content
agent-brain query "similar to user authentication flow" --mode vector

# Lower threshold for more results
agent-brain query "security considerations" --mode vector --threshold 0.2

# More results
agent-brain query "explain the API design" --mode vector --top-k 10
```

For each result, present:
- Source: File path or document name
- Score: Semantic similarity score (0-1, higher is better)
- Content: Relevant excerpt from the document
Example output:

```markdown
## Vector Search Results for "how does caching work"

### 1. docs/architecture/caching.md (Score: 0.92)

## Caching Architecture

The system implements a multi-tier caching strategy:

1. **L1 Cache (In-Memory LRU)**: Fast access for frequently used data
2. **L2 Cache (Redis)**: Distributed cache for cross-instance sharing
3. **L3 Cache (CDN)**: Edge caching for static assets

Cache invalidation uses a write-through strategy...

### 2. docs/performance/optimization.md (Score: 0.78)

## Caching for Performance

Proper cache configuration can improve response times by 10-100x.

**Configuration options:**
- TTL by resource type
- Cache warming strategies
- Invalidation webhooks

### 3. src/cache/redis_client.py (Score: 0.71)

class RedisCache:
    """
    Redis-based distributed cache implementation.

    Provides connection pooling, automatic reconnection,
    and configurable TTL per key prefix.
    """

---

Found 3 results above threshold 0.3
Search mode: vector (semantic)
Response time: 1247ms
```
When referencing results in responses:

- "The caching architecture is documented in docs/architecture/caching.md..."
- "According to docs/performance/optimization.md..."
**Error: Could not connect to Agent Brain server**

Resolution:

```bash
agent-brain start --daemon
```

**No results found above threshold 0.3**

Resolution:

- Try lowering the threshold: `--threshold 0.1`
- Rephrase the query
- Use hybrid mode for mixed queries: `--mode hybrid`
- Verify documents are indexed: `agent-brain status`

**Error: OPENAI_API_KEY not set**

Resolution:

```bash
export OPENAI_API_KEY="sk-proj-..."
# Or use local embeddings:
export EMBEDDING_PROVIDER=ollama
export EMBEDDING_MODEL=nomic-embed-text
```

**Error: Failed to generate embedding**

Resolution:

- Check that the API key is valid
- For Ollama: ensure the model is pulled and the server is running
- Verify the embedding provider configuration: `agent-brain verify`

**Warning: No documents indexed**

Resolution:

```bash
agent-brain index /path/to/docs
```

| Metric | Typical Value |
|---|---|
| Latency | 800-1500ms |
| API calls | 1 embedding call per query |
| Best for | Conceptual queries, natural language |
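Since each query costs one embedding API call (per the table above), repeated queries can be made cheaper with a small client-side cache. A minimal sketch, assuming a hypothetical `embed()` stand-in for the real provider call:

```python
from functools import lru_cache

def embed(query: str) -> list[float]:
    # Hypothetical stand-in: in practice this would be one network
    # round-trip to the embedding provider (OpenAI, Ollama, ...).
    return [float(len(query)), float(query.count(" "))]

@lru_cache(maxsize=256)
def embed_cached(query: str) -> tuple[float, ...]:
    # lru_cache requires hashable return values, hence the tuple.
    return tuple(embed(query))
```

Repeating a query then hits the cache instead of the API, removing the embedding call from the 800-1500ms latency budget.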
| Aspect | Vector | BM25 | Hybrid |
|---|---|---|---|
| Speed | Slow | Fast | Medium |
| Exact terms | Poor | Excellent | Good |
| Concepts | Excellent | Poor | Good |
| API cost | Per query | Free | Per query |
Vector search quality depends on:
- Embedding model: text-embedding-3-large > small > ada-002
- Document quality: Well-written docs match better
- Query phrasing: Natural language works best
- Threshold setting: Lower for more recall, higher for precision
1. **Query embedding**: Your query is converted to a vector (e.g., 3072 dimensions for OpenAI large)
2. **Similarity calculation**: Cosine similarity is computed against all indexed document vectors
3. **Ranking**: Documents are sorted by similarity score
4. **Filtering**: Results below the threshold are excluded
```
similarity(q, d) = dot(q, d) / (||q|| * ||d||)
```
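The four steps above can be sketched in a few lines of NumPy. This is an illustration of the ranking math, not agent-brain's actual implementation; the query and document vectors are assumed to come from the embedding provider:

```python
import numpy as np

def cosine_similarity(q: np.ndarray, d: np.ndarray) -> float:
    # similarity(q, d) = dot(q, d) / (||q|| * ||d||)
    return float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))

def vector_search(query_vec, doc_vecs, top_k=5, threshold=0.3):
    # Score every indexed document vector against the query vector,
    # drop results below the threshold, sort descending, keep top_k.
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored = [(i, s) for i, s in scored if s >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Lowering `threshold` admits more marginal matches (recall); raising it keeps only close matches (precision), which is why the troubleshooting steps above suggest `--threshold 0.1` when no results come back.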
- `/agent-brain-bm25` - Pure keyword search
- `/agent-brain-hybrid` - Combined BM25 + semantic
- `/agent-brain-semantic` - Alias for vector search
- `/agent-brain-multi` - Multi-mode fusion search