feat: add embedding support with MLX integration and vector search foundation #48
Open
…undation

Add foundation for embedding-based semantic search with platform-aware MLX support.

## Key Features

### Embedding Infrastructure
- Add `embedding` column (BLOB, nullable) to messages table for 768-dimensional vectors
- Update Message model with optional `embedding: Option<Vec<f32>>` field
- Implement embedding serialization/deserialization (f32 ↔ bytes)
- Add `with_embedding()` builder method to Message

### Embedding Service
- Create `EmbeddingService` with platform detection
- Support for MLX on macOS (via `RETROCHAT_USE_MLX` env var)
- Deterministic dummy 768-dimensional embeddings for development
- L2-normalized vectors for consistency
- Graceful warnings on unsupported platforms (Windows/Linux)
- Full test coverage (4/4 tests passing)

### CLI Enhancement
- Add `--use-embedding` flag to search commands
- Available in both `retrochat search` and `query search`
- Sets `search_type = "embedding"` for service layer routing

### Dependencies
- Add `sqlite-vec` v0.1.6 for vector similarity search
- Add `mlx-rs` v0.25 (macOS-only, optional) for future ML integration
- Create `mlx` feature flag for conditional compilation

## Implementation Details

**Database Layer** (`src/database/message_repo.rs`):
- `embedding_to_blob()` - Convert f32 vectors to bytes
- `blob_to_embedding()` - Convert bytes back to f32 vectors
- Updated all INSERT/SELECT queries to handle embedding column

**Service Layer** (`src/services/embedding_service.rs`):
- Platform-aware initialization with MLX support detection
- Deterministic hash-based dummy embeddings (768 dims)
- Ready for actual MLX model integration

**Environment** (`src/env.rs`):
- `RETROCHAT_USE_MLX` - Enable MLX embeddings (macOS only)
- Proper documentation for platform requirements

## Migration
- `008_add_message_embeddings.sql` - Adds nullable embedding column
- Backwards compatible (existing data works without embeddings)

## Usage

```bash
# Enable MLX embeddings on macOS
export RETROCHAT_USE_MLX=true

# Use embedding-based search
retrochat search "machine learning" --use-embedding

# With time range
retrochat search "performance" --use-embedding --since "7 days ago"
```

## Testing
- ✅ Embedding service tests passing (4/4)
- ✅ CLI command structure tests updated
- ✅ Platform detection tests
- ⚠️ Search integration tests require migration run

## Next Steps
- Implement vector similarity search using sqlite-vec
- Add actual MLX model inference for embedding generation
- Route search requests to vector search when `use_embedding` is true
- Integrate embedding generation into message import flow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Resolved conflicts in:
- Cargo.toml: Combined both sets of dependencies (sqlite-vec, regex, lazy_static, mlx-rs)
- Cargo.lock: Regenerated after dependency merge
- src/database/message_repo.rs: Merged to include both message_type/tool_operation_id from main and embedding from feature branch

All SQL queries now include the full set of fields:
- message_type and tool_operation_id (from main)
- embedding (from feature/embedding)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…grations

After merging main, we had two migration files numbered 008:
- 008_add_tool_operations.sql (from main)
- 008_add_message_embeddings.sql (from feature/embedding)

Renamed our embedding migration to 011_add_message_embeddings.sql to maintain proper migration sequence.

All tests now pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Clippy fixes:
- Replace manual `% 4 != 0` with `.is_multiple_of(4)` in blob_to_embedding
- Box Message fields in MessageGroup::ToolPair to reduce enum size (416 bytes → smaller)

SQL query fixes:
- Add `embedding` column to all SELECT queries in message_repo.rs:
  - search_content_with_filters
  - search_content_with_time_filters
  - get_by_time_range

This ensures all queries return the complete Message structure including the new embedding field from migration 011.

All CI checks now pass: formatting, clippy, and tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Add foundation for embedding-based semantic search with comprehensive infrastructure for 768-dimensional vectors, platform-aware MLX support, and CLI integration.
Key Features
🔢 Embedding Infrastructure
- `embedding` BLOB column to messages table (768 dimensions)
- `Option<Vec<f32>>` embedding field
- `with_embedding()` helper method
🤖 Embedding Service (`src/services/embedding_service.rs`)
- `RETROCHAT_USE_MLX` env var
🔍 CLI Integration
- `--use-embedding` flag to search commands:
  - `retrochat search <query> --use-embedding`
  - `retrochat query search <query> --use-embedding`
- `search_type = "embedding"` for service routing
📦 Dependencies
- `mlx` feature flag for conditional compilation
Implementation Details
Database Layer
File: `src/database/message_repo.rs`
- `embedding_to_blob()`: Convert Vec<f32> → bytes for storage
- `blob_to_embedding()`: Convert bytes → Vec<f32> for retrieval
Service Layer
File: `src/services/embedding_service.rs`
Features:
Environment Configuration
File: `src/env.rs`
Migration
File: `migrations/008_add_message_embeddings.sql`
Usage Examples
Testing
Passing Tests ✅
- `test_dummy_embedding_generation` - 768-dim vectors generated correctly
- `test_embedding_deterministic` - Same input → same output
- `test_embedding_different_text` - Different inputs → different outputs
- `test_platform_support_check` - Platform detection works
CLI Tests Updated ✅
- `test_search_command_structure` - Includes `use_embedding` field
- `test_search_command_with_time_range` - Time + embedding flags
- `test_search_command_with_embedding` - New test for embedding flag
Known Issues ⚠️
Technical Notes
Embedding Format
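As a rough illustration of the f32 ↔ bytes conversion, here is a minimal sketch of the two helpers. The names `embedding_to_blob()` and `blob_to_embedding()` come from this PR; the bodies, the little-endian byte order, and the `Option` return type are assumptions of this sketch and may differ from the code in `src/database/message_repo.rs`.

```rust
/// Serialize an embedding as consecutive 4-byte f32 values.
/// Little-endian byte order is an assumption of this sketch.
fn embedding_to_blob(embedding: &[f32]) -> Vec<u8> {
    embedding.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Deserialize a blob back into f32 values.
/// Returns None when the length is not a multiple of 4 bytes
/// (hypothetical error handling; written with `%` for toolchain portability).
fn blob_to_embedding(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

With this layout, a 768-dimensional vector occupies 768 × 4 = 3072 bytes in the BLOB column, and a round trip through both helpers returns the original values bit for bit.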
Platform Support
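The platform check could look roughly like the sketch below: MLX is used only when the build targets macOS and `RETROCHAT_USE_MLX` is set. The function names and the accepted values (`true` or `1`, case-insensitive) are hypothetical; only the env var name and the macOS-only behavior with a warning elsewhere come from this PR.

```rust
use std::env;

/// True when this binary was compiled for macOS.
fn is_mlx_platform() -> bool {
    cfg!(target_os = "macos")
}

/// Pure decision function (hypothetical helper) so the logic is testable
/// without touching the real environment.
fn should_use_mlx(on_macos: bool, env_value: Option<&str>) -> bool {
    // Accepted values are an assumption of this sketch.
    let requested = matches!(
        env_value.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("1") | Some("true")
    );
    if requested && !on_macos {
        // Warn instead of failing on unsupported platforms.
        eprintln!("warning: RETROCHAT_USE_MLX is set, but MLX is only supported on macOS");
    }
    requested && on_macos
}

/// Reads the real environment and platform.
fn mlx_enabled() -> bool {
    let value = env::var("RETROCHAT_USE_MLX").ok();
    should_use_mlx(is_mlx_platform(), value.as_deref())
}
```

Keeping the decision in a pure function lets tests cover the Windows/Linux path on any host, which would be harder if the logic read `cfg!` and the environment directly.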
Dummy Embedding Algorithm
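One way to produce deterministic, hash-based, L2-normalized 768-dimensional vectors is sketched below. The choice of FNV-1a for seeding and SplitMix64 for expanding the seed is an assumption made for illustration, as is the function name `dummy_embedding`; the actual algorithm in `EmbeddingService` may differ.

```rust
const EMBEDDING_DIM: usize = 768;

/// FNV-1a 64-bit hash of the input text (seed for the generator).
fn fnv1a(text: &str) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in text.bytes() {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// One SplitMix64 step: advances the state and returns a mixed 64-bit value.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E3779B97F4A7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
    z ^ (z >> 31)
}

/// Hypothetical dummy embedding: same text always yields the same unit vector.
fn dummy_embedding(text: &str) -> Vec<f32> {
    let mut state = fnv1a(text);
    let mut v: Vec<f32> = (0..EMBEDDING_DIM)
        .map(|_| {
            // Take the top 24 bits, scale to [0, 1), then shift to [-1, 1).
            let bits = (splitmix64(&mut state) >> 40) as f32;
            bits / (1u32 << 24) as f32 * 2.0 - 1.0
        })
        .collect();
    // L2-normalize so every embedding has unit length.
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}
```

Because the vectors have unit length, the dot product of two embeddings equals their cosine similarity, which keeps later similarity comparisons simple. These vectors carry no semantic meaning and are only useful for development and testing.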
Architecture
Next Steps (Future PRs)
Vector Search Implementation
MLX Integration
Embedding Generation Flow
Search Enhancement
Files Changed
New Files
- `migrations/008_add_message_embeddings.sql` - Database migration
- `src/services/embedding_service.rs` - Embedding generation service
Modified Files
- `Cargo.toml` - Add dependencies (sqlite-vec, mlx-rs)
- `Cargo.lock` - Dependency lock file
- `src/cli/mod.rs` - Add --use-embedding flag
- `src/cli/query.rs` - Update search handler
- `src/database/message_repo.rs` - Embedding storage/retrieval
- `src/env.rs` - Add RETROCHAT_USE_MLX
- `src/models/message.rs` - Add embedding field
- `src/services/mod.rs` - Export EmbeddingService
- `tests/contract/test_cli_add_command.rs` - Update CLI tests
Breaking Changes
None - all changes are additive and backwards compatible.
🤖 Generated with Claude Code