Implement vector search with dual query interfaces#14
Implement vector search with dual query interfaces#14
Conversation
- Implemented SearchService with support for Pinecone, Weaviate, and Chroma - Added searchByVector and searchByText methods (both ways to find items) - Created comprehensive tests for all search functionality - Integrated search service into main application - All tests passing Co-authored-by: lippytm <65956507+lippytm@users.noreply.github.com>
- Added Vector Search section to README - Documented both search methods (vector and text) - Updated project structure to include search module - Added usage examples Co-authored-by: lippytm <65956507+lippytm@users.noreply.github.com>
- Added detailed documentation for embedding model options - Enhanced tests to verify service usability after initialization - All tests passing Co-authored-by: lippytm <65956507+lippytm@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds a new SearchService abstraction intended to unify vector database search across Pinecone/Weaviate/Chroma, with two query entry points (vector and text) and app-level lifecycle integration.
Changes:
- Introduces
src/search/index.tswithVectorDatabaseinterface, adapters, a factory, andSearchService. - Hooks search initialization/teardown into
src/index.tsbased onconfig.vectorDb.enabled. - Adds documentation and a new test suite for the search module.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| test/search.test.js | Adds tests for SearchService and the adapter factory. |
| src/search/index.ts | Implements SearchService, adapter stubs, and a database factory. |
| src/index.ts | Conditionally initializes/closes SearchService in main + signal handlers. |
| README.md | Documents vector search configuration and usage; updates project structure. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| */ | ||
| async close(): Promise<void> { | ||
| if (this.database) { | ||
| await this.database.close(); |
There was a problem hiding this comment.
SearchService.close() closes the underlying adapter but leaves this.database set. Because adapters auto-initialize in search(), calling searchByVector/searchByText after close() will silently re-open and proceed, which is surprising for a closed service. Consider clearing this.database (and/or tracking an explicit initialized/closed state) so searches after close reliably fail until initialize() is called again.
| await this.database.close(); | |
| await this.database.close(); | |
| this.database = undefined; |
| throw new Error('Search service not initialized'); | ||
| } | ||
|
|
||
| logger.info('Searching by text', { text, topK }); |
There was a problem hiding this comment.
searchByText logs the raw query text at info level. Search queries commonly include user-provided content and may contain sensitive data; logging the full text can create unnecessary data exposure. Consider logging only non-sensitive metadata (e.g., topK, text length) and/or lowering this to debug with redaction.
| logger.info('Searching by text', { text, topK }); | |
| logger.info('Searching by text', { topK, textLength: text.length }); | |
| logger.debug('Searching by text (redacted)', { | |
| topK, | |
| textLength: text.length, | |
| textPreview: text.length > 32 ? text.slice(0, 32) + '...' : text, | |
| }); |
| export interface SearchQuery { | ||
| vector?: number[]; | ||
| text?: string; | ||
| topK?: number; | ||
| filter?: Record<string, unknown>; | ||
| } |
There was a problem hiding this comment.
SearchQuery allows both vector and text to be omitted (or both to be provided), but VectorDatabase.search() doesn’t enforce any invariants. That makes it easy for callers (or future adapter implementations) to accept invalid/ambiguous queries. Consider validating that exactly one of vector/text is provided (and returning a clear error) at the SearchService layer and/or in adapter implementations.
| ## Vector Search | ||
|
|
||
| ### Search Functionality | ||
|
|
||
| The project provides a unified search interface across multiple vector database backends. You can find items using **both ways**: | ||
|
|
||
| 1. **Vector-based search**: Search using embedding vectors | ||
| 2. **Text-based search**: Search using natural language text queries | ||
|
|
||
| **Supported Vector Databases:** | ||
|
|
||
| - **Pinecone**: Cloud-native vector database | ||
| - **Weaviate**: On-premise or cloud vector database | ||
| - **Chroma**: Lightweight local/embedded vector database | ||
|
|
||
| **Enable Vector Search:** | ||
|
|
||
| ```bash | ||
| # Set in .env file | ||
| VECTOR_DB_ENABLED=true | ||
| VECTOR_DB_TYPE=chroma # Options: pinecone, weaviate, chroma | ||
| VECTOR_DB_ENDPOINT=http://localhost:8000 # Optional - database endpoint | ||
| VECTOR_DB_API_KEY=your-api-key # Optional - for cloud services | ||
| ``` | ||
|
|
||
| **Usage Example:** | ||
|
|
||
| ```typescript | ||
| import { SearchService } from './search'; | ||
|
|
||
| // Initialize the search service | ||
| const searchService = new SearchService({ | ||
| enabled: true, | ||
| type: 'chroma', | ||
| endpoint: 'http://localhost:8000', | ||
| }); | ||
| await searchService.initialize(); | ||
|
|
||
| // Search by vector (first way) | ||
| const vectorResults = await searchService.searchByVector([0.5, 0.5, 0.5], 10); | ||
|
|
||
| // Search by text (second way) | ||
| const textResults = await searchService.searchByText('find this item', 10); | ||
|
|
||
| // Clean up | ||
| await searchService.close(); | ||
| ``` |
There was a problem hiding this comment.
The README describes vector search as a feature with a working usage example, but the current adapters are stubs that always return an empty result set (return []). Consider explicitly documenting that the database integrations are placeholders/TODOs and that searches currently return no results until a real client implementation is added, to avoid misleading users.
| process.on('SIGTERM', async () => { | ||
| logger.info('SIGTERM received, shutting down gracefully...'); | ||
| if (searchService) { | ||
| await searchService.close(); | ||
| } | ||
| await shutdownTelemetry(); | ||
| process.exit(0); | ||
| }); | ||
|
|
||
| process.on('SIGINT', async () => { | ||
| logger.info('SIGINT received, shutting down gracefully...'); | ||
| if (searchService) { | ||
| await searchService.close(); | ||
| } | ||
| await shutdownTelemetry(); | ||
| process.exit(0); |
There was a problem hiding this comment.
The SIGTERM/SIGINT handlers are async, but any rejection from searchService.close() or shutdownTelemetry() will become an unhandled promise rejection (event listeners aren’t awaited). Consider wrapping the shutdown sequence in a try/catch (and possibly finally) to ensure errors are logged and the process exits cleanly.
| WeaviateAdapter, | ||
| ChromaAdapter, | ||
| createVectorDatabase, | ||
| } from '../dist/search/index.js'; |
There was a problem hiding this comment.
This test imports the built artifact from ../dist/search/index.js, but npm test runs node --test without building first (and dist/ is excluded from the repo). This will fail in a clean checkout. Consider either adding a build step before tests (e.g., pretest/test script) or adjusting the tests to run against compiled output in a deterministic way.
| } from '../dist/search/index.js'; | |
| } from '../src/search/index.js'; |
Configuration existed for vector databases (Pinecone, Weaviate, Chroma) but no search implementation. Added SearchService supporting both vector-based and text-based queries.
Changes
Core Implementation
SearchServicewith factory pattern for database adapterssearchByVector(vector, topK)andsearchByText(text, topK)Database Adapters
PineconeAdapter,WeaviateAdapter,ChromaAdapterimplementing commonVectorDatabaseinterfaceIntegration
VECTOR_DB_ENABLED=trueUsage
Notes
Text-to-vector conversion requires embedding model integration (HuggingFace Transformers, OpenAI, Cohere). Documentation added for common options and dimension requirements.
Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.