Adds local api for vectorized payload #17
This PR is being reviewed by Cursor Bugbot
    })()

    // Perform cosine similarity search using Drizzle
    return await performCosineSearch(payload, queryEmbedding, knowledgePool, limit, where)
Missing knowledge pool validation in direct search function
Medium Severity
The new vectorSearch function doesn't validate that knowledgePool exists in knowledgePools before accessing poolConfig.embedQuery(). When calling payload.search() with an invalid pool name, poolConfig will be undefined, causing a cryptic TypeError: Cannot read properties of undefined (reading 'embedQuery') instead of a clear error message. The requestHandler has this validation (checking if (!poolConfig) and returning a proper error), but the direct function exposed via payload.search() lacks it.
🔬 Verification Test
Why verification test was not possible: This requires a running Payload CMS instance with PostgreSQL and pgvector to test. The bug can be verified by code inspection - comparing lines 70-76 in requestHandler which validates poolConfig exists before proceeding, versus lines 39-42 in vectorSearch which directly accesses poolConfig.embedQuery(query) without any null check. An invalid pool name would cause knowledgePools[knowledgePool] to return undefined, and then calling .embedQuery() on undefined would throw a TypeError.
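One way the missing check could look is sketched below. It is only an illustration: `PoolConfig`, `KnowledgePools`, and `resolvePoolConfig` are hypothetical names introduced here, and the real plugin types and error wording may differ. The idea is to look up the pool once and throw a descriptive error before anything touches `embedQuery`.

```typescript
// Hypothetical shapes for illustration; the plugin's actual types may differ.
type PoolConfig = { embedQuery: (query: string) => Promise<number[]> }
type KnowledgePools = Record<string, PoolConfig | undefined>

// Resolve a pool by name, failing with a clear message instead of a TypeError.
function resolvePoolConfig(knowledgePools: KnowledgePools, knowledgePool: string): PoolConfig {
  // Only accept keys defined on the config object itself, not inherited ones.
  const poolConfig = Object.prototype.hasOwnProperty.call(knowledgePools, knowledgePool)
    ? knowledgePools[knowledgePool]
    : undefined
  if (!poolConfig) {
    throw new Error(`Knowledge pool "${knowledgePool}" is not configured`)
  }
  return poolConfig
}
```

Under these assumptions, the direct search function would call `resolvePoolConfig(knowledgePools, knowledgePool)` before `embedQuery`, so an invalid pool name surfaces as a readable error on both the HTTP and local paths.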
    })()

    // Perform cosine similarity search using Drizzle
    return await performCosineSearch(payload, queryEmbedding, knowledgePool, limit, where)
Empty query not validated in direct search API
Low Severity
The new vectorSearch function doesn't validate that query is a non-empty string, unlike the requestHandler which explicitly rejects empty strings with !query || typeof query !== 'string'. When calling payload.search({ query: '', knowledgePool: 'default' }), the empty string passes to poolConfig.embedQuery(''), which may cause unnecessary API calls to the embedding service or unexpected behavior, depending on the implementation. The HTTP endpoint and direct API now behave inconsistently for empty query inputs.
🔬 Verification Test
Why verification test was not possible: This requires a running Payload CMS instance with a configured embedding function to test. The bug can be verified by code inspection - comparing line 60 in requestHandler which checks !query || typeof query !== 'string' and returns a 400 error for empty strings, versus the vectorSearch function (lines 32-47) which has no such validation and directly calls poolConfig.embedQuery(query) with whatever value is passed.
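A shared validator is one possible way to keep the two entry points consistent. The sketch below reuses the condition quoted from requestHandler (`!query || typeof query !== 'string'`); the helper name `requireQuery` and the error message are assumptions made for illustration, not part of the PR.

```typescript
// Hypothetical helper: applies the same condition the HTTP handler is described as using.
function requireQuery(query: unknown): string {
  if (!query || typeof query !== 'string') {
    // Rejects '', undefined, null, and any non-string value.
    throw new Error('query must be a non-empty string')
  }
  return query
}
```

If both requestHandler and the direct search function called such a helper, an empty query would be rejected before any call to the embedding service. The HTTP path could then map the thrown error to its existing 400 response.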
    import type { PostgresAdapterArgs } from '@payloadcms/db-postgres'
    import { createVectorizeTask } from './tasks/vectorize.js'
    import { createVectorSearchHandler } from './endpoints/vectorSearch.js'
    import { createVectorSearchHandlers } from './endpoints/vectorSearch.js'
Type guard function not exported from main entry point
High Severity
The new isVectorizedPayload type guard function is added to types.ts but is not re-exported from the main entry point. Line 21 uses export type * from './types.js' which only re-exports types, not runtime values like functions. While isPostgresPayload is explicitly imported as a value on line 14 and thus available, isVectorizedPayload is not imported, making it inaccessible via import { isVectorizedPayload } from 'payloadcms-vectorize' as documented in the README. Users following the documentation will get a runtime import error.
🔬 Verification Test
Why verification test was not possible: This can be verified by code inspection. The export type * syntax in TypeScript only re-exports type declarations, not runtime values. Since isVectorizedPayload is a function (runtime value) and is not explicitly imported and re-exported in index.ts, it will not be available when importing from 'payloadcms-vectorize'. The test file works around this by importing directly from ../../src/types.js rather than from the package entry point.
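To make the type-versus-value distinction concrete, here is a minimal sketch of what a guard of this kind might look like. The interface `VectorizedPayloadLike` and the function `isVectorizedPayloadSketch` are hypothetical stand-ins; the actual isVectorizedPayload in types.ts may check different properties.

```typescript
// Hypothetical shape of the extended Payload instance, for illustration only.
type SearchParams = { query: string; knowledgePool: string; limit?: number }
interface VectorizedPayloadLike {
  search: (params: SearchParams) => Promise<unknown[]>
  queueEmbed: (params: Record<string, unknown>) => Promise<void>
}

// A type guard is an ordinary function, so it must exist in the emitted JavaScript.
// A type-only re-export (export type * from './types.js') does not carry it;
// the entry point would need a value re-export, for example:
//   export { isVectorizedPayload } from './types.js'
function isVectorizedPayloadSketch(payload: unknown): payload is VectorizedPayloadLike {
  if (typeof payload !== 'object' || payload === null) return false
  const candidate = payload as Record<string, unknown>
  return typeof candidate.search === 'function' && typeof candidate.queueEmbed === 'function'
}
```

The interface is erased at compile time, while the function is not, which is why only the latter is affected by the type-only re-export described above.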
Allows for VectorizedPayload.search and VectorizedPayload.queueEmbed.
Note
Introduces a local API on the Payload instance for programmatic vector search and embedding queueing.
- Adds VectorizedPayload with payload.search(params) and payload.queueEmbed(params); exposes isVectorizedPayload type guard
- Adds createVectorSearchHandlers returning { vectorSearch, requestHandler } for reuse
- Uses onInit to attach local methods; reuses collection hook logic via an internal embed-queue map
- Updates README.md and CHANGELOG.md with usage docs and examples; bumps package.json to 0.4.5
- Tests: vectorizedPayload.spec.ts, vectorSearch.spec.ts, extensionFieldsVectorSearch.spec.ts, schemaName.spec.ts, plus shared expectations helper

Written by Cursor Bugbot for commit 243fef6.