Opensearch Vectordb changes #281

Open
manju956 wants to merge 16 commits into IBM:main from manju956:opensearch

Conversation

Contributor

manju956 commented Feb 3, 2026

  • Introduce OpenSearch as the vector database, replacing Milvus
  • Update the manifest files to replace references to Milvus with OpenSearch. Additionally, OpenSearch uses user authentication for database interactions
  • Add OpenSearch-relevant changes to the Python backend code

manju956 self-assigned this Feb 3, 2026
manju956 added the enhancement label Feb 3, 2026
manju956 changed the title from "Opensearch" to "Opensearch Vectordb changes" Feb 3, 2026
manju956 requested review from dharaneeshvrd, iv1111, mkumatag and yussufsh and removed request for dharaneeshvrd, iv1111 and mkumatag February 3, 2026 12:45
"description": "Post-processor for hybrid search using RRF",
"phase_results_processors": [
    {
        "normalization-processor": {
Member

It seems normalization-processor and RRF are different techniques.
You have used normalization-processor, but the id still says rrf.
For our use case, normalization-processor seems better suited,
but adding weights is critical.

Can you please revisit this block?
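
For reference, a weighted normalization-processor body could look roughly like the sketch below. The `min_max` and `arithmetic_mean` techniques, the description text and the 0.3/0.7 split are assumptions chosen for illustration, not values taken from this PR, and the helper name `build_hybrid_pipeline_body` is hypothetical. As far as I understand the OpenSearch documentation, the weights are matched to the sub-queries of the hybrid query by position and are expected to sum to 1.0.

```python
from typing import Dict, List


def build_hybrid_pipeline_body(weights: List[float]) -> Dict:
    """Build a search-pipeline body that normalizes and combines hybrid scores.

    `weights` needs one entry per sub-query of the hybrid query, in the same
    order, and the entries should sum to 1.0.
    """
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError(f"weights must sum to 1.0, got {sum(weights)}")
    return {
        "description": "Post-processor for hybrid search (score normalization)",
        "phase_results_processors": [
            {
                "normalization-processor": {
                    # Assumed techniques; other combinations are available.
                    "normalization": {"technique": "min_max"},
                    "combination": {
                        "technique": "arithmetic_mean",
                        "parameters": {"weights": weights},
                    },
                }
            }
        ],
    }


# Hypothetical semantic-heavy split: 0.3 for the keyword sub-query,
# 0.7 for the k-NN sub-query.
body = build_hybrid_pipeline_body([0.3, 0.7])
```

Keeping the weights as a parameter makes it easier to compare a few splits when running the accuracy tests.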

Contributor Author

sure

Contributor Author

Exploring semantic-heavy weights; will run tests to verify accuracy.

@Niharika0306
Contributor

A small comment: the db-status call needs to be handled for OpenSearch. Without this, the UI fails to display a response.

curl http://localhost:5001/db-status
{"message":"Empty value passed for a required argument 'index'.","ready":false}
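
The error suggests the status check reaches the client with an empty index name. One possible guard is sketched below; this is an assumption about the cause, not the fix that landed in the PR. The function name `db_status`, the `index_exists` callable and the message text are hypothetical.

```python
from typing import Callable, Dict


def db_status(index_name: str, index_exists: Callable[[str], bool]) -> Dict:
    """Report readiness without querying the store when no index name is set."""
    if not index_name:
        # Avoids passing an empty value for the required 'index' argument.
        return {"ready": False, "message": "No index has been created yet."}
    try:
        # With opensearch-py, index_exists could wrap client.indices.exists(index=...).
        return {"ready": bool(index_exists(index_name))}
    except Exception as exc:  # store unreachable or still initializing
        return {"ready": False, "message": str(exc)}
```

With a guard like this the endpoint always returns a well-formed payload, so the UI has something to render even before any index exists.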

Signed-off-by: manju956 <manjunath.ac956@gmail.com>
Contributor Author

manju956 commented Feb 6, 2026

A small comment: the db-status call needs to be handled for OpenSearch. Without this, the UI fails to display a response.

curl http://localhost:5001/db-status
{"message":"Empty value passed for a required argument 'index'.","ready":false}

With the latest commit in this PR, the issue is fixed.

Member

yussufsh left a comment

Follow the OpenSearch trademark.

Signed-off-by: manju956 <manjunath.ac956@gmail.com>
Member

dharaneeshvrd left a comment

Can you bump the rag version as well?

}

try:
    self.client.search_pipeline.delete(id="hybrid_rrf_pipeline")
Member

Why are we recreating the pipeline here?
Why can't we just check whether it already exists?
Also, don't use rrf in the id.
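
A check-then-create flow could be sketched as below. It assumes the client exposes `search_pipeline.get` and `search_pipeline.put` next to the `delete` call used in the diff, and that `get` raises a not-found exception for a missing id (with opensearch-py that would presumably be `opensearchpy.exceptions.NotFoundError`). The helper name, the `hybrid_search_pipeline` id and the in-memory fake client are hypothetical and only there so the sketch runs without a cluster.

```python
from typing import Any, Dict, Type


def ensure_search_pipeline(
    client: Any,
    pipeline_id: str,
    body: Dict,
    not_found_exc: Type[Exception],
) -> bool:
    """Create the pipeline only if it is missing. Returns True when created."""
    try:
        client.search_pipeline.get(id=pipeline_id)
        return False  # already present, leave it untouched
    except not_found_exc:
        client.search_pipeline.put(id=pipeline_id, body=body)
        return True


class _FakePipelines:
    """In-memory stand-in for the search_pipeline namespace."""

    def __init__(self) -> None:
        self.store: Dict[str, Dict] = {}

    def get(self, id: str) -> Dict:
        return {id: self.store[id]}  # raises KeyError when missing

    def put(self, id: str, body: Dict) -> None:
        self.store[id] = body


class _FakeClient:
    def __init__(self) -> None:
        self.search_pipeline = _FakePipelines()


client = _FakeClient()
demo_body = {"description": "demo"}
created_first = ensure_search_pipeline(client, "hybrid_search_pipeline", demo_body, KeyError)
created_second = ensure_search_pipeline(client, "hybrid_search_pipeline", demo_body, KeyError)
```

One trade-off to keep in mind: skipping creation when the id exists means a changed pipeline definition is not picked up automatically, so an explicit update path may still be needed.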

Contributor Author

I will check this

manju956 and others added 2 commits February 6, 2026 21:20
Signed-off-by: manju956 <manjunath.ac956@gmail.com>
from abc import ABC, abstractmethod
from typing import List, Dict, Optional

class VectorStore(ABC):
Member

Please add details to these methods as much as possible


class VectorStore(ABC):
    @abstractmethod
    def insert_chunks(self, emb_model: str, emb_endpoint: str, max_tokens: int, chunks: List[Dict], batch_size: int = 10):
Member

Can we take the embedding logic out of this class so that we can use this class more efficiently? Maybe pass in an embedder class with a method that contains all the embedding logic. I think this class should support two ways of searching/inserting: 1. pure embeddings, 2. text chunks (embedded via the embedder class).
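
To make the suggestion concrete, here is a minimal sketch of one possible split, not a prescription for the final design. The `Embedder` class, the `insert_embeddings` method and the `"text"` key on each chunk are hypothetical names introduced for illustration; only `VectorStore`, `insert_chunks` and `batch_size` come from the diff.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class Embedder(ABC):
    """Turns text into vectors; owns the model, endpoint and token limits."""

    @abstractmethod
    def embed(self, texts: List[str]) -> List[List[float]]:
        """Return one embedding per input text, in the same order."""


class VectorStore(ABC):
    """Storage backend that only deals with vectors and their metadata."""

    def __init__(self, embedder: Optional[Embedder] = None) -> None:
        self.embedder = embedder

    @abstractmethod
    def insert_embeddings(self, embeddings: List[List[float]], chunks: List[Dict]) -> None:
        """Way 1: store precomputed embeddings alongside their chunk metadata."""

    def insert_chunks(self, chunks: List[Dict], batch_size: int = 10) -> None:
        """Way 2: embed text chunks in batches, then store them.

        Each chunk is assumed to carry its text under the "text" key.
        """
        if self.embedder is None:
            raise ValueError("insert_chunks needs an Embedder; use insert_embeddings otherwise")
        for start in range(0, len(chunks), batch_size):
            batch = chunks[start:start + batch_size]
            vectors = self.embedder.embed([chunk["text"] for chunk in batch])
            self.insert_embeddings(vectors, batch)
```

With this shape, a concrete store only implements `insert_embeddings`, and the model name, endpoint and `max_tokens` arguments move into the embedder instead of every store method. Search could follow the same pattern.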


class VectorStoreNotReadyError():
    """Raised when the database is unreachable or initializing."""
    pass
No newline at end of file
Member

add newline in the end
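
One more note on this class: as written it has no base class, and Python only allows raising classes that derive from BaseException, so `raise VectorStoreNotReadyError()` would fail with a TypeError. A minimal sketch of the likely intent is below; the `check_ready` helper and its message are hypothetical and only show the exception in use.

```python
class VectorStoreNotReadyError(Exception):
    """Raised when the database is unreachable or initializing."""


def check_ready(ready: bool) -> None:
    # Hypothetical helper, only here to show the exception being raised.
    if not ready:
        raise VectorStoreNotReadyError("vector store is not ready")
```

The docstring alone is a valid class body, so the `pass` statement can be dropped.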

Labels

enhancement New feature or request

6 participants