Merged
54 changes: 28 additions & 26 deletions poetry.lock


2 changes: 1 addition & 1 deletion pyproject.toml
@@ -119,7 +119,7 @@ all = [
     # We kindof don't want users to install them.
     "torch (>=2.7.1,<3.0.0)",
     "sentence-transformers (>=4.1.0,<5.0.0)",
-    "qdrant-client (>=1.14.2,<2.0.0)",
+    "qdrant-client (>=1.16.0,<2.0.0)",
     "volcengine-python-sdk (>=4.0.4,<5.0.0)",
     "nltk (>=3.9.1,<4.0.0)",
     "rake-nltk (>=1.0.6,<1.1.0)",
25 changes: 19 additions & 6 deletions src/memos/chunkers/sentence_chunker.py
Collaborator

I don't think we can change the package version directly, because our production environment uses chonkie >=1.0.7,<1.4.0. Changing the code together with the version requirement would break the existing environment immediately and cause errors in the production system.
My suggestion is to add version compatibility: check the chonkie version first, then pass the appropriate parameters.
This makes the code somewhat redundant, but the system needs a transitional version.
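The version check suggested in this comment could look roughly like the sketch below. It is only an illustration: the helper names (`parse_release`, `tokenizer_kwarg_name`, `chunker_kwargs`) are hypothetical, and the 1.4.0 cutoff for the keyword rename is taken from this thread rather than verified against the chonkie changelog.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

# Assumption from this thread: the keyword rename takes effect at chonkie 1.4.0.
NEW_API_MIN = (1, 4, 0)


def parse_release(version_string: str) -> tuple[int, ...]:
    """Return the leading numeric components, padded to three parts."""
    parts: list[int] = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def tokenizer_kwarg_name(chonkie_version: str) -> str:
    """Pick the tokenizer keyword that matches the given chonkie version."""
    if parse_release(chonkie_version) >= NEW_API_MIN:
        return "tokenizer"
    return "tokenizer_or_token_counter"


def installed_chonkie_version() -> str | None:
    """Return the installed chonkie version, or None if it is not installed."""
    try:
        return version("chonkie")
    except PackageNotFoundError:
        return None


def chunker_kwargs(chonkie_version, tokenizer, chunk_size, chunk_overlap,
                   min_sentences_per_chunk) -> dict:
    """Build constructor keyword arguments for the detected API generation."""
    return {
        tokenizer_kwarg_name(chonkie_version): tokenizer,
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
        "min_sentences_per_chunk": min_sentences_per_chunk,
    }
```

In the chunker's `__init__`, the result would be unpacked into the constructor, for example `ChonkieSentenceChunker(**chunker_kwargs(installed_chonkie_version(), ...))`. An explicit check keeps the choice visible in one place, at the cost of hard-coding a version boundary.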

Contributor Author

Modified as suggested, using try/except for version compatibility:
Try the new API (v1.4.0+) first and fall back to the legacy API on failure.
Reverted the chonkie version in pyproject.toml to >=1.0.7,<2.0.0 for production compatibility.
Code pushed, please review 🙏
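The try-then-fall-back pattern described in this reply can be sketched in isolation as follows. `LegacyChunker`, `NewChunker`, and `build_chunker` are hypothetical stand-ins written for this illustration; they are not chonkie classes, and the real constructor takes more parameters.

```python
class LegacyChunker:
    """Stand-in for an older release that only accepts the old keyword."""

    def __init__(self, *, tokenizer_or_token_counter, chunk_size):
        self.tokenizer = tokenizer_or_token_counter
        self.chunk_size = chunk_size
        self.keyword_used = "tokenizer_or_token_counter"


class NewChunker:
    """Stand-in for a newer release that only accepts the new keyword."""

    def __init__(self, *, tokenizer, chunk_size):
        self.tokenizer = tokenizer
        self.chunk_size = chunk_size
        self.keyword_used = "tokenizer"


def build_chunker(chunker_cls, tokenizer, chunk_size):
    """Try the new keyword first; fall back to the old one on TypeError."""
    try:
        return chunker_cls(tokenizer=tokenizer, chunk_size=chunk_size)
    except TypeError:
        # An unexpected keyword argument raises TypeError, so retry with
        # the legacy parameter name.
        return chunker_cls(
            tokenizer_or_token_counter=tokenizer, chunk_size=chunk_size
        )


old_style = build_chunker(LegacyChunker, "gpt2", 512)
new_style = build_chunker(NewChunker, "gpt2", 512)
```

One trade-off of this approach: any `TypeError` raised inside the constructor for an unrelated reason also triggers the fallback, so the original error can be masked unless it is logged.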

Collaborator

Nice, but the code check reports an error.
Just run poetry lock to fix the lock file before committing the code.

Contributor Author

done~

@@ -20,12 +20,25 @@ def __init__(self, config: SentenceChunkerConfig):
         from chonkie import SentenceChunker as ChonkieSentenceChunker

         self.config = config
-        self.chunker = ChonkieSentenceChunker(
-            tokenizer_or_token_counter=config.tokenizer_or_token_counter,
-            chunk_size=config.chunk_size,
-            chunk_overlap=config.chunk_overlap,
-            min_sentences_per_chunk=config.min_sentences_per_chunk,
-        )
+
+        # Try new API first (v1.4.0+)
+        try:
+            self.chunker = ChonkieSentenceChunker(
+                tokenizer=config.tokenizer_or_token_counter,
+                chunk_size=config.chunk_size,
+                chunk_overlap=config.chunk_overlap,
+                min_sentences_per_chunk=config.min_sentences_per_chunk,
+            )
+        except (TypeError, AttributeError) as e:
+            # Fallback to old API (<v1.4.0)
+            logger.debug(f"Falling back to old chonkie API: {e}")
+            self.chunker = ChonkieSentenceChunker(
+                tokenizer_or_token_counter=config.tokenizer_or_token_counter,
+                chunk_size=config.chunk_size,
+                chunk_overlap=config.chunk_overlap,
+                min_sentences_per_chunk=config.min_sentences_per_chunk,
+            )
+
         logger.info(f"Initialized SentenceChunker with config: {config}")

     def chunk(self, text: str) -> list[str] | list[Chunk]:
4 changes: 2 additions & 2 deletions src/memos/mem_os/core.py
@@ -2,7 +2,7 @@
 import os
 import time

-from datetime import datetime
+from datetime import datetime, timezone
 from pathlib import Path
 from threading import Lock
 from typing import Any, Literal
@@ -192,7 +192,7 @@ def _register_chat_history(
         self.chat_history_manager[user_id] = ChatHistory(
             user_id=user_id if user_id is not None else self.user_id,
             session_id=session_id if session_id is not None else self.session_id,
-            created_at=datetime.utcnow(),
+            created_at=datetime.now(timezone.utc),
             total_messages=0,
             chat_history=[],
         )
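The core.py change replaces `datetime.utcnow()`, which returns a naive datetime and has been deprecated since Python 3.12, with `datetime.now(timezone.utc)`, which returns a timezone-aware one. The short sketch below shows the practical difference; the helper names are hypothetical, and the naive value is produced by stripping `tzinfo` so that the deprecated call is not needed.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def strip_tz(moment: datetime) -> datetime:
    """Drop tzinfo, giving the kind of naive value utcnow() used to return."""
    return moment.replace(tzinfo=None)


def is_orderable(a: datetime, b: datetime) -> bool:
    """Report whether two datetimes can be ordered without a TypeError."""
    try:
        a < b
    except TypeError:
        return False
    return True


aware = utc_now()
naive = strip_tz(aware)

# Aware values carry their offset; naive ones cannot be ordered against them.
print(aware.tzinfo, naive.tzinfo)
print(is_orderable(aware, utc_now()), is_orderable(naive, aware))
```

Any stored `created_at` values written before this change would be naive, so code that compares old and new timestamps may need to normalize them first.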
6 changes: 3 additions & 3 deletions src/memos/vec_dbs/qdrant.py
@@ -138,14 +138,14 @@ def search(
             List of search results with distance scores and payloads.
         """
         qdrant_filter = self._dict_to_filter(filter) if filter else None
-        response = self.client.search(
+        response = self.client.query_points(
             collection_name=self.config.collection_name,
-            query_vector=query_vector,
+            query=query_vector,
             limit=top_k,
             query_filter=qdrant_filter,
             with_vectors=True,
             with_payload=True,
-        )
+        ).points
         logger.info(f"Qdrant search completed with {len(response)} results.")
         return [
             VecDBItem(
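The qdrant.py diff shows two differences between the calls: `search` takes `query_vector` and returns the hits directly, while `query_points` takes `query` and returns a response object whose `.points` attribute holds the hits. The sketch below illustrates that shape difference with a small dispatch helper. `search_points`, `FakeOldClient`, and `FakeNewClient` are hypothetical stand-ins for illustration only; the PR itself switches to `query_points` outright and does not include such a shim.

```python
from types import SimpleNamespace


def search_points(client, collection_name, vector, top_k):
    """Return a list of hits from either style of client."""
    if hasattr(client, "query_points"):
        response = client.query_points(
            collection_name=collection_name,
            query=vector,
            limit=top_k,
            with_payload=True,
            with_vectors=True,
        )
        # The newer call wraps the hits in a response object.
        return list(response.points)
    # The older call returns the hits directly.
    return list(
        client.search(
            collection_name=collection_name,
            query_vector=vector,
            limit=top_k,
            with_payload=True,
            with_vectors=True,
        )
    )


def _fake_hits(limit):
    """Build placeholder hits with an id and a made-up score."""
    return [SimpleNamespace(id=i, score=1.0 - 0.1 * i) for i in range(limit)]


class FakeOldClient:
    """Stand-in exposing only the older search() call."""

    def search(self, collection_name, query_vector, limit, **kwargs):
        return _fake_hits(limit)


class FakeNewClient:
    """Stand-in exposing only the newer query_points() call."""

    def query_points(self, collection_name, query, limit, **kwargs):
        return SimpleNamespace(points=_fake_hits(limit))


old_hits = search_points(FakeOldClient(), "memories", [0.1, 0.2], 3)
new_hits = search_points(FakeNewClient(), "memories", [0.1, 0.2], 3)
```

Because the diff appends `.points` to the call, the `len(response)` in the following log line keeps working on a plain list.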