Improve chunking for /rerank when history is too long #149

@lpiwowar

Description

This is fine for now, I think :). In my opinion, the better place for this would be here [1], for two reasons:

  • We can calculate how to chunk the results from the vector DB so that they fit the context of the rerank model. If the chunks turned out too small, we could then proceed to chunking the history as well (see the sketch below).

  • We can send a bigger chunk to the embedding model. With this change, we are truncating the input for the embedding model as well, which might not always be necessary: the rerank model has to accommodate both the vector database results and the history, whereas the embedding model only has to accommodate the history. But this one is up for discussion, I guess :); I see some downsides with it as well.

Anyway, this is just me thinking aloud. We can improve this later; this change fixes the issue for now.

[1]

max_chunk_size = config.reranking_model_max_context // 2

Originally posted by @lpiwowar in #143 (comment)
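
For illustration, here is a minimal sketch of that two-step idea in Python, assuming character counts as a rough proxy for tokens. Only `config.reranking_model_max_context` comes from the code referenced in [1]; `chunk_for_rerank`, `min_result_chunk`, and the other names are hypothetical, not the project's actual API.

```python
from collections.abc import Iterator


def chunk_for_rerank(
    history: str,
    db_results: list[str],
    max_context: int,
    min_result_chunk: int = 256,
) -> Iterator[tuple[str, str]]:
    """Yield (history_chunk, result_chunk) pairs sized for the rerank model.

    First try to keep the history whole and split only the vector-db results.
    If the remaining budget per result chunk would be too small, fall back to
    chunking the history as well.
    """
    budget = max_context - len(history)
    if budget >= min_result_chunk:
        # Enough room: keep the full history and split only the results.
        histories = [history]
    else:
        # Fall back to chunking the history, reusing the "// 2" heuristic
        # from [1] as the history chunk size.
        half = max(max_context // 2, 1)
        histories = [
            history[i:i + half] for i in range(0, len(history), half)
        ] or [history]
        budget = max_context - max(len(h) for h in histories)

    # Split each vector-db result into pieces that fit the leftover budget.
    step = max(budget, 1)
    for doc in db_results:
        for h in histories:
            for start in range(0, len(doc), step):
                yield h, doc[start:start + step]
```

A caller would pass `config.reranking_model_max_context` as `max_context` and feed each yielded pair to the rerank model; a real implementation would measure lengths with the model's tokenizer instead of character counts.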
