@sanikolaev
Collaborator

No description provided.

sanikolaev and others added 3 commits January 29, 2026 23:05
Implement Qwen local embedding model + tokenizer sanitization
Fix attention/weight loading quirks for Qwen weights
Update embeddings lib version to 1.1.1
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ sanikolaev
❌ donhardman

@github-actions

github-actions bot commented Jan 29, 2026

Linux debug test results

  8 files    8 suites   13m 27s ⏱️
504 tests 482 ✅ 22 💤 0 ❌
518 runs  496 ✅ 22 💤 0 ❌

Results for commit 0a3b8d8.

♻️ This comment has been updated with latest results.

@github-actions

github-actions bot commented Jan 29, 2026

Windows test results

  5 files    5 suites   18m 6s ⏱️
485 tests 470 ✅ 15 💤 0 ❌
493 runs  478 ✅ 15 💤 0 ❌

Results for commit 0a3b8d8.

♻️ This comment has been updated with latest results.

@github-actions

github-actions bot commented Jan 29, 2026

Linux release test results

  8 files    8 suites   6m 29s ⏱️
504 tests 489 ✅ 15 💤 0 ❌
518 runs  503 ✅ 15 💤 0 ❌

Results for commit 0a3b8d8.

♻️ This comment has been updated with latest results.

sanikolaev and others added 3 commits January 30, 2026 00:23
- Remove QwenModel variant and custom implementation
- LocalModel now handles Qwen, Llama, Mistral, Gemma via auto-detection
- Add ModelArch enum for BERT vs causal architecture detection
- Implement CausalEmbeddingModel for supported architectures
- Consolidate embedding logic into unified LocalModel
- Remove redundant qwen.rs implementation
- Update create_model to route non-API models through LocalModel
- Add architecture detection and integration tests
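The auto-detection described in the commit notes above could look roughly like the sketch below. It is an illustration only, not the code from this PR: the `ModelArch` name comes from the commit message, while its variants, the `detect_arch` function, and the list of `model_type` strings are assumptions based on the `model_type` field that Hugging Face `config.json` files commonly carry.

```rust
/// Hypothetical mirror of the `ModelArch` enum named in the commit notes.
/// The two variants are assumed for illustration.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ModelArch {
    /// Encoder-style models (BERT family).
    Bert,
    /// Decoder-style (causal) models such as Qwen, Llama, Mistral, Gemma.
    Causal,
}

/// Map a `model_type` string to an architecture family.
/// Returns `None` when the type is not recognized, so the caller can
/// report an unsupported model instead of guessing.
pub fn detect_arch(model_type: &str) -> Option<ModelArch> {
    match model_type.to_ascii_lowercase().as_str() {
        "bert" => Some(ModelArch::Bert),
        // The exact set of accepted strings is an assumption.
        "qwen2" | "qwen3" | "llama" | "mistral" | "gemma" | "gemma2" | "gemma3" => {
            Some(ModelArch::Causal)
        }
        _ => None,
    }
}
```

Returning `Option` rather than defaulting to one family keeps an unknown `model_type` from being silently loaded with the wrong architecture.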
@donhardman donhardman self-requested a review January 30, 2026 18:13
Contributor

@donhardman donhardman left a comment

I reimplemented this properly. It should now support Llama, Qwen, Mistral, and BERT models. It is still worth checking together with Manticore, but the changes are covered by tests and implemented correctly now.

@github-actions

clt

❌ CLT tests in test/clt-tests/mcl/
✅ OK: 14
❌ Failed: 1
⏳ Duration: 522s
👉 Check Action Results for commit d3205e4

Failed tests:

🔧 Edit failed tests in UI:

test/clt-tests/mcl/auto-embeddings-qwen.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd --stopwait > /dev/null; stdbuf -oL searchd ${SEARCHD_ARGS:-} > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 'accepting connections' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Accepting connections!'; else echo 'Timeout or failed!'; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_qwen (title TEXT, vec FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME='Qwen/Qwen3-Embedding-0.6B' FROM='title')"; echo $?
––– output –––
- 0
+ ERROR 1064 (42000) at line 1: error adding table 'test_qwen': prealloc: Failed to create an instance of the model
+ 1
––– input –––
mysql -h0 -P9306 -E -e "SHOW CREATE TABLE test_qwen"
––– output –––
- *************************** 1. row ***************************
+ ERROR 1064 (42000) at line 1: You have an error in your query. Please, double-check it.
-        Table: test_qwen
- Create Table: CREATE TABLE test_qwen (
- id bigint,
- title text,
- vec float_vector knn_type='hnsw' hnsw_similarity='L2' model_name='Qwen/Qwen3-Embedding-0.6B' FROM='title'
- )
––– input –––
mysql -h0 -P9306 -e "insert into test_qwen(id, title) values(1, 'book'),(2, 'bread');"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as total_records FROM test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "select id, title, knn_dist() from test_qwen where knn(vec, 3, 'loaf')"
––– output –––
- +------+-------+------------+
+ ERROR 1064 (42000) at line 1: table test_qwen: requested KNN search attribute 'vec' not found
- | id   | title | knn_dist() |
- +------+-------+------------+
- |    2 | bread | #!/0\.111[0-9]*/!# |
- |    1 | book  | #!/0\.118[0-9]*/!# |
- +------+-------+------------+

- Fix weight prefix remapping for Qwen3-Embedding models
- Correct tensor dtype handling in embedding computation
- Enable tests to run instead of skipping on load failure
- Update candle dependencies to version 0.9.2
- Align hf-hub revision for compatibility
- Use manticoresoftware candle fork with clear_kv_cache()
- Explicitly clear cache to prevent stale state between inferences
- Add test_cache_path helper using CARGO_MANIFEST_DIR
- Replace hardcoded paths across all test cases
- Ensure consistent and portable cache directory handling
- Downgrade hf-hub to 0.3.2
- Downgrade dirs, dirs-sys, redox_users
- Align ureq HTTP client dependencies
- Add windows-sys 0.48.x targets
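The `test_cache_path` helper mentioned in the notes above is not shown in this conversation. A minimal sketch of the idea, assuming a cache directory under the crate root, might look like this. The `.cache/manticore` suffix and the fallback to the current directory are assumptions made for illustration.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical version of the `test_cache_path` helper named above.
/// Resolves the cache directory relative to the crate root so tests do
/// not depend on hardcoded absolute paths.
pub fn test_cache_path() -> PathBuf {
    // Cargo sets CARGO_MANIFEST_DIR when running `cargo test`;
    // fall back to the current directory when it is unset.
    let root = env::var("CARGO_MANIFEST_DIR").unwrap_or_else(|_| ".".to_string());
    // The directory name below is an assumption, not taken from the PR.
    PathBuf::from(root).join(".cache").join("manticore")
}
```

Inside a crate built by Cargo, the compile-time form `env!("CARGO_MANIFEST_DIR")` is an alternative to the runtime lookup used here.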
@github-actions

clt

❌ CLT tests in test/clt-tests/mcl/
✅ OK: 14
❌ Failed: 1
⏳ Duration: 498s
👉 Check Action Results for commit cf2a147

Failed tests:

🔧 Edit failed tests in UI:

test/clt-tests/mcl/auto-embeddings-qwen.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd --stopwait > /dev/null; stdbuf -oL searchd ${SEARCHD_ARGS:-} > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 'accepting connections' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Accepting connections!'; else echo 'Timeout or failed!'; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_qwen (title TEXT, vec FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME='Qwen/Qwen3-Embedding-0.6B' FROM='title')"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -E -e "SHOW CREATE TABLE test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "insert into test_qwen(id, title) values(1, 'book'),(2, 'bread');"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as total_records FROM test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "select id, title, knn_dist() from test_qwen where knn(vec, 3, 'loaf')"
––– output –––
+------+-------+------------+
| id   | title | knn_dist() |
+------+-------+------------+
- |    2 | bread | #!/0\.111[0-9]*/!# |
+ |    2 | bread | 0.31665143 |
- |    1 | book  | #!/0\.118[0-9]*/!# |
+ |    1 | book  | 0.50007039 |
+------+-------+------------+

- Expand CausalEmbeddingKind enum for Qwen, Llama, Mistral, Gemma
- Extend model type detection for gemma2 and gemma3 variants
- Support both torch_dtype and dtype config fields for tensor type
- Add integration tests for TinyLlama, TinyMistral, and Gemma models
- Add integration tests for embedding models
- Cover loading and initialization paths
- Test encoding functionality with various inputs
- Verify output consistency and format
…fork

- Pin candle-core, candle-nn, candle-transformers to specific git revision
- Move test helper functions to dedicated test module
- Minor loop variable refactor for clarity
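The note above about accepting both `torch_dtype` and `dtype` can be pictured with the sketch below. It is not the PR's implementation: the `TensorDtype` enum, the `resolve_dtype` function, the choice to check `torch_dtype` first, and the `F32` default are all assumptions, and the config is modeled as a plain string map instead of parsed JSON to keep the example dependency-free.

```rust
use std::collections::HashMap;

/// Hypothetical tensor type selector, for illustration only.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum TensorDtype {
    F32,
    F16,
    BF16,
}

/// Pick the tensor type from a model config that may use either the
/// `torch_dtype` key or the `dtype` key.
pub fn resolve_dtype(config: &HashMap<String, String>) -> TensorDtype {
    // Assumed precedence: `torch_dtype` wins when both keys are present.
    let raw = config.get("torch_dtype").or_else(|| config.get("dtype"));
    match raw.map(|s| s.as_str()) {
        Some("float16") => TensorDtype::F16,
        Some("bfloat16") => TensorDtype::BF16,
        // Missing or unrecognized values fall back to F32 (an assumption).
        _ => TensorDtype::F32,
    }
}
```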
@github-actions

clt

❌ CLT tests in test/clt-tests/mcl/
✅ OK: 14
❌ Failed: 1
⏳ Duration: 512s
👉 Check Action Results for commit 448837c

Failed tests:

🔧 Edit failed tests in UI:

test/clt-tests/mcl/auto-embeddings-qwen.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd --stopwait > /dev/null; stdbuf -oL searchd ${SEARCHD_ARGS:-} > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 'accepting connections' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Accepting connections!'; else echo 'Timeout or failed!'; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_qwen (title TEXT, vec FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME='Qwen/Qwen3-Embedding-0.6B' FROM='title')"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -E -e "SHOW CREATE TABLE test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "insert into test_qwen(id, title) values(1, 'book'),(2, 'bread');"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as total_records FROM test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "select id, title, knn_dist() from test_qwen where knn(vec, 3, 'loaf')"
––– output –––
+------+-------+------------+
| id   | title | knn_dist() |
+------+-------+------------+
- |    2 | bread | #!/0\.111[0-9]*/!# |
+ |    2 | bread | 0.31665143 |
- |    1 | book  | #!/0\.118[0-9]*/!# |
+ |    1 | book  | 0.50007039 |
+------+-------+------------+

@donhardman
Contributor

These models are also supported:

Locutusque/TinyMistral-248M-v2
TinyLlama/TinyLlama-1.1B-Chat-v1.0
h2oai/embeddinggemma-300m
