Fixes for the qwen support #130
base: master
This reverts commit 974bd51.
- Implement Qwen local embedding model + tokenizer sanitization
- Fix attention/weight loading quirks for Qwen weights
- Update embeddings lib version to 1.1.1
Linux debug test results 8 files 8 suites 13m 27s ⏱️ Results for commit 0a3b8d8. ♻️ This comment has been updated with latest results.
Windows test results 5 files 5 suites 18m 6s ⏱️ Results for commit 0a3b8d8. ♻️ This comment has been updated with latest results.
Linux release test results 8 files 8 suites 6m 29s ⏱️ Results for commit 0a3b8d8. ♻️ This comment has been updated with latest results.
- Remove QwenModel variant and custom implementation
- LocalModel now handles Qwen, Llama, Mistral, Gemma via auto-detection
- Add ModelArch enum for BERT vs causal architecture detection
- Implement CausalEmbeddingModel for supported architectures
- Consolidate embedding logic into unified LocalModel
- Remove redundant qwen.rs implementation
- Update create_model to route non-API models through LocalModel
- Add architecture detection and integration tests
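The architecture auto-detection described in this commit could look roughly like the sketch below. It is not the code from this PR: `ModelArch` is named in the commit message, but its variants, the `detect_arch` helper, and the idea of keying on the `model_type` string from a Hugging Face `config.json` are assumptions made for illustration.

```rust
/// Broad architecture families, mirroring the `ModelArch` enum named in the
/// commit message. The variant names are assumed for this sketch.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ModelArch {
    /// Encoder-only models that are pooled over all tokens.
    Bert,
    /// Decoder-only (causal) models such as Qwen, Llama, Mistral, Gemma.
    Causal,
}

/// Hypothetical helper: classify the `model_type` value read from a model's
/// `config.json`. Returns `None` for model types this sketch does not know.
fn detect_arch(model_type: &str) -> Option<ModelArch> {
    match model_type.to_ascii_lowercase().as_str() {
        "bert" => Some(ModelArch::Bert),
        "qwen2" | "qwen3" | "llama" | "mistral" | "gemma" => Some(ModelArch::Causal),
        _ => None,
    }
}
```

With this shape, `detect_arch("qwen3")` yields `Some(ModelArch::Causal)` and an unknown type such as `"t5"` yields `None`, so the caller can report an unsupported model instead of failing later during weight loading.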
donhardman
left a comment
I implemented this the proper way. It should now support Llama, Qwen, Mistral, and BERT models. It would still be good to check it together with Manticore, but the changes are covered by tests and implemented correctly now.
clt ❌ CLT tests
Failed tests: test/clt-tests/mcl/auto-embeddings-qwen.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd --stopwait > /dev/null; stdbuf -oL searchd ${SEARCHD_ARGS:-} > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 'accepting connections' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Accepting connections!'; else echo 'Timeout or failed!'; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_qwen (title TEXT, vec FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME='Qwen/Qwen3-Embedding-0.6B' FROM='title')"; echo $?
––– output –––
- 0
+ ERROR 1064 (42000) at line 1: error adding table 'test_qwen': prealloc: Failed to create an instance of the model
+ 1
––– input –––
mysql -h0 -P9306 -E -e "SHOW CREATE TABLE test_qwen"
––– output –––
- *************************** 1. row ***************************
+ ERROR 1064 (42000) at line 1: You have an error in your query. Please, double-check it.
- Table: test_qwen
- Create Table: CREATE TABLE test_qwen (
- id bigint,
- title text,
- vec float_vector knn_type='hnsw' hnsw_similarity='L2' model_name='Qwen/Qwen3-Embedding-0.6B' FROM='title'
- )
––– input –––
mysql -h0 -P9306 -e "insert into test_qwen(id, title) values(1, 'book'),(2, 'bread');"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as total_records FROM test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "select id, title, knn_dist() from test_qwen where knn(vec, 3, 'loaf')"
––– output –––
- +------+-------+------------+
+ ERROR 1064 (42000) at line 1: table test_qwen: requested KNN search attribute 'vec' not found
- | id | title | knn_dist() |
- +------+-------+------------+
- | 2 | bread | #!/0\.111[0-9]*/!# |
- | 1 | book | #!/0\.118[0-9]*/!# |
- +------+-------+------------+
- Fix weight prefix remapping for Qwen3-Embedding models
- Correct tensor dtype handling in embedding computation
- Enable tests to run instead of skipping on load failure
- Update candle dependencies to version 0.9.2
- Align hf-hub revision for compatibility
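As a rough illustration of what weight prefix remapping means, the sketch below translates the tensor name that the model code asks for into the name stored in the checkpoint. Both helper names are hypothetical, and the premise that the checkpoint omits a `model.` prefix that the loader expects is an assumption, not something this PR states.

```rust
/// Hypothetical helper: map a tensor name as requested by the model code
/// onto the name stored in the checkpoint.
/// Assumption: the loader asks for `model.`-prefixed names, while some
/// checkpoints store the same tensors without that prefix.
fn remap_weight_name(requested: &str, checkpoint_has_prefix: bool) -> String {
    const PREFIX: &str = "model.";
    if checkpoint_has_prefix {
        // Names already line up, nothing to rewrite.
        return requested.to_string();
    }
    // Drop the prefix if present; leave other names untouched.
    requested.strip_prefix(PREFIX).unwrap_or(requested).to_string()
}

/// Hypothetical helper: decide whether a checkpoint uses the prefix by
/// looking at the tensor names it contains.
fn has_model_prefix<'a>(names: impl IntoIterator<Item = &'a str>) -> bool {
    names.into_iter().any(|name| name.starts_with("model."))
}
```

Detecting the layout from the tensor names, rather than from the model name, keeps the remapping independent of which repository the weights came from.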
- Use manticoresoftware candle fork with clear_kv_cache()
- Explicitly clear cache to prevent stale state between inferences
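Why a stale cache matters can be shown with a toy model. The sketch below does not use candle: `ToyCausalModel` is a made-up stand-in whose "cache" is just the list of token ids seen so far, and only the method name `clear_kv_cache` is taken from the commit message.

```rust
/// Toy stand-in for a causal model with a key/value cache.
/// A real cache holds attention tensors; here it only records token ids.
struct ToyCausalModel {
    kv_cache: Vec<u32>,
}

impl ToyCausalModel {
    fn new() -> Self {
        Self { kv_cache: Vec::new() }
    }

    /// Appends the new tokens to the cache and returns how many positions
    /// the model would attend over for this call.
    fn forward(&mut self, tokens: &[u32]) -> usize {
        self.kv_cache.extend_from_slice(tokens);
        self.kv_cache.len()
    }

    /// Drops all cached state.
    fn clear_kv_cache(&mut self) {
        self.kv_cache.clear();
    }

    /// Embeds one independent input: the cache is reset first so that the
    /// previous text cannot leak into this one.
    fn embed(&mut self, tokens: &[u32]) -> usize {
        self.clear_kv_cache();
        self.forward(tokens)
    }
}
```

Calling `forward` twice in a row makes the second call attend over the first input as well, which for embeddings means the vector of one text depends on whatever was embedded before it. Going through `embed` avoids that.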
- Add test_cache_path helper using CARGO_MANIFEST_DIR
- Replace hardcoded paths across all test cases
- Ensure consistent and portable cache directory handling
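A helper of this kind might be written as in the sketch below. Only the name `test_cache_path` and the use of `CARGO_MANIFEST_DIR` come from the commit message; the `.cache/test-models` sub-directory is an assumed name, and `option_env!` is used here only so that the snippet also compiles outside Cargo.

```rust
use std::path::PathBuf;

/// Sketch of a `test_cache_path` helper anchored at the crate root.
/// Inside a Cargo build `CARGO_MANIFEST_DIR` is always set; the fallback to
/// the current directory only applies when compiling without Cargo.
/// The `.cache/test-models` sub-directory is an assumed name.
fn test_cache_path() -> PathBuf {
    let crate_root = option_env!("CARGO_MANIFEST_DIR").unwrap_or(".");
    PathBuf::from(crate_root).join(".cache").join("test-models")
}
```

Because the path is derived from the crate root rather than from the working directory or a hardcoded absolute path, the tests resolve the same cache directory on Linux and Windows runners.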
- Downgrade hf-hub to 0.3.2
- Downgrade dirs, dirs-sys, redox_users
- Align ureq HTTP client dependencies
- Add windows-sys 0.48.x targets
clt ❌ CLT tests
Failed tests: test/clt-tests/mcl/auto-embeddings-qwen.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd --stopwait > /dev/null; stdbuf -oL searchd ${SEARCHD_ARGS:-} > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 'accepting connections' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Accepting connections!'; else echo 'Timeout or failed!'; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_qwen (title TEXT, vec FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME='Qwen/Qwen3-Embedding-0.6B' FROM='title')"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -E -e "SHOW CREATE TABLE test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "insert into test_qwen(id, title) values(1, 'book'),(2, 'bread');"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as total_records FROM test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "select id, title, knn_dist() from test_qwen where knn(vec, 3, 'loaf')"
––– output –––
+------+-------+------------+
| id | title | knn_dist() |
+------+-------+------------+
- | 2 | bread | #!/0\.111[0-9]*/!# |
+ | 2 | bread | 0.31665143 |
- | 1 | book | #!/0\.118[0-9]*/!# |
+ | 1 | book | 0.50007039 |
+------+-------+------------+
- Expand CausalEmbeddingKind enum for Qwen, Llama, Mistral, Gemma
- Extend model type detection for gemma2 and gemma3 variants
- Support both torch_dtype and dtype config fields for tensor type
- Add integration tests for TinyLlama, TinyMistral, and Gemma models
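Supporting both `torch_dtype` and `dtype` amounts to a small fallback lookup. The sketch below assumes the config has already been flattened into string key/value pairs; `WeightDType`, `parse_dtype`, and `resolve_dtype` are hypothetical names, and both the lookup order (`torch_dtype` first) and the `F32` default are assumptions rather than behavior confirmed by this PR.

```rust
use std::collections::HashMap;

/// Tensor element types this sketch knows about (hypothetical enum).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum WeightDType {
    F32,
    F16,
    BF16,
}

/// Parse the dtype strings commonly found in Hugging Face configs.
fn parse_dtype(value: &str) -> Option<WeightDType> {
    match value {
        "float32" => Some(WeightDType::F32),
        "float16" => Some(WeightDType::F16),
        "bfloat16" => Some(WeightDType::BF16),
        _ => None,
    }
}

/// Resolve the tensor type from a flattened config.
/// Assumptions: `torch_dtype` is checked before `dtype`, and `F32` is used
/// when neither field holds a recognized value.
fn resolve_dtype(config: &HashMap<&str, &str>) -> WeightDType {
    ["torch_dtype", "dtype"]
        .iter()
        .filter_map(|key| config.get(*key))
        .find_map(|value| parse_dtype(value))
        .unwrap_or(WeightDType::F32)
}
```

A config that only carries `"dtype": "bfloat16"` resolves to `WeightDType::BF16`, while an empty config falls back to `WeightDType::F32`.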
- Add integration tests for embedding models
- Cover loading and initialization paths
- Test encoding functionality with various inputs
- Verify output consistency and format
…fork
- Pin candle-core, candle-nn, candle-transformers to specific git revision
- Move test helper functions to dedicated test module
- Minor loop variable refactor for clarity
clt ❌ CLT tests
Failed tests: test/clt-tests/mcl/auto-embeddings-qwen.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd --stopwait > /dev/null; stdbuf -oL searchd ${SEARCHD_ARGS:-} > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 'accepting connections' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Accepting connections!'; else echo 'Timeout or failed!'; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_qwen (title TEXT, vec FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME='Qwen/Qwen3-Embedding-0.6B' FROM='title')"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -E -e "SHOW CREATE TABLE test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "insert into test_qwen(id, title) values(1, 'book'),(2, 'bread');"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as total_records FROM test_qwen"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "select id, title, knn_dist() from test_qwen where knn(vec, 3, 'loaf')"
––– output –––
+------+-------+------------+
| id | title | knn_dist() |
+------+-------+------------+
- | 2 | bread | #!/0\.111[0-9]*/!# |
+ | 2 | bread | 0.31665143 |
- | 1 | book | #!/0\.118[0-9]*/!# |
+ | 1 | book | 0.50007039 |
+------+-------+------------+
These models are also supported:
No description provided.