Conversation

@aldehir
Collaborator

@aldehir aldehir commented Jan 2, 2026

Chat parser for Solar-Open-100B.

Features

  • Reasoning parsing
  • Reasoning injection via reasoning_content field for interleaved thinking
  • response_format parsing
  • Tool call parsing, including tool_choice = required and reasoning

The following variables can be modified via chat template kwargs:

  • default_system_prompt: bool = true - Include default system prompt
  • reasoning_effort: "minimal" | "low" | "medium" | "high" = "high" - Set reasoning effort. When set to low or minimal, reasoning is disabled.
  • think_render_option: "all" | "lastthink" = "lastthink" - Determines when to render reasoning traces when fed back for interleaved rendering. The default (lastthink) only includes reasoning after the last user message. The all option includes reasoning for all assistant messages.
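
For reference, a request that sets these variables might look like the following. This is a minimal sketch: the model name is a placeholder, but the chat_template_kwargs keys are the template variables listed above.

```python
import json

# Sketch of an OpenAI-compatible request body for llama-server's
# /v1/chat/completions endpoint. The model name is a placeholder; the
# chat_template_kwargs keys are the template variables from this PR.
payload = {
    "model": "solar-open-100b",  # placeholder model name
    "messages": [{"role": "user", "content": "Who are you?"}],
    "chat_template_kwargs": {
        "default_system_prompt": True,       # include default system prompt
        "reasoning_effort": "high",          # "minimal" | "low" | "medium" | "high"
        "think_render_option": "lastthink",  # "all" | "lastthink"
    },
}
print(json.dumps(payload, indent=2))
```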

@aldehir aldehir mentioned this pull request Jan 2, 2026
@github-actions github-actions bot added the testing Everything test related label Jan 2, 2026
@aldehir aldehir marked this pull request as ready for review January 2, 2026 09:49
@HelloKS
Contributor

HelloKS commented Jan 2, 2026

Hello, Thanks for the PR! I was waiting for this.

I tried with reasoning_effort = low, and the server doesn't seem to like it for some reason.

slot process_toke: id  3 | task 0 | n_decoded = 1, n_remaining = -1, next token:    22 '<|think|>'
srv  update_slots: run slots completed
que    start_loop: waiting for new tasks
que    start_loop: processing new tasks
que    start_loop: processing task, id = 1
que    start_loop: update slots
srv  update_slots: posting NEXT_RESPONSE
que          post: new task, id = 2, front = 0
slot update_slots: id  3 | task 0 | slot decode token, n_ctx = 32000, n_tokens = 77, truncated = 0
srv  update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
srv  update_chat_: Parsing chat message: <|think|>
Parsing input with format peg-native: <|think|>
res  remove_waiti: remove task 0 from waiting list. current waiting = 1 (before remove)
srv          stop: cancel task, id_task = 0
res  remove_waiti: remove task 0 from waiting list. current waiting = 0 (before remove)
que          post: new task, id = 3/1, front = 1
srv          stop: all tasks already finished, no need to cancel
srv    operator(): got exception: {"error":{"code":500,"message":"Failed to parse input at pos 0","type":"server_error"}}
srv  log_server_r: request: POST /v1/chat/completions 127.0.0.1 500
srv  log_server_r: request:  {"model":"model-Q4_K_M.gguf","temperature":0.8,"top_p":0.95,"top_k":50,"chat_template_kwargs":{"reasoning_effort":"low"},"messages":[{"role":"user","content":"Who are you?"}],"stream":true,"stream_options":{"include_usage":true}}
srv  log_server_r: response: {"error":{"code":500,"message":"Failed to parse input at pos 0","type":"server_error"}}

Tool calling and chat with reasoning work well.

@aldehir
Collaborator Author

aldehir commented Jan 2, 2026

@HelloKS, thanks for that info. I should have done more thorough testing with low and minimal. The parsing changes because the prompt is no longer a continuation of <|begin|>assistant. I'll just make the parsing a bit more lax.
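
To illustrate the idea, here is a Python sketch (not the actual C++ PEG grammar in the PR): a laxer parser can treat both the <|begin|>assistant prefix and the reasoning block as optional, so output parses the same whether or not reasoning was enabled.

```python
import re

# Illustrative sketch only -- the PR implements this as a C++ PEG grammar.
# Accepts model output with or without a leading "<|begin|>assistant"
# prefix, and with an optional "<|think|>...<|end|>" reasoning block.
def parse_solar_output(text: str) -> dict:
    # The prefix is absent when reasoning is disabled, because the prompt
    # is then no longer a continuation of "<|begin|>assistant".
    text = re.sub(r"^<\|begin\|>assistant", "", text)
    reasoning = ""
    m = re.match(r"^<\|think\|>(.*?)<\|end\|>", text, re.DOTALL)
    if m:
        reasoning = m.group(1)
        text = text[m.end():]
    return {"reasoning_content": reasoning, "content": text}
```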

@aldehir
Collaborator Author

aldehir commented Jan 2, 2026

@HelloKS Give it a try with 980c772 c41d8f8


Looks like low/minimal adds an empty think section, but the model can still resume with a thought? Interesting. This should make it more permissive though.

I didn't realize it always appends <|begin|>assistant even after adding an empty think section.

@HelloKS
Contributor

HelloKS commented Jan 2, 2026

Yes, it now works without reasoning (even with tool calling!)

@HelloKS
Contributor

HelloKS commented Jan 2, 2026

Interesting: the model sometimes still reasons even with "reasoning_effort": "low".

Translate this sentence into English:
…本当に、馬鹿馬鹿しい

다음 문장을 한국어로 설명 없이 번역하세요:
…本当に、馬鹿馬鹿しい

(= Translate this sentence into Korean without any explanation: "…Really, it's ridiculous")

Not sure whether this behavior comes from training or the template, though.

@aldehir
Collaborator Author

aldehir commented Jan 2, 2026

Have you tried minimal instead of low? I believe this is the training. When it's either of those, it adds the following to the template:

<|begin|>assistant<|think|><|end|><|begin|>assistant

Any additional <|think|> tags generated would likely be from the training or maybe quantization? Not sure.
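
As a sketch of that behavior (a hypothetical helper, not the actual Jinja template; only the low/minimal branch is taken from the description above, the other branch is simplified):

```python
# Hypothetical helper illustrating the assistant-turn prefix the template
# renders, based on the description above -- NOT the actual Jinja template.
def assistant_prefix(reasoning_effort: str) -> str:
    if reasoning_effort in ("minimal", "low"):
        # Reasoning disabled: the template emits an empty think section,
        # then still appends "<|begin|>assistant" afterwards.
        return "<|begin|>assistant<|think|><|end|><|begin|>assistant"
    # medium/high: no empty think section is injected (simplified sketch).
    return "<|begin|>assistant"
```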

@HelloKS
Contributor

HelloKS commented Jan 2, 2026

> Have you tried minimal instead of low? I believe this is the training. When it's either of those, it adds the following to the template:
>
> <|begin|>assistant<|think|><|end|><|begin|>assistant
>
> Any additional <|think|> tags generated would likely be from the training or maybe quantization? Not sure.

Minimal and low show the same behavior. I think it's okay, since they didn't document this "reasoning off" feature. Maybe it's a leftover from something planned; who knows, lol

@LETS-BEE
Contributor

any progress?

@HelloKS
Contributor

HelloKS commented Jan 29, 2026

> any progress?

It works perfectly (I'm using it locally); it's just the PR review that has stalled. Maybe related to #18675?

Collaborator

@pwilkin pwilkin left a comment
Oh, sorry. Yeah, let's merge it.

@pwilkin
Collaborator

pwilkin commented Jan 29, 2026

@0cc4m @jeffbolznv Just FYI, I'm getting this test failure on CI:

FLASH_ATTN_EXT(hsk=128,hsv=128,nh=4,nr23=[12,1],kv=512,nb=35,mask=1,sinks=0,max_bias=8.000000,logit_softcap=10.000000,prec=def,type_KV=f32,permute=[0,1,2,3])

@0cc4m
Collaborator

0cc4m commented Jan 29, 2026

Yeah, I'm aware of it; it only showed up after the merge of #19075, not on the branch itself. I'll look into it.

@ggerganov
Member

@0cc4m I think the reason is that in #19115 we added some new FA tests to exercise a non-power-of-2 number of heads. I believe these new tests are the ones that are failing, so it is not a regression - just something that has been revealed by the new set of tests.

@pwilkin pwilkin merged commit 7b7ae85 into ggml-org:master Jan 29, 2026
77 of 78 checks passed
@aldehir
Collaborator Author

aldehir commented Jan 29, 2026

@pwilkin thank you!

4b1tQu4ntN3k0 pushed a commit to 4b1tQu4ntN3k0/llama.cpp that referenced this pull request Feb 2, 2026
* chat : add parsing for solar-open-100b

* add comments to rules

* cont : make assistant start optional

* cont : remove assistant start prefix altogether

---------

Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>
