
Conversation

@notoookay

Hi, I found that you use tokenizer.apply_chat_template when generating with the instruct model, but don't add the generation prompt that tells the model to start its response. I have made the following modifications:

  • Add the generation prompt indicating the start of a bot response.
  • Skip special tokens when decoding after generation.
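The two changes can be sketched as below. This is a minimal illustration assuming a ChatML-style template (the format Qwen instruct models use); the function names and token strings here are illustrative stand-ins, not the PR's actual code, which would call the real `tokenizer.apply_chat_template(..., add_generation_prompt=True)` and `tokenizer.decode(..., skip_special_tokens=True)`:

```python
# Illustrative ChatML-style special tokens (as used by Qwen instruct models).
IM_START, IM_END = "<|im_start|>", "<|im_end|>"


def apply_chat_template(messages, add_generation_prompt=False):
    """Mimics tokenizer.apply_chat_template for a ChatML-style model."""
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        # The generation prompt marks where the assistant's reply begins,
        # so the model continues as the assistant rather than the user.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


def decode(text, skip_special_tokens=True):
    """Strip special tokens from generated text, as the second change does."""
    if skip_special_tokens:
        for tok in (IM_START, IM_END, "<|endoftext|>"):
            text = text.replace(tok, "")
    return text.strip()


prompt = apply_chat_template(
    [{"role": "user", "content": "Who acted as Sophie in 'The Love Punch'?"}],
    add_generation_prompt=True,
)
print(prompt.endswith("<|im_start|>assistant\n"))  # True
```

Without the generation prompt, the model may continue the user turn instead of answering; without `skip_special_tokens`, the decoded answer carries template tokens along with it.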

@notoookay
Author

The eos_token differs between the instruct model and the pre-trained model, and the pre-trained model sometimes keeps generating after its eos_token has been produced. So for the pre-trained model I decode without skipping special tokens, for debugging.

Below is an example from testing qwen-0.5b:

"Q: Who acted as Sophie in the movie 'The Love Punch'?\nA:",

" The answer is Sophie Turner<|endoftext|>Human: What is the answer to that question? The answer to that question is: Sophie Turner.<|endoftext|>Human: What is the answer to that question? What is the name of the 1999 film that was based on the novel by the same name?
A: The answer to that question is: \"The 1999 film based on the novel by the same name is \"The 1999 Movie.\"<|endoftext|>You are an AI"

I think it would be better to handle the eos_token for the pre-trained model as well.
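One way to handle this is to truncate the pre-trained model's continuation at its eos_token during post-processing, so everything generated past the first `<|endoftext|>` (as in the qwen-0.5b example above) is dropped. This is a hedged sketch; the helper name is illustrative, and the token value assumes Qwen's `<|endoftext|>`:

```python
def truncate_at_eos(text: str, eos_token: str = "<|endoftext|>") -> str:
    """Keep only the completion up to the first eos_token, if any."""
    idx = text.find(eos_token)
    return text if idx == -1 else text[:idx]


completion = " The answer is Sophie Turner<|endoftext|>Human: What is the answer?"
print(truncate_at_eos(completion))  # " The answer is Sophie Turner"
```

Alternatively, generation could be stopped at the right token up front, e.g. by passing the pre-trained tokenizer's own `eos_token_id` to `model.generate`; the post-hoc truncation above is just the simplest fix when the raw continuation is already in hand.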
