Tokenizer 'apply_chat_template' issue

#42
by Ksgk-fy - opened

Hello, thanks for releasing such a great model!
I have encountered an issue with the tokenizer for the Llama-3.1-8B-Instruct model. When I use apply_chat_template, I get:
'<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHI!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nhello ;><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n'
For Llama-3-8B-Instruct, however, I get:
'<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHI!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nhello ;><|eot_id|>'
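The two strings differ only by one trailing assistant header at the end of the 3.1 output. To make that concrete, here is a minimal plain-Python sketch that reproduces both strings. It is not the real Jinja chat template of either model: `render_chat` is a hypothetical helper, and its `add_generation_prompt` flag is only named after the argument of the same name in `apply_chat_template`. Whether that argument is actually what causes the difference is an assumption on my part.

```python
def render_chat(messages, add_generation_prompt=False):
    """Hypothetical helper: format messages in the Llama 3 header style.

    This only imitates the strings shown above; it is not the template
    shipped with either tokenizer.
    """
    text = "<|begin_of_text|>"
    for message in messages:
        text += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open a new assistant turn so the model would continue from here.
        text += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return text


# Conversation reconstructed from the outputs above.
messages = [
    {"role": "user", "content": "HI!"},
    {"role": "assistant", "content": "hello ;>"},
]

without_prompt = render_chat(messages)  # same string as the Llama-3 output
with_prompt = render_chat(messages, add_generation_prompt=True)  # same as the 3.1 output

print(repr(without_prompt))
print(repr(with_prompt))
```

So the 3.1 result looks like the conversation with a new, empty assistant turn opened after the assistant message that is already there.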

Can you explain why this difference exists? Does the 3.1 chat template append the extra assistant header to trigger another function call or some sort of self-reflection?