batch tokenizer?

#106
by jjplane - opened

model_input = tokenizer(batch_messages, return_tensors="pt", padding=True, truncation=True).to("cuda")

This raises: ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as pad_token (tokenizer.pad_token = tokenizer.eos_token e.g.) or add a new pad token via tokenizer.add_special_tokens({'pad_token': '[PAD]'}).

Hi @jjplane
Can you try to set a pad token as suggested by the error message?
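For example, a minimal sketch of the fix (the model id below is just a placeholder for illustration; substitute the checkpoint you are actually loading):

from transformers import AutoTokenizer

# Placeholder checkpoint for illustration; use your own model id.
model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Many causal-LM tokenizers ship without a pad token; reuse EOS for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Left padding is the usual choice for decoder-only generation.
tokenizer.padding_side = "left"

batch_messages = [
    "Hello, how are you?",
    "Summarize the plot of Hamlet in one sentence.",
]

model_input = tokenizer(
    batch_messages,
    return_tensors="pt",
    padding=True,
    truncation=True,
).to("cuda")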

OK, tokenizer.pad_token = tokenizer.eos_token worked, and </s> is now padded on the left. Thanks!

jjplane changed discussion status to closed