Batch Inference

#7 · opened by dfrank

Hi, I've been having problems with batch inference. The only thing that seems to work is setting tokenizer.padding_side = "right", but the results I get are inconsistent with a single inference (by a lot). Any advice?
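For reference, here is a minimal sketch of the kind of setup I mean (the checkpoint name and prompts are placeholders, not my actual code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"  # placeholder checkpoint, not my actual model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

tokenizer.padding_side = "right"  # the only setting that runs for me
prompts = ["The capital of France is", "Hi"]
inputs = tokenizer(prompts, return_tensors="pt", padding=True)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```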

Google org

Hi @dfrank, this happens because padding changes what the model sees: with right padding, pad tokens sit after each prompt, so batched generation can produce different token sequences than an unpadded single inference. For decoder-only models, left padding is generally recommended for batched generation. Also make sure your model is set to evaluation mode (e.g., model.eval() in PyTorch); this disables any dropout layers, which could otherwise introduce variability in your outputs. If you have any concerns, let us know. Thank you.
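For example, here is a minimal sketch of deterministic batched generation (the checkpoint name and prompts are placeholders; adapt them to your setup):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"  # placeholder; use your checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()  # disable dropout so repeated runs give identical outputs

# Decoder-only models should be padded on the left for generation, so that
# the final position of every row is a real token rather than padding.
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompts = ["The capital of France is", "Hi"]
inputs = tokenizer(prompts, return_tensors="pt", padding=True)

with torch.no_grad():  # no gradients needed at inference time
    out = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```

Note that do_sample=False makes generation greedy; with sampling enabled, batch and single runs will differ even when everything else is correct.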

Google org

@dfrank, I hope the above clarifies things. Please confirm if you have any further concerns. Thank you.

Thanks @lvk, I had indeed forgotten to set the model to evaluation mode. However, even after doing so, I still can't get consistent results between batch and single inference. I also find it strange to have to set the padding side to the right. When I set the padding to the left (which conceptually makes the most sense to me) I only get NaNs in my logits.
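For reference, this is roughly the consistency check I am running (checkpoint and prompts are placeholders, not my real setup):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
# float32 rules out fp16 overflow as a source of NaNs
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
model.eval()

tokenizer.padding_side = "left"
prompts = ["The capital of France is", "Hi"]
batch = tokenizer(prompts, return_tensors="pt", padding=True)

# With left padding, a plain forward() assigns positions 0..seq_len-1, which
# shifts the real tokens; deriving position_ids from the attention mask
# restores the positions an unpadded run would use (generate() does this
# internally, forward() does not).
position_ids = batch["attention_mask"].cumsum(-1) - 1
position_ids = position_ids.clamp(min=0)
with torch.no_grad():
    batch_logits = model(**batch, position_ids=position_ids).logits

# Unbatched pass for the short prompt, i.e. the row that actually gets padded.
single = tokenizer(prompts[1], return_tensors="pt")
with torch.no_grad():
    single_logits = model(**single).logits

# Compare only the real (non-pad) positions: the last n slots of row 1.
# Small differences are expected because batched kernels sum in a different
# order; large ones indicate a padding/position problem.
n = single["input_ids"].shape[1]
max_diff = (batch_logits[1, -n:] - single_logits[0]).abs().max()
print(f"max |logit diff| = {max_diff.item():.6f}")
```

From what I've read, NaNs that appear only at the pad positions of the logits tensor can be sliced away and ignored; NaNs at real token positions usually point at the dtype (fp16 overflow) or the attention backend.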
