OSError: LoneStriker/Mixtral_34Bx2_MoE_60B-2.4bpw-h6-exl2 does not appear to have a file named model-00001-of-00013.safetensors.

#2
by Marseus - opened

Hello, I'm new here.
I downloaded the original model, cloudyu/Mixtral_34Bx2_MoE_60B, before, but my VRAM is not enough (4090, 24 GB),
so I was trying to use this model instead, and it throws the error message above.

Here's the code I'm using:

import torch
from transformers import AutoModelForCausalLM

Mixtral_model_name = "LoneStriker/Mixtral_34Bx2_MoE_60B-2.4bpw-h6-exl2"
model = AutoModelForCausalLM.from_pretrained(
    Mixtral_model_name,
    torch_dtype=torch.float32,
    device_map='cuda',
    local_files_only=False,
    # load_in_4bit=True
)

How can I fix this?
And how much VRAM will this model need?
Thanks!

Use either oobabooga's text-generation-webui or exui from GitHub to load this model. For ooba, use the exllamav2 loader.
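
If you'd rather load it from Python directly, a minimal sketch with the exllamav2 library looks roughly like this (the local path is a placeholder for wherever you downloaded the repo; this follows the library's basic inference example rather than anything specific to this model):

import sys
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/Mixtral_34Bx2_MoE_60B-2.4bpw-h6-exl2"  # placeholder: your local download
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # allocate the KV cache as the model loads
model.load_autosplit(cache)                # split layers across available VRAM

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()

print(generator.generate_simple("Hello, my name is", settings, 64))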

@LoneStriker Thanks! oobabooga's text-generation-webui works, but I still run out of VRAM.
How much VRAM does this model need? It seems to be more than 24 GB.
θž’εΉ•ζ“·ε–η•«ι’ 2024-01-15 161323.jpg

It should fit in 24 GB of VRAM; you probably need to reduce the max tokens (context length) in ooba. Here are the model sizes for the 2.4 bpw and 2.65 bpw quants:

18G     Mixtral_34Bx2_MoE_60B-2.4bpw-h6-exl2
20G     Mixtral_34Bx2_MoE_60B-2.65bpw-h6-exl2

Drop your max tokens to 2048 to see if it loads.
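
If you load it through the exllamav2 Python API instead of ooba, the equivalent setting is the config's max_seq_len, which controls how large the KV cache gets (a sketch continuing the snippet above; 2048 is just the same trial value, not a measured limit):

config.max_seq_len = 2048  # override the value from config.json after config.prepare(), before creating ExLlamaV2Cache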
