VRAM requirement figures are wrong

#1
by ramzeez88 - opened

This model needs a lot more VRAM than is stated in the model card.

32K context (n_ctx = 32848): 6247.16 MiB; 4K context (n_ctx = 4096): 3626.91 MiB
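For anyone who wants to sanity-check numbers like these: the part that grows with context is the KV cache, which scales linearly with n_ctx. Here is a minimal back-of-the-envelope sketch in Python; the layer/head counts below are placeholders I picked for illustration, not this model's actual config, so plug in the values from the model's own config before trusting the output.

```python
# Rough fp16 KV-cache estimate. n_layer, n_kv_heads and head_dim are
# HYPOTHETICAL defaults; replace them with the real model config.

def kv_cache_mib(n_ctx: int,
                 n_layer: int = 48,
                 n_kv_heads: int = 8,
                 head_dim: int = 128,
                 bytes_per_elem: int = 2) -> float:
    """fp16 K and V tensors for every layer at the given context length."""
    per_token = 2 * n_layer * n_kv_heads * head_dim * bytes_per_elem  # K + V
    return n_ctx * per_token / (1024 ** 2)

for n_ctx in (4096, 32768):
    print(f"n_ctx={n_ctx}: ~{kv_cache_mib(n_ctx):.0f} MiB KV cache")
```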

@ramzeez It should be pretty accurate. How much VRAM is it using?

I must admit that when I posted my comment, the data wasn't from this exact model, my apologies. In this specific case the figures are indeed different:

llama_new_context_with_model: total VRAM used: 38378.45 MiB (model: 10055.54 MiB, context: 28322.91 MiB) [Q6_K TheBloke quant]
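Since the KV cache dominates that context buffer and grows roughly linearly with n_ctx, the log figures can be projected to smaller contexts. A quick sketch, with the caveat that the log line doesn't state n_ctx, so the 32768 below is my assumption:

```python
# Project total VRAM at smaller contexts from the reported log figures.
# ASSUMPTION: the run above used n_ctx = 32768 (not shown in the log),
# and the context buffer scales ~linearly with n_ctx, so treat the
# results as rough estimates only.

model_mib = 10055.54     # from the log line
context_mib = 28322.91   # from the log line
n_ctx_assumed = 32768    # hypothetical; not in the log line

per_token_mib = context_mib / n_ctx_assumed
for n_ctx in (4096, 8192, 16384):
    est_total = model_mib + per_token_mib * n_ctx
    print(f"n_ctx={n_ctx:>6}: ~{est_total:,.0f} MiB total (model + context)")
```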
