is it ggufable?

#1
by sopack - opened

Since most of us mortals don't have huge amounts of VRAM, it'd be cool to GGUF this model as well.
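For anyone who wants to roll their own, here's a rough sketch of the usual llama.cpp flow (the script and binary names vary between llama.cpp releases, so treat them as assumptions):

```python
# Hypothetical sketch: convert an HF checkpoint to GGUF and quantize it
# with llama.cpp tooling. Paths and tool names are assumptions and
# differ across llama.cpp releases.
import subprocess

model_dir = "models/dolphin"                    # assumed local HF checkpoint
f16_path = f"{model_dir}/model-f16.gguf"
q5_path = f"{model_dir}/model-Q5_K_M.gguf"

# 1) Convert the HF weights to a full-precision GGUF file.
subprocess.run(
    ["python", "convert.py", model_dir, "--outfile", f16_path],
    check=True,
)

# 2) Quantize it down to Q5_K_M (roughly 5.5 bits per weight).
subprocess.run(["./quantize", f16_path, q5_path, "Q5_K_M"], check=True)
```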

Cognitive Computations org

I wanna see a QuIP# version

https://github.com/Cornell-RelaxML/quip-sharp

I like the idea of QuIP# because the result would be so tiny that it would run on CPU even without GGML, but from what I remember of the paper, 2-bit quantization works less well for smaller models. It might still be better than 3-bit GPTQ, but I think we get more performance out of a 5-6 bit GGUF version.
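Some back-of-the-envelope math on why the bit width dominates file size (pure arithmetic, assuming a 7B-parameter model and ignoring per-block scales and metadata):

```python
# Rough weight-file size at different quantization bit widths,
# assuming 7e9 parameters; real GGUF files are a bit larger
# because of per-block scales and metadata.
params = 7e9

for bits in (2, 3, 5, 6, 16):
    gib = params * bits / 8 / 2**30
    print(f"{bits:>2}-bit: ~{gib:.1f} GiB")
```

So 2-bit lands around 1.6 GiB versus roughly 4-5 GiB for the 5-6 bit quants, which is the size/quality trade-off in question.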
Note that TheBloke just published a GGUF version :)
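If you grab that, a minimal sketch of running it on CPU with llama-cpp-python (the filename here is an assumption; substitute whichever quant fits your RAM):

```python
# Minimal sketch: load a GGUF quant on CPU via llama-cpp-python.
# The model path is hypothetical -- point it at the downloaded file.
from llama_cpp import Llama

llm = Llama(model_path="dolphin.Q5_K_M.gguf", n_ctx=2048)
out = llm("Why is the sky blue?", max_tokens=64)
print(out["choices"][0]["text"])
```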

ehartford changed discussion status to closed
