I ordered more hardware to quantize this. Just wait 2-3 weeks and I will make it. It's just ... 2TB can fit on my 2TB SSD, but where am I supposed to put the quants?

#5
by RichardErkhov - opened

So just wait, OK? I will make it, unless it's not supported by llama.cpp.

hero!

@mlabonne how long did the merge take you? I noticed mergekit is single-threaded, so I want to know how long to expect it to run for 680B and 1T models. As you noticed, I sometimes do crazy stuff, so I want to do something interesting haha. If you have any crazy ideas, just let me know.

I don't remember exactly, but it wasn't insane, like a few hours. Uploading it took a lot more time and more attempts.

Well, uploading is the easy part for me. I guess I'm just not patient enough for the processing. Plus, I don't have the space, so I need to really mess with the code: first go through the model and convert it in memory just to count variables, then go through it again to process and upload the splits one by one. I guess I will wait more. What size should I make it lol?
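
For anyone curious, here is a rough sketch of that two-pass idea in plain Python. It is not mergekit or llama.cpp code, and every name in it (`total_bytes`, `plan_splits`, the `(name, size)` pairs) is made up for illustration. It assumes you can list each tensor's size without loading the data. Pass one only counts bytes; pass two lazily groups tensors into splits under a size cap, so each split could be processed, uploaded, and deleted before the next one is built.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical input: (tensor_name, size_in_bytes) pairs read from model metadata.
TensorSize = Tuple[str, int]


def total_bytes(tensor_sizes: Iterable[TensorSize]) -> int:
    """Pass 1: add up tensor sizes without holding any tensor data."""
    return sum(size for _, size in tensor_sizes)


def plan_splits(
    tensor_sizes: Iterable[TensorSize], max_split_bytes: int
) -> Iterator[List[str]]:
    """Pass 2: greedily group tensor names into splits of at most max_split_bytes.

    Splits are yielded one at a time, so the caller can handle each split
    (convert, upload, delete) before the next one is assembled.
    A single tensor larger than the cap gets a split of its own.
    """
    current: List[str] = []
    current_bytes = 0
    for name, size in tensor_sizes:
        # Close the current split if adding this tensor would exceed the cap.
        if current and current_bytes + size > max_split_bytes:
            yield current
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        yield current


if __name__ == "__main__":
    sizes = [("a", 4), ("b", 4), ("c", 6), ("d", 2)]  # toy sizes, not real data
    print("total bytes:", total_bytes(sizes))
    for index, split in enumerate(plan_splits(sizes, max_split_bytes=8)):
        # Placeholder for the real work: convert, upload, then free the space.
        print(f"split {index}: {split}")
```

The greedy grouping keeps tensors in their original order, which is simpler than packing optimally and is usually good enough when the goal is only to cap peak disk usage.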
