46G goliath-120b.Q2_K
That’s the smallest one I found (I didn’t quantize this one myself, I found it on HF somewhere).
And it was very slow: about 13t/s prompt_eval and then 2.5t/s generating text, so it’s only really useful for me when I need to run it on my laptop (I get like 15t/s with a 120b model on my 2x3090 rig at 3bpw exl2).
As for the model itself, I like it a lot and use it frequently.
TBH, this RAM thing is more helpful for me because it lets me run Q5 70b models instead of just Q4 now.
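For anyone wondering why a bit of extra RAM decides between Q4 and Q5, here's a rough back-of-the-envelope sketch. It assumes file size is roughly parameter count times bits per weight, that "46G" means 46e9 bytes, and that "120b" and "70b" are exact parameter counts. None of that is precise. The function names are just made up for illustration, and real GGUF files tend to come out somewhat larger than the nominal numbers because of per-block scales and mixed quant types.

```python
def effective_bpw(file_bytes: float, n_params: float) -> float:
    """Average bits per weight implied by a model file's size."""
    return file_bytes * 8 / n_params


def nominal_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Nominal weight size in GB (1e9 bytes), ignoring quantization overhead."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Assumption: the 46G Q2_K file is 46e9 bytes and the model has 120e9 params.
    print(f"goliath Q2_K: ~{effective_bpw(46e9, 120e9):.2f} bits per weight")

    # Assumption: 70e9 params, and exactly 4 or 5 bits per weight (a lower bound).
    for bits in (4, 5):
        print(f"70b at {bits} bits: ~{nominal_size_gb(70e9, bits):.2f} GB")
```

By this estimate, going from 4 to 5 bits on a 70b model costs close to 9 GB more memory for the weights alone, before context, which is about the size of jump where extra usable RAM makes the difference.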
I wish Mozilla would just stick to Firefox, and invest the rest of the money into some dividend paying fund, so they aren’t so reliant on Google for funding for their software engineers.