I want to download the Goliath model, but I can only afford Q3_K_M. The description says it has high quality loss. How much quality loss is there, really?
I heard that the larger the model, the less it suffers intellectually when it is quantized. I usually use 70B Q5_K_M. Can I expect 120B Q3_K_M to be significantly better than 70B Q5_K_M, so that the time spent downloading will be worth it?
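For a rough sense of the download/VRAM side of that tradeoff, here's a back-of-the-envelope size estimate. The bits-per-weight figures are approximate averages for those llama.cpp quant types, and the 70B/120B parameter counts are taken loosely from the model names, so treat the numbers as ballpark only:

```python
# Rough GGUF file-size estimate: params * bits-per-weight / 8.
# BPW values are approximate llama.cpp averages for each quant type (assumption).
BPW = {
    "Q3_K_M": 3.91,  # approximate
    "Q5_K_M": 5.69,  # approximate
}

def est_size_gb(n_params: float, quant: str) -> float:
    """Rough model file size in GB for n_params weights at the given quant."""
    return n_params * BPW[quant] / 8 / 1e9

size_70b = est_size_gb(70e9, "Q5_K_M")    # ~50 GB
size_120b = est_size_gb(120e9, "Q3_K_M")  # ~59 GB
print(f"70B Q5_K_M  ~ {size_70b:.0f} GB")
print(f"120B Q3_K_M ~ {size_120b:.0f} GB")
```

So the 120B at Q3_K_M is only somewhat larger on disk than the 70B at Q5_K_M, which is why the "bigger model, heavier quant" swap is even on the table.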
What’s the tok/s for each of those models on that system?
Edit: also, if you don’t mind my asking, how much context are you able to use before inference degrades?
For comparison's sake, the EXL2 4.85bpw version runs at around 6-8 t/s on 4x3090s at 8k context; that's on the lower end.