The title, pretty much.
I’m wondering whether a 70B model quantized to 4-bit would perform better than a 7B/13B/34B model at fp16. Would be great to get some insights from the community.
So anyone wanting to play around with this at home has to expect to drop about $4K or so on GPUs and a setup?
I can get two 3090s for €1,200 here on the second-hand market.
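For context on the hardware numbers above, here is a rough back-of-the-envelope sketch of weight-only VRAM requirements per configuration. This only counts model weights and ignores KV cache and activation overhead, which add several more GiB in practice:

```python
# Rough weight-only VRAM estimate (ignores KV cache / activations,
# so real-world usage will be noticeably higher).
def weight_vram_gib(params_billion: float, bits_per_param: float) -> float:
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1024**3

configs = [
    ("70B @ 4-bit", 70, 4),
    ("34B @ fp16", 34, 16),
    ("13B @ fp16", 13, 16),
    ("7B @ fp16", 7, 16),
]
for name, params, bits in configs:
    print(f"{name}: ~{weight_vram_gib(params, bits):.1f} GiB")
```

By this estimate a 4-bit 70B model needs roughly 33 GiB for weights alone, which is why it fits across two 3090s (2 × 24 GB = 48 GB) but not on a single card, while a 13B fp16 model (~24 GiB) just about fits on one 3090.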