ae_dataviz@alien.topB to

LocalLLaMA@poweruser.forum · English · 2 years ago

Quantizing 70b models to 4-bit, how much does performance degrade?


The title, pretty much.

I’m wondering whether a 70b model quantized to 4-bit would perform better than a 7b/13b/34b model at fp16. It would be great to get some insights from the community.
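
For context on why this comparison comes up, here is a rough sketch of the weight memory each option needs. It assumes weight-only storage at exactly the stated bit width and ignores the KV cache, activations, and any per-group scale overhead, so real numbers will be somewhat higher. The helper name `weight_gb` is made up for illustration.

```python
# Weight-only memory estimate: parameters * bits per weight / 8 bytes.
# Assumption: no KV cache, activations, or quantization metadata included.
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9  # decimal gigabytes

configs = [
    ("70b @ 4-bit", 70, 4),
    ("34b @ fp16", 34, 16),
    ("13b @ fp16", 13, 16),
    ("7b @ fp16", 7, 16),
]

for name, params, bits in configs:
    print(f"{name}: {weight_gb(params, bits):.1f} GB")
```

By this estimate a 4-bit 70b needs roughly half the weight memory of a 34b at fp16, which is what makes the question practical. It says nothing about output quality.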


daHaus@alien.topB
2 years ago
This seems difficult to predict, considering how fundamental the thing you're changing is. The method you use to quantize it, and how refined that method is, also matters a great deal.
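
To illustrate how much the method can matter, here is a toy sketch using plain symmetric absmax round-to-nearest 4-bit quantization on synthetic weights. This is an assumption chosen for simplicity, not the scheme any particular tool uses, and it measures weight reconstruction error only, not model quality. The only thing varied is the group size (how many weights share one scale).

```python
import random

def quantize_dequantize(weights, group_size, bits=4):
    """Symmetric absmax round-to-nearest with one scale per group (toy version)."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Fall back to 1.0 if the group is all zeros to avoid dividing by zero.
        scale = max(abs(w) for w in group) / qmax or 1.0
        out.extend(round(w / scale) * scale for w in group)
    return out

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Synthetic data (assumption): small Gaussian weights plus one large outlier.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(256)]
weights[0] = 1.0

for group_size in (256, 32):
    err = mean_abs_error(weights, quantize_dequantize(weights, group_size))
    print(f"group_size={group_size}: mean abs error = {err:.5f}")
```

With one scale for all 256 weights, the outlier stretches the scale and most small weights collapse to zero. With groups of 32, only the outlier's own group is affected. Same bit width, noticeably different error, and more refined methods change the picture further.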