The title, pretty much.

I’m wondering whether a 70B model quantized to 4-bit would perform better than a 7B/13B/34B model at fp16. Would be great to get some insights from the community.
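For context on the hardware side of the comparison, here is a rough, weight-only memory sketch (my own back-of-the-envelope arithmetic, not benchmarks; it ignores the KV cache, activations, and per-block quantization overhead, so real usage is higher):

```python
# Approximate memory needed just to hold the weights: params * bits / 8.
def weight_gb(params_billion: float, bits_per_param: float) -> float:
    """Weight-only memory estimate in GB."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

for label, params, bits in [
    ("70B @ 4-bit", 70, 4),
    ("34B @ fp16 ", 34, 16),
    ("13B @ fp16 ", 13, 16),
    ("7B  @ fp16 ", 7, 16),
]:
    print(f"{label}: ~{weight_gb(params, bits):.0f} GB")
# 70B @ 4-bit: ~35 GB, 34B @ fp16: ~68 GB, 13B @ fp16: ~26 GB, 7B @ fp16: ~14 GB
```

So a 4-bit 70B has roughly the same weight footprint as a fp16 34B, which is part of why the quantized-big-model vs. full-precision-small-model question comes up so often.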

  • NachosforDachos@alien.topB · 10 months ago

    That was useful and interesting.

    Speaking of hypothetical situations, how much money do you think an individual would need to spend on computing power to get a GPT-4 Turbo-like experience locally?