I plan to run inference on 33B models at full precision; 70B is a second priority but would be nice to have. Would I be better off getting an AMD EPYC server CPU like this or an RTX 4090? With the EPYC, I can get 384GB of DDR4 RAM for ~400 USD on eBay, whereas the 4090 only has 24GB. Moreover, the 4090 and the EPYC setup plus RAM cost about the same. Which would be the better buy?

  • tvetus@alien.topB
    1 year ago

    at full precision

    Full precision is not as useful as you think. Even at 4-bit, the quality loss is not that large.
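To put rough numbers on that, here is a back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. The helper name `weight_memory_gb` is made up for illustration, and the estimate assumes exactly 32, 16, or 4 bits per weight; real 4-bit formats carry some extra overhead for scales, and the KV cache and activations come on top.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits.
    KV cache, activations, and quantization metadata are ignored.
    """
    return params_billion * bits_per_weight / 8


# Memory budgets taken from the question: 24 GB on the 4090, 384 GB of DDR4.
GPU_GB = 24
RAM_GB = 384

for params in (33, 70):
    for label, bits in (("fp32", 32), ("fp16", 16), ("4-bit", 4)):
        size = weight_memory_gb(params, bits)
        print(
            f"{params}B @ {label}: ~{size:g} GB | "
            f"fits in 24 GB VRAM: {size <= GPU_GB} | "
            f"fits in 384 GB RAM: {size <= RAM_GB}"
        )
```

Under these assumptions, a 33B model comes to about 132 GB at fp32, 66 GB at fp16, and 16.5 GB at 4-bit, so only the 4-bit version would fit in the 4090's 24 GB. A 70B model at 4-bit is about 35 GB, which would not fit on a single 4090 without offloading.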

    70B

    What is your motivation for such large models? You’re sacrificing a lot of speed for the larger model.