Just wondering if anyone with more knowledge of server hardware could point me in the direction of getting an 8-channel DDR4 server up and running. Estimated memory bandwidth is around 200 GB/s, so I would think it would be plenty for LLM inference.
I would prefer to go with used server hardware due to price. Compared with getting a bunch of P40s for the same amount of memory, the power consumption is drastically lower. I'm just not sure how fast a slightly older server CPU can handle inference.

If I was looking to run 80-120 GB models, would 200 GB/s and dual 24-core CPUs get me 3-5 tokens a second?
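
A rough way to sanity-check those numbers, as a back-of-the-envelope sketch rather than a benchmark. It assumes DDR4-3200 (the post does not state the memory speed, but 8 channels at 3200 MT/s is what lands near 200 GB/s), a 64-bit (8-byte) data bus per channel, and a dense model where every generated token has to read all of the weights once. The function names are made up for illustration.

```python
def peak_bandwidth_gbs(channels, mt_per_s, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    channels  : populated memory channels on one socket
    mt_per_s  : transfer rate in MT/s (3200 for DDR4-3200, an assumption here)
    bus_bytes : bytes moved per transfer per channel (64-bit bus = 8 bytes)
    """
    return channels * mt_per_s * 1e6 * bus_bytes / 1e9


def max_tokens_per_s(bandwidth_gbs, model_gb):
    """Bandwidth-bound ceiling: one full pass over the weights per token."""
    return bandwidth_gbs / model_gb


bw = peak_bandwidth_gbs(channels=8, mt_per_s=3200)
print(f"peak bandwidth per socket: {bw:.1f} GB/s")

for model_gb in (80, 120):
    ceiling = max_tokens_per_s(bw, model_gb)
    print(f"{model_gb} GB model: at most {ceiling:.2f} tokens/s")
```

Under these assumptions a single socket tops out at roughly 1.7 to 2.6 tokens/s for 80-120 GB of weights, and real throughput is normally below the theoretical ceiling. A dual-socket board has a second set of 8 channels, so the aggregate peak doubles on paper, but whether inference software actually gets that depends on how well it handles memory placed across both sockets.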

  • FaustBargain@alien.topB
    1 year ago

    my setup

    EPYC Milan-X 7473X 24-Core 2.8GHz 768MB L3

    512GB of HMAA8GR7AJR4N-XN HYNIX 64GB (1X64GB) 2RX4 PC4-3200AA DDR4-3200MHz ECC RDIMMs

    MZ32-AR0 Rev 3.0 motherboard

    6x 20TB WD Red Pros on ZFS with zstd compression

    SABRENT Gaming SSD Rocket 4 Plus-G with Heatsink 2TB PCIe Gen 4 NVMe M.2 2280

    you can probably get away with a non-X part without really any performance difference. it might make a difference in very tiny models, but that’s not the point of getting such a beastly machine.

    I got the Milan-X because I also use it for CAD, circuit board development, gaming, and video editing, so it’s an all-in-one for me.

    also my electric bill went from $40 a month to $228 a month, but some of that is because I haven’t set up the suspend states yet and the machine isn’t sleeping the way I want it to. I just haven’t gotten around to it. I imagine that would cut the bill in half, and then choosing the right fan manager and governors might save me another $30 a month.

    I can run Falcon 180B unquantized and still have tons of RAM left over.
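
For anyone checking the fit, a quick estimate is sketched below. It assumes roughly 180 billion parameters, that "unquantized" means 16-bit weights (2 bytes per parameter), and it treats the 512GB of RAM as 512 decimal GB for simplicity. KV cache and runtime overhead are ignored, and `weights_gb` is a made-up helper name.

```python
def weights_gb(params_billion, bytes_per_param):
    """Approximate weight size in GB: billions of params times bytes each."""
    return params_billion * bytes_per_param


ram_gb = 512                                                # installed RAM from the post
fp16_gb = weights_gb(params_billion=180, bytes_per_param=2)  # assumed 16-bit weights
headroom_gb = ram_gb - fp16_gb

print(f"16-bit weights: ~{fp16_gb} GB")
print(f"left over for KV cache, OS, etc.: ~{headroom_gb} GB")
```

So the weights alone come to about 360 GB under these assumptions, which is consistent with having plenty of memory to spare on a 512GB machine.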

    • fallingdowndizzyvr@alien.topB
      1 year ago

      "also my electric bill went from $40 a month to $228 a month"

      I take it you live in a low-cost electricity area if your bill was $40 before that. Where I live, people can pay 10 times that even if they just live in an apartment. So in high-cost areas like mine, the electricity savings from something lower-power like a Mac would end up paying for it.

    • Aaaaaaaaaeeeee@alien.topB
      1 year ago

      No way, you’re that one guy I uploaded the f16 airoboros for! I was hoping you’d get the model, and I think you did :)