
  • If you have the RAM, don't worry about disk at all. If you have to drop to any kind of disk, even a Gen 5 SSD, your speeds will tank. Memory bandwidth matters so much more than compute for LLMs, but it all depends on your needs. There are probably cheaper ways to go about this if you just need something occasionally, maybe RunPod or something similar. If you need a lot of inference, then running locally could save you money, but renting a big machine with A100s will always be faster. So will a 7B model do what you need, or do you need the accuracy and comprehension of a 70B or one of the new 120B merges? Also, Llama 3 is supposed to be out in Jan/Feb, and if it's significantly better then everything changes again.
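    To put rough numbers on the bandwidth point: a common back-of-envelope estimate is that generating one token has to stream every weight through memory once, so tokens per second can't exceed bandwidth divided by model size. Here's a minimal sketch of that estimate. The function name and the Gen 5 SSD read speed are my own assumptions for illustration, and the RAM figure is the theoretical peak for 8 channels of DDR4-3200, not a measurement.

```python
def max_tokens_per_second(bandwidth_gb_s: float,
                          params_billion: float,
                          bytes_per_param: float) -> float:
    """Rough upper bound on generation speed for a dense model.

    Assumes every weight is read once per generated token and that
    nothing else (compute, cache misses, KV cache) is the bottleneck.
    """
    model_gb = params_billion * bytes_per_param  # billions of params * bytes each = GB
    return bandwidth_gb_s / model_gb


# Theoretical peak: 8 channels * 25.6 GB/s per DDR4-3200 channel.
RAM_8CH_DDR4_3200 = 8 * 25.6

# Assumed ballpark sequential read for a fast Gen 5 NVMe drive (not measured).
GEN5_SSD = 14.0

if __name__ == "__main__":
    # A 70B model at roughly 4 bits (0.5 bytes) per parameter.
    for name, bw in [("RAM", RAM_8CH_DDR4_3200), ("Gen 5 SSD", GEN5_SSD)]:
        rate = max_tokens_per_second(bw, params_billion=70, bytes_per_param=0.5)
        print(f"{name}: at most ~{rate:.1f} tokens/s")
```

    Real numbers will come in lower than this ceiling, but the ratio between the two lines is the point: the same model is more than ten times slower once the weights have to come off disk.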







  • My setup:

    EPYC Milan-X 7473X 24-Core 2.8GHz 768MB L3

    512GB of HMAA8GR7AJR4N-XN HYNIX 64GB (1X64GB) 2RX4 PC4-3200AA DDR4-3200MHz ECC RDIMMs

    MZ32-AR0 Rev 3.0 motherboard

    6x 20TB WD Red Pros on ZFS with zstd compression

    SABRENT Gaming SSD Rocket 4 Plus-G with Heatsink 2TB PCIe Gen 4 NVMe M.2 2280

    You can probably get away with a non-X part without any real performance difference. The extra cache might make a difference in very tiny models, but that's not the point of getting such a beastly machine.

    I got the Milan-X because I also use it for CAD, circuit board development, gaming, and video editing, so it's an all-in-one for me.

    Also, my electric bill went from $40 a month to $228 a month, but some of that is because I haven't set up the suspend states yet and the machine isn't sleeping the way I want it to. I just haven't gotten around to it. I imagine that would cut the bill in half, and then choosing the right fan manager and governors might save me another $30 a month.

    I can run Falcon 180B unquantized and still have tons of RAM left over.
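
    For anyone sizing a similar box, here's a quick sketch of the arithmetic behind that. It assumes "unquantized" means 16-bit weights (2 bytes per parameter) and only counts the weights, so the KV cache, activations, and the OS all come out of whatever is left. The helper name is made up for the example.

```python
def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate size of the model weights in GB (weights only)."""
    return params_billion * bits_per_param / 8


RAM_GB = 512  # installed RAM in the setup above

if __name__ == "__main__":
    # Assumption: 16-bit is the "unquantized" case; 8-bit and 4-bit are
    # shown only for comparison.
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        size = weights_gb(180, bits)
        print(f"180B at {label}: ~{size:.0f} GB weights, ~{RAM_GB - size:.0f} GB left")
```

    At 16 bits the weights come to about 360 GB, which leaves roughly 150 GB of the 512 GB for everything else.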