Running multiple GPUs requires PCIe lanes. Consumer PCs have too few of them to run even two GPUs at full bandwidth (x16 each).

Threadrippers are prohibitively expensive for many.

AMD announced the EPYC 8004 (Siena) series in September. These low-power server CPUs start at 8 cores for ~$400 and offer 96 PCIe lanes. The catch is that the clock speeds are pretty low.

So, the question is: How bottlenecked are LLMs by CPU clock?

I.e., would it make much of a difference if you ran 4x 3090s on a $2000+ Threadripper vs. a $400 EPYC 8004?
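
To make the lane arithmetic behind the question concrete, here is a rough sketch. The 96-lane figure is the one quoted above; the 24-lane consumer figure and the helper names are assumptions for illustration only, since actual usable lane counts vary by platform and motherboard.

```python
# Rough lane-budget check. Helper names are hypothetical.

def lanes_needed(num_gpus: int, lanes_per_gpu: int = 16) -> int:
    """Lanes required to give every GPU a link of the given width."""
    return num_gpus * lanes_per_gpu

def fits(num_gpus: int, cpu_lanes: int, lanes_per_gpu: int = 16) -> bool:
    """True if the CPU exposes enough lanes for all GPUs at that width."""
    return lanes_needed(num_gpus, lanes_per_gpu) <= cpu_lanes

EPYC_8004_LANES = 96   # figure quoted in the post
CONSUMER_LANES = 24    # ASSUMPTION: illustrative consumer-platform figure

print(lanes_needed(4))                    # 64
print(fits(4, EPYC_8004_LANES))           # True
print(EPYC_8004_LANES - lanes_needed(4))  # 32
print(fits(2, CONSUMER_LANES))            # False
```

This only counts lanes; it says nothing about how much of that bandwidth LLM inference actually uses, which is the open question here.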

  • 0xd00d@alien.topB
    1 year ago

    I would imagine that the new option you're talking about will be a good budget inference workhorse when paired with multiple cards such as 3090s. 96 lanes of Gen 5 will be a real enabler. That said, I think Zen 2 EPYCs providing Gen 4 lanes are cheaper still, so there are good options available.

    • _Erilaz@alien.topB
      1 year ago

      The 3090 doesn't support PCIe 5.0, only 4.0.

      The 4090 does, and it makes some sense to use them in x8 5.0 configuration, but only if you have a pallet of these GPUs.
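
The point about generations and link widths can be sketched as follows: a PCIe link runs at the lower generation and the narrower width that both ends support. The per-lane figures are approximate per-direction rates after 128b/130b encoding, ignoring protocol overhead; the function name is hypothetical.

```python
# Approximate usable throughput per lane, per direction, in GB/s.
GB_PER_LANE = {3: 0.985, 4: 1.969, 5: 3.938}

def link_bandwidth(device_gen, slot_gen, device_width, slot_width):
    """Return (gen, width, GB/s) of the link the two ends would negotiate."""
    gen = min(device_gen, slot_gen)        # slower side sets the generation
    width = min(device_width, slot_width)  # narrower side sets the width
    return gen, width, GB_PER_LANE[gen] * width

# A Gen 4 x16 card in a Gen 5 x16 slot still runs at Gen 4 x16.
print(link_bandwidth(4, 5, 16, 16)[:2])          # (4, 16)

# A Gen 5 device on a Gen 5 x8 link gets about the same bandwidth
# as a Gen 4 device on a Gen 4 x16 link.
print(round(link_bandwidth(5, 5, 16, 8)[2], 1))  # 31.5
print(round(link_bandwidth(4, 4, 16, 16)[2], 1)) # 31.5
```

So halving the width only comes for free when both the card and the slot support the newer generation.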