How bottlenecked are LLMs by CPU clock? (Budget options to host multiple GPUs)

Infinite100p@alien.top · 2 years ago

0xd00d@alien.top · 2 years ago

I’d imagine the new option you’re talking about would make a good budget inference workhorse paired with multiple cards such as 3090s. 96 lanes of PCIe Gen 5 will be a real enabler. That said, I think Zen 2 EPYCs with Gen 4 lanes are cheaper still, so there are good options available.

_Erilaz@alien.top · 2 years ago

The 3090 doesn’t support PCIe 5.0, only 4.0.

The 4090 is a PCIe 4.0 card as well, so Gen 5 lanes only pay off with GPUs that actually support 5.0. Running those in an x8 5.0 configuration makes some sense, but only if you have a pallet of them.

JustOneAvailableName@alien.top · 2 years ago

I don’t even think lanes really matter when you’re not training.

ThisGonBHard@alien.top · 2 years ago

Pretty much not at all. The main bottleneck is memory speed.

I barely see a difference between 4 and 12 cores on a 5900X when running on CPU.

When running multi-GPU, the PCIe lanes are the biggest bottleneck.

On a single GPU, the CPU does not matter.
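To put a rough number on "the main bottleneck is memory speed": during token generation, roughly all of the model weights have to be read once per token, so memory bandwidth divided by model size gives a ceiling on tokens per second. The sketch below is a back-of-the-envelope estimate only. The 20 GB model size is a hypothetical example, and the bandwidth figures are nominal spec values (dual-channel DDR4-3200 for an AM4 desktop, 936 GB/s for a 3090), not measurements.

```python
GB = 1e9  # decimal gigabytes, as used in bandwidth specs


def max_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode speed, assuming every weight is read once per token.

    Ignores compute time, caches, and KV-cache traffic, so real numbers are lower.
    """
    return bandwidth_bytes_per_s / model_bytes


# Assumption: a hypothetical 20 GB quantized model.
model_size = 20 * GB

# Assumption: nominal peak bandwidths, not measured values.
dual_channel_ddr4_3200 = 51.2 * GB
rtx_3090_vram = 936 * GB

print(f"CPU (dual-channel DDR4-3200): {max_tokens_per_second(model_size, dual_channel_ddr4_3200):.1f} tok/s ceiling")
print(f"RTX 3090 VRAM:                {max_tokens_per_second(model_size, rtx_3090_vram):.1f} tok/s ceiling")
```

Adding cores changes neither number, which fits with seeing barely any difference between 4 and 12 cores.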

_Erilaz@alien.top · 2 years ago

The EPYC 8004 has six DDR5 channels, afaik. That takes care of the memory bandwidth. The only issue would be an SP6 motherboard.
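For a sense of what six channels buys you, theoretical peak memory bandwidth is channels × transfer rate × 8 bytes (each channel is 64 bits wide). A minimal sketch, assuming DDR5-4800 on the six-channel platform and DDR4-3200 on a dual-channel desktop for comparison. Both speeds are assumptions for illustration, and sustained bandwidth is always below the peak.

```python
BYTES_PER_TRANSFER = 8  # one 64-bit channel moves 8 bytes per transfer


def peak_memory_bandwidth_gb_s(channels: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal)."""
    return channels * mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER / 1e9


# Assumption: DDR5-4800 populated in all six channels.
six_channel_ddr5 = peak_memory_bandwidth_gb_s(channels=6, mega_transfers_per_s=4800)

# Assumption: a typical dual-channel DDR4-3200 desktop for comparison.
dual_channel_ddr4 = peak_memory_bandwidth_gb_s(channels=2, mega_transfers_per_s=3200)

print(f"6 x DDR5-4800: {six_channel_ddr5:.1f} GB/s")
print(f"2 x DDR4-3200: {dual_channel_ddr4:.1f} GB/s")
print(f"Ratio: {six_channel_ddr5 / dual_channel_ddr4:.1f}x")
```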

Worldly-Mistake-8147@alien.top · 2 years ago

Holy… 4x3090! No wonder it was hard to find my third one for a reasonable price.

Imaginary_Bench_7294@alien.top · 2 years ago

So that really depends. You’re talking about running a multi-GPU setup. If the whole model fits in GPU memory, your processor will not be a bottleneck at all. The clock speed of the PCIe bus is independent of the CPU cores, unless you’re messing with overclocking. That’s why they advertise PCIe 3.0, 4.0, 5.0, etc.: the PCIe version dictates the bandwidth per lane.
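As a rough illustration of how the PCIe version sets per-lane bandwidth: Gen 3, 4, and 5 run at 8, 16, and 32 GT/s per lane with 128b/130b encoding, so each generation doubles the previous one. The sketch below computes raw one-direction link bandwidth before protocol overhead. The function names are just for this example.

```python
# Per-lane signalling rate in GT/s for each PCIe generation.
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0, 5: 32.0}

# Gen 3 and later use 128b/130b encoding: 128 payload bits per 130 bits on the wire.
ENCODING_EFFICIENCY = 128 / 130


def lane_bandwidth_gb_s(gen: int) -> float:
    """Raw one-direction bandwidth of a single lane in GB/s."""
    return TRANSFER_RATE_GT_S[gen] * ENCODING_EFFICIENCY / 8  # bits -> bytes


def link_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Raw one-direction bandwidth of a link with the given lane count."""
    return lane_bandwidth_gb_s(gen) * lanes


for gen in (3, 4, 5):
    print(f"PCIe {gen}.0: {lane_bandwidth_gb_s(gen):.2f} GB/s per lane, "
          f"{link_bandwidth_gb_s(gen, 16):.1f} GB/s at x16")
```

One consequence: an x8 link at Gen 5 carries the same raw bandwidth as an x16 link at Gen 4, but only if both the slot and the card actually negotiate Gen 5.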

That being said, multi-GPU setups do introduce some overhead. If a model is split between GPUs, the PCIe interface becomes a modest bottleneck as they pass data back and forth. The more GPUs the model is split across, the greater the bottleneck.
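A back-of-the-envelope sketch of that overhead, under loud assumptions: the model is split layer-wise (each GPU holds a contiguous block of layers), activations are fp16, and only one hidden-state vector crosses each GPU boundary per generated token. The hidden size of 8192 and the roughly 31.5 GB/s link (about PCIe 4.0 x16) are illustrative values, and the estimate ignores per-transfer latency and synchronization, which may matter more than raw bandwidth in practice.

```python
def per_token_transfer(hidden_size: int, n_gpus: int,
                       bytes_per_value: int = 2,
                       link_gb_s: float = 31.5) -> tuple[int, float]:
    """Bytes moved over PCIe per generated token, and the ideal transfer time.

    Assumes a layer-wise split: one hidden-state vector crosses each of the
    (n_gpus - 1) boundaries between neighbouring GPUs.
    """
    boundaries = n_gpus - 1
    bytes_moved = hidden_size * bytes_per_value * boundaries
    seconds = bytes_moved / (link_gb_s * 1e9)
    return bytes_moved, seconds


# Assumption: hidden size 8192, fp16 activations, hypothetical 2 to 4 GPU rigs.
for gpus in (2, 3, 4):
    moved, secs = per_token_transfer(hidden_size=8192, n_gpus=gpus)
    print(f"{gpus} GPUs: {moved / 1024:.0f} KiB per token, "
          f"{secs * 1e6:.2f} microseconds at full link speed")
```

The traffic grows with the number of boundaries, which is why splitting across more GPUs costs more. Other splitting strategies, tensor parallelism for example, move far more data per token and would not be covered by this estimate.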