tl;dr: I’m considering building a budget machine for tinkering with LLMs, but I’m not sure whether this is a good idea or how to go about it.
For context: I work in a university department. I currently have access to a 2080 Ti on a shared machine, and we’re in the process of acquiring a small server with 2 L40 cards. So for any larger experiments, I will be able to use this shared machine.
However, I think I would like to have my own small machine for tinkering: trying different models and techniques, just playing around, and preparing larger experiments to be run on the server. My focus is on teaching and education, not on state-of-the-art research.
Aiming for a good amount of VRAM, the 4060 Ti 16GB seems like the most obvious choice; I also like its low power requirements (for both energy and cooling). But this card seems to have a poor reputation overall. I’m also not sure where the current sweet spot for CPU and memory is; I completely lost track of Intel’s and AMD’s generations over the last few years.
Some additional comments regarding common opinions:
- I simply like having my own hardware, and cloud services seem to be more expensive in the long run.
- There is not really a good market for used GPUs where I’m located (Singapore), so the common suggestion “go with a used 3090” doesn’t really work.
Any good suggestions, or am I naive with my idea of a budget machine? Thanks a lot!
You can absolutely do interesting and useful things with very little hardware by running quantized models, especially if you don’t mind slow inference. My preferred quantization is q4_K_M (with GGUF and llama.cpp).
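As a minimal sketch of what that looks like in practice (assuming you use the llama-cpp-python bindings rather than the llama.cpp CLI, and you’ve already downloaded a q4_K_M GGUF file; the model path below is just a placeholder):

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Placeholder path to a q4_K_M-quantized 7B model in GGUF format.
llm = Llama(
    model_path="./models/some-7b-instruct.Q4_K_M.gguf",
    n_ctx=2048,      # context window
    n_threads=8,     # CPU threads; tune to your machine
    n_gpu_layers=0,  # 0 = pure CPU inference; raise this if you add a GPU
)

out = llm("Explain GGUF quantization in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])
```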
I started with a spare Lenovo T560 ThinkPad with 8GB of RAM, which handled 7B models no problem. That’s a $120 eBay purchase. Once I was hooked, I shifted to one of the Dell T7910s in my homelab and moved up to larger models.
I’m still not using a GPU for anything. It’s been CPU inference, which is slow but otherwise great.
You could get just about any $300 desktop, put a decent GPU in it, and enjoy fast inference: 16GB of VRAM allows fast inference with 13B models, and 24GB should handle a heavily quantized 30B. The most expensive bit is the GPU.
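If you want a rough sanity check on those VRAM numbers, here’s a back-of-the-envelope sketch; the ~4.5 bits per weight for q4_K_M and the flat 2GB allowance for KV cache and runtime buffers are my own rough assumptions, not exact figures:

```python
def fits_in_vram(params_billion: float, bits_per_weight: float = 4.5,
                 overhead_gb: float = 2.0, vram_gb: float = 16.0) -> bool:
    """Very rough estimate: quantized weights plus a flat allowance for
    KV cache and runtime buffers. Not a precise calculation."""
    weights_gb = params_billion * bits_per_weight / 8  # billions of bytes ~ GB
    return weights_gb + overhead_gb <= vram_gb

# 13B at ~4.5 bits/weight is ~7.3 GB of weights -> fits comfortably in 16GB
print(fits_in_vram(13, vram_gb=16))  # True
# 33B at ~4.5 bits/weight is ~18.6 GB -> fits in 24GB, but with little headroom
print(fits_in_vram(33, vram_gb=24))  # True
```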
See this sub’s wiki for more detailed hardware tips.