Thank you! I’m reminded of the variable bit rate encoding used in various audio and video formats; this sounds not dissimilar.
Mein Gott, do you ever sleep? Top work, sir. Thank you for all your efforts!
A few folks mentioning EXL2 here. Is this now the preferred Exllama format over GPTQ?
But how easily does it work with ooba nowadays? How about running two?
Any chance P40s can benefit from this through llama.cpp?
No chance of running this on P40s any time soon?
The issue is that a lot of them have either Intel CPUs with onboard graphics or AMD CPUs… with onboard graphics. Mini PCs with Nvidia GPUs are uncommon.
Zotac did some small PCs with Nvidia GPUs, I think, but I doubt any of them have much VRAM.
If you pair a mini PC with Thunderbolt and connect it to an eGPU, that could be a setup that would work…
Tell me I’m going to need another GPU without telling me I’m going to need another GPU… Eeek.
Check the BIOS for Above 4G Decoding and Resizable BAR.
Any system you want to put multiple Tesla cards in needs to support Above 4G Decoding and Resizable BAR. Does that Dell system have either of these features in the BIOS?
Very useful!
32GB AMD Instinct cards for $500 would be a very compelling option… if the software stack weren’t still such a ballache to get working, and if they were more common on the used market.
Quadro M6000 24GB cards also seem semi-common and fairly decently priced if one hunts carefully… But how well do they perform?