OK, I know this gets asked here a lot, but the last time I checked this sub was around when llama.cpp first came out, and I assume a lot has changed or improved since then. I hear models like Mistral may even be changing the landscape. What is currently the best roleplay and storytelling LLM that can run on my PC with 32 GB of RAM and an 8 GB VRAM card (or both, since I've also heard about the layered hybrid CPU/GPU approach)? Generally, what would you recommend for these specs?

Thanks in advance to this amazing community for improving the open-source LLM ecosystem.

  • zware@alien.top · 1 year ago

    If you want speed, use Mistral-7B-OpenOrca-GPTQ with ExLlamaV2; that should give you around 40-45 tokens per second. If you'd rather trade speed for quality, go with TheBloke/Xwin-MLewd-13B-v0.2-GGUF under llama.cpp, offloading as many layers to the GPU as your 8 GB of VRAM allows.
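
    For the GGUF route, here's a minimal llama-cpp-python sketch of that hybrid CPU/GPU split. The model filename and the layer count are assumptions on my end; adjust n_gpu_layers up or down until the model just fits in your 8 GB of VRAM.

    ```python
    # Minimal sketch using llama-cpp-python (install with CUDA/cuBLAS support enabled).
    # The file name and n_gpu_layers value are assumptions; tune them for your setup.
    from llama_cpp import Llama

    llm = Llama(
        model_path="xwin-mlewd-13b-v0.2.Q4_K_M.gguf",  # assumed local GGUF file from TheBloke's repo
        n_ctx=4096,        # context window
        n_gpu_layers=35,   # layers offloaded to the GPU; the rest stay in system RAM
    )

    out = llm(
        "Write the opening scene of a fantasy story about a wandering cartographer.",
        max_tokens=256,
        temperature=0.8,
    )
    print(out["choices"][0]["text"])
    ```

    The same split works from the llama.cpp CLI via its GPU-layers option; the Python wrapper just makes it easy to drop into SillyTavern-style frontends or your own scripts.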