First time testing a local text model, so I don't know much yet. I've seen people with 8 GB cards complaining that text generation is very slow, so I don't have much hope, but still... I think I need to do some configuration: when generating text my SSD is pegged at 100%, reading 1-2 GB/s, while my GPU doesn't reach 15% usage.
Using RTX 2060 6GB, 16GB RAM.
This is the model I am testing (mythomax-l2-13b.Q8_0.gguf): https://huggingface.co/TheBloke/MythoMax-L2-13B-GGUF/tree/main
Your issue is using Q8_0. Be real, you only have 6 GB of VRAM, not 24.
Your hardware can't run Q8_0 at a decent speed.
Use Q4_K_S; you can offload many more layers to the GPU. There's some quality degradation, yes, but it's not that bad.
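A rough memory check explains why your SSD is pegged (file size is approximate, from the repo listing; the RAM headroom number is my assumption): the Q8_0 file alone is bigger than your 6 GB of VRAM, and with OS plus KV cache it doesn't comfortably fit in 16 GB of RAM either, so the mmap'd weights keep getting re-read from disk on every token:

```python
# Rough memory math, why Q8_0 of a 13B model thrashes the SSD on this box.
# All numbers are approximate assumptions, not measured values.
q8_file_gb = 13.8   # mythomax-l2-13b.Q8_0.gguf, approx size from the repo
vram_gb = 6.0       # RTX 2060
ram_gb = 16.0

# Assume ~4 GB of RAM is already taken by the OS, desktop, and KV cache.
fits_in_vram = q8_file_gb < vram_gb
fits_in_ram_with_headroom = q8_file_gb + 4.0 < ram_gb

print(fits_in_vram, fits_in_ram_with_headroom)  # False False
```

Since the weights fit in neither VRAM nor free RAM, the OS keeps evicting and re-reading pages of the model file, which is exactly the 100% SSD / 15% GPU pattern you're seeing.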
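In llama.cpp-based runners, offload is controlled by the `-ngl` / `--n-gpu-layers` option. A rough sketch of how many layers of the Q4_K_S file might fit on a 2060 (the file size is approximate from the repo, and the VRAM reserve for KV cache and CUDA context is an assumption, so treat the result as a starting point to tune):

```python
# Rough estimate of how many layers of a 13B Q4_K_S GGUF fit in 6 GB of VRAM.
# File size and reserve are assumptions; tune -ngl up/down from this number.
q4ks_file_gb = 7.4   # mythomax-l2-13b.Q4_K_S.gguf, approx size from the repo
n_layers = 40        # a Llama-2 13B model has 40 transformer layers
vram_gb = 6.0        # RTX 2060
reserve_gb = 1.5     # assumed headroom for KV cache, CUDA context, desktop

per_layer_gb = q4ks_file_gb / n_layers
offloadable = min(int((vram_gb - reserve_gb) / per_layer_gb), n_layers)
print(offloadable)  # 24
```

So something like `-ngl 24` is a reasonable first try; raise it until you hit an out-of-memory error, then back off a layer or two.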
Yes, thanks for letting me know