This is my first time testing a local text model, so I don't know much yet. I've seen people with 8GB cards complaining that text generation is very slow, so I don't have much hope, but still… I think I need to change some configuration: when generating text, my SSD sits at 100% usage, reading at 1-2 GB/s, while my GPU stays below 15% usage.
Using RTX 2060 6GB, 16GB RAM.
This is the model I am testing (mythomax-l2-13b.Q8_0.gguf): https://huggingface.co/TheBloke/MythoMax-L2-13B-GGUF/tree/main
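
A rough back-of-the-envelope check may explain the constant SSD reads. This is only a sketch: it assumes about 13 billion weights for a 13B model, 8.5 bits per weight for Q8_0 and 5.5 for Q5_0 (used here as one example of a smaller quant), and a guessed 4 GB of RAM taken by the OS and other programs. If the file is larger than the free RAM, it plausibly gets re-read from disk on every token.

```python
# Rough estimate of quantized model size versus available memory.
# All constants below are assumptions for illustration, not measurements.

PARAMS = 13.0e9          # assumed weight count for a 13B model
RAM_GB = 16.0            # system RAM from the post
RAM_RESERVED_GB = 4.0    # guessed RAM used by the OS and other programs

# Bits per weight, including the per-block scale stored with each block.
BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q5_0": 5.5}


def model_size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate file size in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def fits_in_ram(size_gb: float) -> bool:
    """True if the whole model fits in the RAM assumed to be free."""
    return size_gb <= RAM_GB - RAM_RESERVED_GB


if __name__ == "__main__":
    for name, bpw in BITS_PER_WEIGHT.items():
        size = model_size_gb(PARAMS, bpw)
        verdict = "fits in RAM" if fits_in_ram(size) else "likely paged from disk"
        print(f"{name}: ~{size:.1f} GB, {verdict}")
```

Under these assumptions the Q8_0 file comes out at roughly 13.8 GB, more than the RAM left over on a 16GB machine, while a Q5_0 file is roughly 8.9 GB and fits. Neither fits entirely in 6GB of VRAM, so only part of the model can run on the GPU either way.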

  • OverallBit9@alien.topOPB
    11 months ago

    After testing, Q5 seems like the best option, at least for the GPU I use, but I've only tried it with MythoMax, so I'm not sure whether other models would behave the same.