First time testing a local text model, so I don’t know much yet. I’ve seen people with 8GB cards complaining that text generation is very slow, so I don’t have much hope, but still… I think I need to do some configuration: when generating text my SSD is at 100%, reading 1–2 GB/s, while my GPU doesn’t reach 15% usage.
Using RTX 2060 6GB, 16GB RAM.
This is the model I am testing (mythomax-l2-13b.Q8_0.gguf): https://huggingface.co/TheBloke/MythoMax-L2-13B-GGUF/tree/main
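The symptom (SSD pegged, GPU nearly idle) usually means the model doesn’t fit in memory and is being streamed from disk. A rough sanity check, using assumed figures (the ~8.5 bits-per-weight for Q8_0 is an approximation, not a measured value):

```python
# Back-of-envelope memory check (assumed figures, not measured):
# Q8_0 stores roughly 8.5 bits per weight (8 bits + per-block scale
# overhead), so a 13B-parameter model is about 13e9 * 8.5 / 8 bytes.
params = 13e9
bits_per_weight_q8_0 = 8.5
model_gb = params * bits_per_weight_q8_0 / 8 / 1e9

vram_gb = 6      # RTX 2060
ram_gb = 16      # system RAM, shared with the OS and other apps

print(f"model file: ~{model_gb:.1f} GB")
print(f"fits in VRAM? {model_gb <= vram_gb}")
print(f"fits in RAM?  {model_gb <= ram_gb - 4}")  # assume ~4 GB reserved for OS
```

Neither fits, so a llama.cpp-style loader falls back to memory-mapping the file and re-reading weights from the SSD on every token, which matches the 100% disk usage. A smaller quant of the same model (e.g. a Q4 variant at roughly half the size) plus offloading some layers to the GPU should change the picture, though the exact gain is hardware-dependent.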

  • OverallBit9@alien.topOPB · 11 months ago

    In my tests Q4 gives me about the same amount of tokens as Q5, so I decided to use Q5. It’s my first time testing text gen locally with models; thank you very much for explaining, I am getting used to it now and understanding what the settings do.
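Similar speed between Q4 and Q5 makes sense once both quants fit in RAM: generation becomes memory-bandwidth bound rather than disk bound. A quick size comparison, with bits-per-weight figures that are rough assumptions for common GGUF k-quants (not exact spec values):

```python
# Approximate file sizes for a 13B model at different GGUF quant levels.
# Bits-per-weight values below are rough assumptions, not exact.
quants = {"Q4_K_M": 4.85, "Q5_K_M": 5.69, "Q8_0": 8.5}
params = 13e9
for name, bpw in quants.items():
    gb = params * bpw / 8 / 1e9
    print(f"{name}: ~{gb:.1f} GB")
```

On a 16 GB machine the Q4 and Q5 files both fit in RAM (unlike Q8_0), so the SSD stops being the bottleneck and their token rates end up close; Q5 then trades a little size for slightly better quality.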

    • Civil_Ranger4687@alien.topB · 11 months ago

      Yeah, there’s so much to learn; I’m still figuring a lot out too.

      Good tip for settings: Play around mostly with temperature, top-p, and min-p.
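Those three knobs can be sketched in a few lines. This is a toy illustration of the general idea, not llama.cpp’s actual sampler code:

```python
import math
import random

def sample(logits, temperature=0.8, top_p=0.95, min_p=0.05, seed=None):
    """Toy sampler showing how temperature, min-p, and top-p interact."""
    # 1. Temperature: divide logits; lower -> sharper, higher -> flatter.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # 2. min-p: drop tokens whose probability is below min_p * (top prob).
    cutoff = min_p * max(probs)
    candidates = [(i, p) for i, p in enumerate(probs) if p >= cutoff]

    # 3. top-p (nucleus): keep the smallest set whose cumulative prob >= top_p.
    candidates.sort(key=lambda ip: ip[1], reverse=True)
    kept, cum = [], 0.0
    for i, p in candidates:
        kept.append((i, p))
        cum += p
        if cum >= top_p:
            break

    # Renormalise the survivors and draw one token index.
    total = sum(p for _, p in kept)
    rng = random.Random(seed)
    r, acc = rng.random() * total, 0.0
    for i, p in kept:
        acc += p
        if acc >= r:
            return i
    return kept[-1][0]

# With a very low temperature, the largest logit dominates:
print(sample([2.0, 1.0, 0.1], temperature=0.1, seed=0))  # prints 0
```

Raising the temperature flattens the distribution so lower-ranked tokens get picked more often, while top-p and min-p both trim the unlikely tail, just from different directions (cumulative mass vs. fraction of the top probability).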