I am talking about this particular model:
https://huggingface.co/TheBloke/goliath-120b-GGUF
I specifically use: goliath-120b.Q4_K_M.gguf
I can run it on runpod.io on this A100 instance at a “humane” speed, but it is way too slow for creating long-form text.
These are my settings in text-generation-webui:
Any advice? Thanks
Or open the UI, go to the model page, right-click the layers slider -> Inspect Element,
and change the max value of the input field from 128 to 256.
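If you'd rather not click through the inspector, the same tweak can be sketched as a small helper. This is only an illustration: `raiseSliderMax` and `RangeLike` are made-up names, and matching inputs by their current max of 128 is an assumption about how the page is built, not something taken from the text-generation-webui source.

```typescript
// Minimal structural type: anything with string `max` and `value` fields.
// A browser HTMLInputElement fits this shape.
interface RangeLike {
  max: string;
  value: string;
}

// Hypothetical helper: raise the upper bound of every input whose current
// max equals `oldMax`. Returns how many inputs were changed.
// It only changes the allowed range, not the currently selected value.
function raiseSliderMax(inputs: RangeLike[], oldMax: string, newMax: string): number {
  let changed = 0;
  for (const input of inputs) {
    if (input.max === oldMax) {
      input.max = newMax;
      changed += 1;
    }
  }
  return changed;
}
```

The browser console takes plain JavaScript, so drop the type annotations before pasting, then call `raiseSliderMax(Array.from(document.querySelectorAll("input")), "128", "256")`. Note this touches every input on the page whose max is 128, not just the layers one. You still have to move the slider up yourself and reload the model afterwards.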
Can't believe that worked lol! Thank you so much. The speed increased significantly!
I mean, it makes sense. The values were simply chosen for being a reasonable window at the time.
There was nothing hard-coded about them; they were simply a range of values that had been set for the UI.
It certainly is interesting though.