Slightly off-topic: I’ve been testing 13b and 7b models for a while now, and I’d really like to hear if people have a good one to check out. At least for now, I’ve settled on a 7b model that seems to work better than most of the 13b models I’ve tried.
Specifically, I’ve been using OpenChat 3.5 7b (Q8 and Q4). It’s been really good for my work so far and punches well above its weight class; it’s much better than any of the 13b models I’ve tried. (I’m not running any specific tests, it just seems to understand what I want better than the others. I’m not doing any function calling, but even the 4-bit 7b model can generate JSON as well as respond coherently.)
Note: I’m specifically using the original (non-16k) models; the 16k models seem to be borked or something?
Link: https://huggingface.co/TheBloke/openchat_3.5-GGUF
I agree, it’s my favourite 7b model too. I use it mainly to help me with bot personalities. It’s too bad it’s not really fine-tuned for roleplay, otherwise it would be wrecking. And yes, 16k is broken for me too.
In general, I think it would be nice if people tried merging several Mistral models more often, as was done with Mistral-11B-CC-Air-RP. Yes, it has serious problems understanding the context and the characters go into psychosis, but using a small quantization (like Q5 or Q6) together with the Min P parameter improves the situation a bit. Apparently something just went wrong during the model merge. Otherwise, this is really the most unique model I’ve tried. Characters talk similarly to early Character AI.
https://huggingface.co/TheBloke/Mistral-11B-CC-Air-RP-GGUF/tree/main?not-for-all-audiences=true
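For anyone wondering what the P setting mentioned above does: assuming it refers to the Min P sampler, the idea is to drop every token whose probability falls below some fraction of the top token's probability, then renormalize what is left. Here is a rough standalone sketch in plain Python. The function name `min_p_filter` and the default of 0.05 are made up for illustration and are not taken from any particular backend.

```python
import math


def min_p_filter(logits, min_p=0.05):
    """Return renormalized token probabilities after Min P filtering.

    Hypothetical helper for illustration only; real backends implement
    this inside their sampler chain.
    """
    # Numerically stable softmax over the raw logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep only tokens at least min_p times as likely as the top token.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


if __name__ == "__main__":
    # The third token is far less likely than the top one, so it is dropped.
    print(min_p_filter([2.0, 1.0, -3.0], min_p=0.1))
```

A higher value prunes more of the unlikely tokens, which may be why it helps with a merge that tends to go off the rails.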