dolphin-2.2-yi-34b released

Amgadoz@alien.top · 2 years ago

dolphin-2.2-yi-34b released

FullOf_Bad_Ideas@alien.top · 2 years ago

I am getting nice results in webui using exllama 2 loader and llama 2 prompt. Problem is that webui gives me 21 t/s while when using chat.py from exllama directly I get 28.5 t/s. The difference is too big to make me use webui. I tried matching sampler settings, bos, system prompt and repetition penalty but it still has issues there - it either mixes up the prompt, for example outputting <>, prints out a whole-ass comment section to a story, outputs 30 links to YT out of nowhere and generally still acts a bit like a base model. I can’t really blame exllama v2, because my lora works more predictably. I also can’t blame spicyboros, because it works great in webui. It looks the same with raw, llama and chatml prompt formats. It’s not a big deal since it’s still usable, but it bugs me a bit.