• a_beautiful_rhind@alien.top · 1 year ago

    For Spicyboros I use the same prompt format as Airoboros 3.1, which I think is Llama 2 Chat. I have Alpaca set in the Telegram bot and nothing bad has happened.

    On larger, better models the prompt format isn't really that critical. If you see the model giving you code or extra stuff, try another format until it does what it's supposed to.
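
    For reference, here's a minimal sketch of the two prompt formats in question, assuming the standard published templates (exact whitespace and BOS handling vary between implementations):

    ```python
    def llama2_chat_prompt(system: str, user: str) -> str:
        # Llama 2 Chat wraps the system prompt in <<SYS>> tags inside the
        # first [INST] block; the <s> BOS token is often added by the tokenizer.
        return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

    def alpaca_prompt(instruction: str) -> str:
        # Alpaca uses plain-text section headers instead of special tokens.
        return (
            "Below is an instruction that describes a task. "
            "Write a response that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n### Response:\n"
        )

    if __name__ == "__main__":
        print(llama2_chat_prompt("You are a helpful assistant.", "Write a haiku about llamas."))
        print(alpaca_prompt("Write a haiku about llamas."))
    ```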

    • FullOf_Bad_Ideas@alien.top · 1 year ago

      I'm getting nice results in webui using the ExLlamaV2 loader and the Llama 2 prompt. The problem is that webui gives me 21 t/s, while running chat.py from ExLlamaV2 directly gets me 28.5 t/s. The difference is too big for me to keep using webui. I tried matching the sampler settings, BOS token, system prompt, and repetition penalty, but chat.py still has issues: it mixes up the prompt (for example, outputting <>), prints out a whole-ass comment section to a story, or outputs 30 YouTube links out of nowhere, and generally still acts a bit like a base model. I can't really blame ExLlamaV2, because my LoRA works more predictably, and I can't blame Spicyboros, because it works great in webui. It looks the same with the raw, llama, and chatml prompt formats. It's not a big deal since it's still usable, but it bugs me a bit.
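
      For anyone trying to reproduce the comparison, a rough sketch of how a t/s figure like this can be measured; `generate` here is a hypothetical stand-in for either backend's generation call, not the actual webui or ExLlamaV2 API, and the sampler values are placeholders:

      ```python
      import time

      def tokens_per_second(generate, prompt: str, max_new_tokens: int = 200) -> float:
          # `generate` is a hypothetical wrapper around whichever backend you
          # are benchmarking; it should return the number of newly generated tokens.
          start = time.perf_counter()
          new_tokens = generate(prompt, max_new_tokens)
          elapsed = time.perf_counter() - start
          return new_tokens / elapsed

      # For the two numbers to be comparable, both backends also need identical
      # sampler settings; these values are illustrative, not the ones I used.
      sampler_settings = {
          "temperature": 0.7,
          "top_p": 0.9,
          "repetition_penalty": 1.1,
          "add_bos_token": True,  # whether the prompt is prefixed with <s>
      }
      ```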