• sebo3d@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    1 year ago

    In all honestly…i don’t know. I’ve used Turbo for role playing purposes A LOT and to me the model just seems to…get things better than most others and by that i mostly mean in terms of instructing it to behave a certain way. If i told it to generate 150 words, it generated 150(or close to that amount words). If i told him to avoid generating something specific, it avoided generating that thing(For example when i told Turbo to avoid roleplaying from user’s point of view, and it did just that while lower parameter models seem to ignore that). This is a behavior usually noticeable only in higher parameter models as lower parameter models seems to be visibly struggling with following very specific instructions, so that’s why i have a hard time believing that turbo is only 20B. It MIGHT be the dataset quality issue preventing lower parameter models from following more specific and complex instructions, but what Turbo displayed in my experience doesn’t scream “low” parameter model at all.

    • LiquidGunay@alien.topB
      link
      fedilink
      English
      arrow-up
      1
      ·
      1 year ago

      What has your experience with mistral been? Because going from llama 13B finetunes to mistral 7B, I found that it was remarkably better at following instructions (Prompt engineering finally felt like it was not just guessing and checking). Considering it is just a 7B, a 20B might be that good (It could also just be a MoE of 20B models)

      • sebo3d@alien.topB
        link
        fedilink
        English
        arrow-up
        1
        ·
        1 year ago

        I only really use Mistral Claude and Collective cognition but from perspective of a role player who uses LLMs mostly for just that my overall experience with Mistral(finetunes) has been mostly positive. 7B’s speed is undeniable, so this is a very major benefit it has over 13Bs and for a 7B it’s prose is excellent as well. What i also noticed about mistral models is that unlike 13Bs such as mythomax or remm-slerp they tend to pay closer attention to character cards as well as your own user description and will more commonly mention things stated in the said description.(For example my user description in SillyTavern had a note saying that my persona is commonly stalked by ghosts and model actually made a little joke about it saying “how are your ghostly friends are doing these days” which is something that NO 13B i used has done before) Still though 7B IS just 7B so model tends to hallucinate quite a bit, constantly tries to change the formatting of the roleplay and tends to roleplay as you unless you REALLY finetune the settings to borderline perfection so i have to swipe and/or edit responses quite a bit.