I use q4_K_M in both cases.

  • out_of_touch@alien.topB · 1 year ago

    I’m curious what results you’re seeing from the Yi models. I’ve been playing around with LoneStriker_Nous-Capybara-34B-5.0bpw-h6-exl2 and more recently LoneStriker_Capybara-Tess-Yi-34B-200K-DARE-Ties-5.0bpw-h6-exl2 and I’m finding them fairly good with the right settings. I found the Yi 34B models almost unusable due to repetition issues until I tried settings recommended in this discussion:

    https://www.reddit.com/r/LocalLLaMA/comments/182iuj4/yi34b_models_repetition_issues/

    I’ve found it much better since.
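
    For anyone curious, here is a minimal sketch of what tuning those repetition-related samplers can look like with llama-cpp-python (the model path and values are placeholders rather than the exact settings from that thread, and min_p needs a reasonably recent version of the library):

    ```python
    from llama_cpp import Llama

    # Load a local GGUF quant (placeholder path).
    llm = Llama(model_path="capybara-tess-yi-34b.Q4_K_M.gguf", n_ctx=4096)

    # Illustrative anti-repetition settings; tune them per the linked discussion.
    out = llm(
        "USER: Summarize the plot of Hamlet.\nASSISTANT:",
        max_tokens=256,
        temperature=0.7,
        min_p=0.05,          # trim the long low-probability tail
        repeat_penalty=1.1,  # mild penalty against verbatim loops
        stop=["USER:"],
    )
    print(out["choices"][0]["text"])
    ```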

    I tried out one of the neural models and found it couldn’t keep track of details at all. I wonder if my settings weren’t very good or something. I would have been using an EXL2 or GPTQ version, though.

    • TeamPupNSudz@alien.topB · 11 months ago

      > I found the Yi 34B models almost unusable due to repetition issues until I tried settings recommended in this discussion:

      I have the same issue with LoneStriker_Nous-Capybara-34B-5.0bpw-h6-exl2. Whole previous messages will often get shoved into the response. I basically gave up and went back to Mistral-OpenHermes.

    • bacocololo@alien.topB · 11 months ago

      To stop the repetition, you could try adding a stop token such as ‘### Human’ to the model; it works well for me.
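
      A minimal sketch of what that looks like with llama-cpp-python (paths and prompt are placeholders; the same idea applies to any backend that accepts custom stopping strings):

      ```python
      from llama_cpp import Llama

      llm = Llama(model_path="yi-34b-chat.Q4_K_M.gguf", n_ctx=4096)  # placeholder path

      # Cut generation off as soon as the model starts a new "### Human" turn,
      # so it cannot loop back through the prompt template.
      out = llm(
          "### Human: Give me three facts about Jupiter.\n### Assistant:",
          max_tokens=256,
          stop=["### Human"],
      )
      print(out["choices"][0]["text"])
      ```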

      • TeamPupNSudz@alien.topB · 11 months ago

        Capybara doesn’t use Alpaca format, so that wouldn’t do anything. Regardless, it’s not that type of repetition. It’s not speaking for the user, it’s literally just copy/pasting part of the conversation into the answer.
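
        For context, a rough comparison of the two prompt templates (from memory, so treat the exact strings as approximate):

        ```python
        # Alpaca-style template, where a "### Human" stop string would make sense:
        alpaca_prompt = (
            "### Human: {question}\n"
            "### Assistant:"
        )

        # Capybara's template is roughly USER:/ASSISTANT:, so the model never
        # emits "### Human" and that stop token never fires:
        capybara_prompt = (
            "USER: {question}\n"
            "ASSISTANT:"
        )
        ```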

    • USM-Valor@alien.topB · 11 months ago

      I’ve had the same experience with the Yi finetunes. I tried them on single-turn generations and they were very promising. However, when starting a conversation from scratch, I was getting a ton of repetition and looping. Some models need a very tight set of parameters to perform well, whereas others will function well under almost any sane configuration. I’m thinking Yi leans more towards the former, which will have users thinking these models are inferior to simpler but more flexible ones.