So Mistral-7B is a pretty impressive 7B-param model … but why is it so capable? Do we have any insights into its dataset? Was it trained far beyond the compute-optimal token count suggested by scaling laws? Any attempts at open reproductions, or merges to scale up the number of params?

  • meetrais@alien.topB · 1 year ago

    I second this. Mistral-7B gave me good results. After fine-tuning, its results were even better.
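
    For anyone curious, here is a minimal sketch of the kind of parameter-efficient setup I mean, using peft's LoRA support (the hyperparameters are illustrative placeholders, not my exact config):

    ```python
    # Illustrative LoRA fine-tuning setup for Mistral-7B with peft.
    # Requires: transformers, peft, accelerate. Hyperparameters are placeholders.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-v0.1", device_map="auto"
    )

    # Train small low-rank adapters instead of updating all 7B weights.
    config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # typically well under 1% of weights
    # ...then train on your instruction data (e.g. with trl's SFTTrainer).
    ```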

    • kaszebe@alien.topB · 1 year ago

      "Mistral-7B gave me good results"

      Can you expand upon that? Do you mean in terms of its ability to write at a college level without major grammatical errors?

    • PwanaZana@alien.topB · 1 year ago

      Are there notable finetunes, to your knowledge? I started using LLMs today, beginning with OpenOrca Mistral 7B, and it seems pretty good.
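
      For reference, here's roughly how I'm running it: a minimal transformers sketch, assuming a GPU with enough VRAM (the model id is the Open-Orca org's upload on HuggingFace):

      ```python
      # Minimal smoke test of Mistral-7B-OpenOrca with transformers.
      # Assumes transformers + accelerate are installed and VRAM is sufficient.
      from transformers import pipeline

      pipe = pipeline(
          "text-generation",
          model="Open-Orca/Mistral-7B-OpenOrca",
          device_map="auto",
      )
      out = pipe("What makes Mistral-7B strong for its size?", max_new_tokens=100)
      print(out[0]["generated_text"])
      ```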

      • meetrais@alien.topB · 1 year ago

        On HuggingFace you can find many fine-tuned and quantized models; look for uploads from TheBloke.
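
        For example, here's a minimal sketch for running one of TheBloke's GGUF quantizations with llama-cpp-python (the repo id and filename below are examples; check the model card for the exact file you want):

        ```python
        # Download a quantized GGUF file and run it locally via llama-cpp-python.
        # Q4_K_M is a common size/quality tradeoff; the filename is an assumption,
        # so verify it against the repo's file list.
        from huggingface_hub import hf_hub_download
        from llama_cpp import Llama

        path = hf_hub_download(
            repo_id="TheBloke/Mistral-7B-OpenOrca-GGUF",
            filename="mistral-7b-openorca.Q4_K_M.gguf",
        )
        llm = Llama(model_path=path, n_ctx=4096)
        out = llm("Q: Why is Mistral-7B so capable?\nA:", max_tokens=128)
        print(out["choices"][0]["text"])
        ```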