• justynasty@alien.topB
    1 year ago

    The CausalLM (14B, llamafied) model is experimenting with something similar; these are the kinds of stories it can create. In my roleplay session, neural-chat placed me in a community of gypsy people and described their culture and customs as they happen in real life. This is an impressive model from Intel.

  • georgejrjrjr@alien.topB
    1 year ago

    The model seems cool and all, but the paper is better.

    Intel eliminated the preference data from direct preference optimization. Preference data is expensive and collecting it is a hassle, so this is a big deal. Best of all, it looks like their no-preference DPO actually performs better.

    The trick is sampling rejects from a small model. Let’s say you have a dataset of GPT-4 completions. You mark those as good (“preferred”). You prompt Llama 2 13B and mark its responses as rejects.
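    A minimal sketch of that recipe, assuming you already have a list of strong (e.g. GPT-4) completions; the function names are illustrative, and the small "reject" model is stubbed out rather than a real Llama 2 13B call:

    ```python
    def weak_generate(prompt):
        # Stand-in for sampling from the small reject model (e.g. Llama 2 13B).
        return "A short, lower-quality answer to: " + prompt

    def build_preference_pairs(prompts, strong_completions):
        """Pair each strong completion (chosen) with a weak sample (rejected)."""
        pairs = []
        for prompt, chosen in zip(prompts, strong_completions):
            pairs.append({
                "prompt": prompt,
                "chosen": chosen,                   # strong-model output, marked preferred
                "rejected": weak_generate(prompt),  # small-model output, marked rejected
            })
        return pairs

    pairs = build_preference_pairs(
        ["What is DPO?"],
        ["Direct Preference Optimization fine-tunes a model directly on preference pairs."],
    )
    ```

    The resulting prompt/chosen/rejected records are the same shape standard DPO training expects, so no human preference labels are needed.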

    Tl;dr This could boost the performance of nearly every model with a minimal increase in complexity (though obviously it’s non-zero compute).

        • Shoddy_Vegetable_115@alien.topB
          1 year ago

          Exactly. It didn’t hallucinate even once in my tests. I used RAG and it gave me perfect, to-the-point answers. I know most people want more verbose outputs; it’s just that this model is good for factual retrieval use cases.

          • julylu@alien.topB
            1 year ago

            Maybe with RAG, shorter answers are less prone to hallucination? I will test more. Thanks!

          • Intel@alien.topB
            1 year ago

            This is a fine-tuned/instruction-tuned model. Explicit system prompts or instructions like “generate a long, detailed answer” can make the model generate longer responses. 🙂

            –Kaokao, AI SW Engineer @ Intel
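            For instance, a sketch of such a prompt; the "### System / ### User / ### Assistant" layout below is an assumed chat template, not necessarily the model's documented format:

            ```python
            # Illustrative only: an explicit system instruction asking for longer output.
            system = "You are a helpful assistant. Generate a long, detailed answer."
            user = "Summarize the retrieved passages in depth."

            # Assumed template; check the model card for the exact expected layout.
            prompt = f"### System:\n{system}\n### User:\n{user}\n### Assistant:\n"
            ```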