• ttkciar@alien.topB · 10 months ago

    It is only 1.3B :-) I have noticed that smaller models work a lot better with longer, more detailed prompts (at least 440 characters; twice that is even better).

  • LocoLanguageModel@alien.topB · 10 months ago

    Try setting temperature to 0.1.

    I've had really good luck with this model at 6.7B and 33B. The 1.3B is more of a novelty because of how fast it runs on ancient GPUs; it's not nearly as good as the other two sizes in my attempts, though it is amazing for its size.
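
    If you run the GGUF through llama-cpp-python, dialing the temperature down looks roughly like this. Just a minimal sketch; the model filename and prompt are placeholders, not anything from this thread:

        # Minimal sketch with llama-cpp-python; model path and prompt are placeholders.
        from llama_cpp import Llama

        llm = Llama(model_path="deepseek-coder-6.7b-instruct.Q5_K_S.gguf", n_ctx=4096)
        out = llm(
            "Write a Python function that reverses a string.",
            temperature=0.1,  # low temperature keeps the output near-deterministic
            max_tokens=256,
        )
        print(out["choices"][0]["text"])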

  • FullOf_Bad_Ideas@alien.topB · 10 months ago

    Is that the base model or some instruct-tuned fine-tune? It wouldn't be too far out of the ordinary if it's the base model; those tend to get crazy. You can try setting repetition penalty to 1, which might help a touch.

    • AfterAte@alien.topB · 10 months ago

      Also, set the temperature to 0.1 or 0.2. Those two things helped me get it to work nicely.
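
      Together, the two tweaks look roughly like this through llama-cpp-python's chat API. Again just a sketch; the model filename is a placeholder:

          # Sketch: temperature 0.1-0.2 with the repetition penalty disabled.
          # Model filename is a placeholder; repeat_penalty=1.0 means "off".
          from llama_cpp import Llama

          llm = Llama(model_path="deepseek-coder-1.3b-instruct.Q5_K_S.gguf", n_ctx=4096)
          resp = llm.create_chat_completion(
              messages=[{"role": "user", "content": "Write a bubble sort in Python."}],
              temperature=0.2,     # 0.1-0.2, as suggested above
              repeat_penalty=1.0,  # 1.0 disables the repetition penalty
          )
          print(resp["choices"][0]["message"]["content"])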

  • vasileer@alien.topB · 10 months ago

    Two ideas:

    - use deepseek-coder-1.3b-instruct, not the base model

    - check that you're using the correct prompting template for the model (see the sketch below)
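
    For reference, building the instruct prompt by hand looks like this in Python. The system line is reproduced from memory, so double-check it against the model card:

        # DeepSeek Coder instruct prompt template (system line from memory;
        # verify against the model card before relying on it).
        SYSTEM = (
            "You are an AI programming assistant, utilizing the Deepseek Coder "
            "model, developed by Deepseek Company, and you only answer "
            "questions related to computer science."
        )

        def build_prompt(instruction: str) -> str:
            return f"{SYSTEM}\n### Instruction:\n{instruction}\n### Response:\n"

        print(build_prompt("Write a Python function that reverses a string."))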

    • East-Awareness-249@alien.topOPB · 10 months ago

      It is the instruct model. You can see underneath the prompt box that it's the deepseek-coder-1.3b-instruct_Q5_K_S model. I used the prompting template that came with the model, and it slightly improved the answers.

      But if I ask it to write some code, it almost never does, and it outputs gibberish instead.

      Does GPU/CPU quality affect the AI's output? My device is a potato.