• ttkciar@alien.topB · 10 months ago

    It is only 1.3B :-) I have noticed that smaller models work a lot better with longer, more detailed prompts (at least 440 characters; twice that is even better).

  • LocoLanguageModel@alien.topB · 10 months ago

    Try setting temperature to 0.1.

    I've had really good luck with this model at 6.7B and 33B. The 1.3B is more of a novelty because of how fast it runs on ancient GPUs; it's not nearly as good as the other two sizes in my attempts, though it is amazing for its size.
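
    If you run the GGUF through llama-cpp-python, dialing the temperature down looks roughly like this. Just a minimal sketch; the model filename and prompt are placeholders, not anything from this thread:

        # Minimal sketch with llama-cpp-python; model path and prompt are placeholders.
        from llama_cpp import Llama

        llm = Llama(model_path="deepseek-coder-6.7b-instruct.Q5_K_S.gguf", n_ctx=4096)
        out = llm(
            "Write a Python function that reverses a string.",
            temperature=0.1,  # low temperature keeps the output near-deterministic
            max_tokens=256,
        )
        print(out["choices"][0]["text"])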

  • FullOf_Bad_Ideas@alien.topB · 10 months ago

    Is that the base model or some instruct-tuned fine-tune? It wouldn't be too far out of the ordinary if it's the base model; those tend to get crazy. You can try setting repetition penalty to 1, which might help a touch.

    • AfterAte@alien.topB · 10 months ago

      Also, set the temperature to 0.1 or 0.2. Those two things helped me get it to work nicely.
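
      Together, the two tweaks look roughly like this through llama-cpp-python's chat API. Again just a sketch; the model filename is a placeholder:

          # Sketch: temperature 0.1-0.2 with the repetition penalty disabled.
          # Model filename is a placeholder; repeat_penalty=1.0 means "off".
          from llama_cpp import Llama

          llm = Llama(model_path="deepseek-coder-1.3b-instruct.Q5_K_S.gguf", n_ctx=4096)
          resp = llm.create_chat_completion(
              messages=[{"role": "user", "content": "Write a bubble sort in Python."}],
              temperature=0.2,     # 0.1-0.2, as suggested above
              repeat_penalty=1.0,  # 1.0 disables the repetition penalty
          )
          print(resp["choices"][0]["message"]["content"])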

  • vasileer@alien.topB · 10 months ago

    Two ideas:

    - use deepseek-coder-1.3b-instruct, not the base model

    - check that you're using the correct prompting template for the model (see the sketch below)
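
    For reference, building the instruct prompt by hand looks like this in Python. The system line is reproduced from memory, so double-check it against the model card:

        # DeepSeek Coder instruct prompt template (system line from memory;
        # verify against the model card before relying on it).
        SYSTEM = (
            "You are an AI programming assistant, utilizing the Deepseek Coder "
            "model, developed by Deepseek Company, and you only answer "
            "questions related to computer science."
        )

        def build_prompt(instruction: str) -> str:
            return f"{SYSTEM}\n### Instruction:\n{instruction}\n### Response:\n"

        print(build_prompt("Write a Python function that reverses a string."))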

    • East-Awareness-249@alien.topOPB · 10 months ago

      It is the instruct model. You can see underneath the prompt box that it's the deepseek-coder-1.3b-instruct_Q5_K_S model. I used the prompting template that came with the model, and it slightly improved the answers.

      But if I ask it to write some code, it almost never does, and it outputs gibberish instead.

      Does GPU/CPU quality affect the AI's output? My device is a potato.