The new chat model released by Intel is now at the top of the OpenLLM leaderboard (among the 7B models).
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
Same here; I found it tends to give short responses.
But are the short responses more correct?
Exactly. It didn’t hallucinate even once in my tests. I used RAG and it gave me perfect, to-the-point answers. I know most people want more verbose outputs; it’s just that this model is good for factual retrieval use cases.
Maybe with RAG, short answers are less prone to hallucination? I will test more. Thanks.
This is a fine-tuned/instruction-tuned model. Explicit system prompts or instructions like “generate a long, detailed answer” can make the model generate longer responses. 🙂
–Kaokao, AI SW Engineer @ Intel
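The suggestion above can be sketched as a small prompt-building helper. The `### System:` / `### User:` / `### Assistant:` chat template below is an assumption about the model's expected format (check the model card for the actual template), and the resulting string would then be passed to the model's generate call:

```python
# Sketch: prepend an explicit system instruction to elicit longer answers.
# The chat-template markers below are assumptions -- verify against the
# model card of the specific chat model you are using.

def build_prompt(question: str,
                 system: str = "Generate a long, detailed answer.") -> str:
    """Wrap a user question with a system instruction in a chat template."""
    return (
        f"### System:\n{system}\n"
        f"### User:\n{question}\n"
        f"### Assistant:\n"
    )

prompt = build_prompt("What does the OpenLLM leaderboard measure?")
print(prompt)
```

Swapping the default `system` string back to something like "Answer concisely." would recover the short, to-the-point behavior discussed earlier in the thread.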