RAG - Vectara's Hallucination leaderboard

AdamDhahabi@alien.top · 10 months ago

RAG - Vectara's Hallucination leaderboard

Wonderful_Ad_5134@alien.top · 10 months ago

“llama2 7b > llama2 13b”

lol

LoSboccacc@alien.top · 10 months ago

Oof 3% is a lot

FullOf_Bad_Ideas@alien.top · 10 months ago

I don’t think they actually tested base models. Look at the description of their methods: they don’t run the models themselves, they only use public APIs. They say they used Mistral-Instruct, not Mistral. Those are not the same model; you shouldn’t put “Mistral” in the table if you ran tests on “Mistral-Instruct”. There is no information about which model was actually used for the Llama test, or about the output of the test. I suspect that they used the llama-2-chat models, which were RLHFed. Mistral Instruct is not RLHFed. It’s likely that RLHF can reduce the hallucination rate and we are seeing its effects.

aaronr_90@alien.top · 10 months ago

Noob question: What is the recommended method for interacting with a base model, i.e. one that is not fine-tuned or chat-tuned?

Distinct-Target7503@alien.top · 10 months ago

How is it possible that Llama2 13B and 7B have a lower hallucination rate than Claude?