Is the Open LLM Leaderboard a reliable source? yi:34B is at the top, but I get better results with the neural-chat:7B model

grigio@alien.top · 1 year ago

Is the Open LLM Leaderboard a reliable source? yi:34B is at the top, but I get better results with the neural-chat:7B model.

VertexMachine@alien.top · 11 months ago

It’s a source, but synthetic benchmarks rarely give you the whole picture. Plus, those test sets are public, so there is some incentive for people to game the system (and even without that, those datasets are most likely already in the training data).

TobyWonKenobi@alien.top · 11 months ago

I’ve had the same experience. Are you using GGUF? I am, and I’ve heard that Yi may degrade in GGUF format, so EXL2 might be better. I need to try it and see.