Covid-Plannedemic_@alien.topB to

LocalLLaMA@poweruser.forum · English · 1 year ago

Training on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methods

Catch me if you can! How to beat GPT-4 with a 13B model | LMSYS Org

Announcing Llama-rephraser: 13B models reaching GPT-4 performance in major benchmarks (MMLU/GSM-8K/HumanEval)! To ensure result validity, we followed Open...

ambient_temp_xeno@alien.topB
1 year ago
To be fair, it’s pretty clear that OpenAI also updates its models with every kind of test people throw at them.