Public LLM leaderboards evaluate models on datasets like MMLU and ARC. What happens if I simply train my LLM on the test set? How would anyone know I did that? I would get a model that ranks high on the public leaderboard, right?

  • wazis@alien.topB
    1 year ago

    Yeah, you can also hard-code the answers and score 100% on any test. But what’s the point?