Life_Ask2806@alien.topB to LocalLLaMA@poweruser.forumEnglish · 2 years ago

In the context of evaluating LLMs, what do these scores technically mean?


When we benchmark different LLMs on different datasets (MMLU, TriviaQA, MATH, HellaSwag, etc.), what do these scores signify? Is it accuracy, or another metric? How can I find out which metric each dataset (MMLU, etc.) uses?

https://preview.redd.it/5glmddnwsb3c1.png?width=2158&format=png&auto=webp&s=fcaf6e55d62445f3007380f06649455b29f8b2ec

  • shaman-warrior@alien.topB · English · 2 years ago

    Everything is common sense reasoning, we need better definitions
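
As a rough illustration of what such scores usually are: multiple-choice benchmarks such as MMLU and HellaSwag are typically reported as accuracy (the fraction of questions where the model picks the correct option), while open-ended QA sets such as TriviaQA are typically scored by exact match after light text normalization. The sketch below uses made-up toy data and hypothetical helper names (`accuracy`, `normalize`, `exact_match`). Real evaluation harnesses differ in prompt format, few-shot count, and normalization rules, so the metric for a given number is best confirmed in the benchmark's paper or the harness's task definition.

```python
import re
import string


def accuracy(predictions, references):
    """Fraction of items where the predicted label equals the gold label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one example")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """1.0 if the normalized prediction equals any normalized gold answer."""
    pred = normalize(prediction)
    return float(any(pred == normalize(gold) for gold in gold_answers))


# Toy multiple-choice example (illustrative data, not from any benchmark):
# the model picked an option letter for each of four questions.
mc_pred = ["A", "C", "B", "D"]
mc_gold = ["A", "C", "D", "D"]
mc_score = accuracy(mc_pred, mc_gold)

# Toy open-ended QA example: a free-text answer checked against accepted aliases.
qa_score = exact_match("The Eiffel Tower.", ["Eiffel Tower", "Tour Eiffel"])

print(f"multiple-choice accuracy: {mc_score:.2f}")
print(f"QA exact match: {qa_score:.1f}")
```

A leaderboard figure like "70.0 on MMLU" is this kind of accuracy expressed as a percentage and averaged over the benchmark's questions, under whatever few-shot setup the report states.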
