Look at this: apart from Llama 1, all the other “base” models will likely answer “language” after “As an AI”. That suggests Meta, Mistral AI, and 01-ai (the company that made Yi) trained their “base” models on GPT instruct datasets to inflate the benchmark scores and make it look like the “base” models had a lot of potential. We got duped hard on that one.

https://preview.redd.it/vqtjkw1vdyzb1.png?width=653&format=png&auto=webp&s=91652053bcbc8a7b50bced9bbf8638fa417387bb

  • phree_radical@alien.topB
    1 year ago

    Interestingly, Mistral Instruct:

    As an AI
    
    ### top_k:
    
    0.686088: 13892 "assistant"
    0.049313: 28725 ","
    0.039010:  3842 "language"
    0.037810:  2229 "model"
    0.031591: 28733 "-"
    0.018000:  3332 "research"
    0.016518:  1587 "system"
    0.009266: 21631 "Assistant"
    0.006967:  7583 "expert"
    0.005598:  3921 "tool"
    0.004394:  8073 "agent"
    0.004242:   369 "that"
    0.002696:   304 "and"
    0.002644:   297 "in"
    0.001415:  5716 "student"
    0.001410:  5514 "technology"
    0.001197:  7786 "coach"
    0.001073:  1918 "team"
    0.001073: 24480 "scientist"
    0.001052:  2818 "based"
    0.001036:  2007 "program"
    0.000925: 12435 "bot"
    0.000819:  5181 "platform"
    0.000819: 28723 "."
    0.000816: 21782 "developer"
    0.000813:  6031 "assist"
    0.000806:  3327 "personal"
    0.000803:  9464 "algorithm"
    0.000776:  2488 "project"
    0.000746:   354 "for"
    0.000743:  8626 "teacher"
    0.000666:  7511 "eth"
    0.000645:  6953 "writer"
    0.000640: 24989 "practition"
    0.000623:  3441 "voice"
    0.000621:  5024 "professional"
    0.000611: 22275 "analyst"
    0.000588: 15589 "Language"
    0.000583:  8252 "virtual"
    0.000531:  7153 "digital"
    0.000525:   298 "to"
    0.000523: 11108 "technique"
    0.000523: 10706 "chat"
    0.000521: 19899 "specialist"
    0.000517:  8311 "tut"
    0.000501:  1338 "person"
    0.000493:  6878 "experiment"
    0.000474:   325 "("
    0.000460: 18112 "engineer"
    0.000458:  4993 "application"
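    For anyone curious how a listing like the one above is produced: you take the model’s raw logits for the next-token position, softmax them into probabilities, and sort. A minimal sketch below, with a made-up toy vocabulary and made-up logits standing in for a real tokenizer and a real forward pass:

    ```python
    import math

    def topk_next_tokens(logits, vocab, k=5):
        """Softmax raw logits and return the k most likely next tokens
        as (probability, token_id, token_string) rows, like the listing above."""
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        ranked = sorted(range(len(vocab)), key=lambda i: probs[i], reverse=True)
        return [(probs[i], i, vocab[i]) for i in ranked[:k]]

    # Toy example: a 4-token "vocabulary" with invented logits.
    vocab = ["assistant", "language", "model", ","]
    logits = [4.0, 1.5, 1.4, 1.7]
    for p, tid, tok in topk_next_tokens(logits, vocab, k=3):
        print(f'{p:.6f}: {tid:5d} "{tok}"')
    ```

    In a real run you’d feed the prompt (“As an AI”) through the model, take the logits at the final position, and map token ids back to strings with the tokenizer; the token ids in the table above are Mistral’s tokenizer ids, not these toy ones.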