[R] "It's not just memorizing the training data" they said: Scalable Extraction of Training Data from (Production) Language Models (arxiv.org)
Posted by wojcech@alien.top to Machine Learning@academy.garden · English · 1 year ago · 30 comments
cegras@alien.top · 11 months ago
What is the size of ChatGPT or the biggest LLMs compared to the dataset? (Not being rhetorical, genuinely curious.)
StartledWatermelon@alien.top · 11 months ago
GPT-4: 1.76 trillion parameters, roughly 6.5 trillion tokens in the dataset. The token count could be twice that; the leaks weren't crystal clear, but the lower figure is more likely.
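A rough back-of-the-envelope comparison of the two sizes (a minimal sketch in Python; the parameter and token counts are the unconfirmed leaked figures quoted above, and the bytes-per-weight and bytes-per-token values are assumptions, not official numbers):

```python
# Back-of-the-envelope: model size vs. training-set size for GPT-4,
# using the leaked (unconfirmed) figures from the comment above.

params = 1.76e12   # assumed parameter count (leaked estimate)
tokens = 6.5e12    # assumed training tokens (leaked estimate; possibly ~2x higher)

bytes_per_param = 2   # assuming 16-bit (fp16/bf16) weights
bytes_per_token = 4   # rough average of ~4 bytes of raw text per token

model_bytes = params * bytes_per_param   # ~3.5 TB of weights
data_bytes = tokens * bytes_per_token    # ~26 TB of raw text

print(f"tokens per parameter: {tokens / params:.1f}")            # ~3.7
print(f"dataset / model size (bytes): {data_bytes / model_bytes:.1f}x")  # ~7x
```

Under these assumptions the raw training text is several times larger than the weights themselves, which is the usual intuition behind "it can't just be memorizing everything", even though the paper shows that substantial verbatim extraction is still possible.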
zalperst@alien.top · 11 months ago
Trillions of tokens, billions of parameters.