blimpyway@alien.top to Machine Learning@academy.garden • [R] "It's not just memorizing the training data" they said: Scalable Extraction of Training Data from (Production) Language Models
1 year ago
It is not about being able to search for relevant data when prompted with a question.
The amazing thing is that they seem to understand the question well enough that the answer is both concise and meaningful.
That’s what folks downplaying it as “a glorified autocomplete” are missing.
PS: those philosophizing that it can't actually understand the question are also missing the point: nobody cares, as long as its answers are as correct and meaningful as if it did understand the question.
It mimics understanding well enough.
I said they mimic understanding well enough; that wasn't a claim that LLMs actually understand.
Sure, training dataset limits apply.
And sure, they very likely fail when the question is OOD (out of distribution), but figuring out that the question is OOD isn't that hard, so an honest "Sorry, your question is way too OOD" answer (instead of hallucinating) shouldn't be too difficult to implement.
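To make the "detect OOD, then abstain" idea concrete, here is a toy sketch. It assumes you have some sample of in-distribution questions to compare against and uses plain bag-of-words cosine similarity as the OOD score; the names `answer_or_abstain`, `known_questions`, `answer_fn` and the `0.3` threshold are all made up for illustration. A production LLM would need a much better score than word overlap, so read it as the shape of the idea rather than a working detector.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector: lowercase whitespace tokens mapped to counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def answer_or_abstain(question, known_questions, answer_fn, threshold=0.3):
    """Answer only if the question resembles something in-distribution.

    known_questions, answer_fn and threshold are hypothetical stand-ins:
    a sample of in-distribution questions, the model call, and a cutoff
    that would have to be tuned on real data.
    """
    q = bow(question)
    # Similarity to the nearest known question serves as the OOD score.
    best = max((cosine(q, bow(k)) for k in known_questions), default=0.0)
    if best < threshold:
        return "Sorry, your question is way too OOD"
    return answer_fn(question)


if __name__ == "__main__":
    known = ["how do I sort a list in python", "what is gradient descent"]
    fake_model = lambda q: "ANSWER"  # placeholder for the real model call
    print(answer_or_abstain("how do I sort a list", known, fake_model))
    print(answer_or_abstain("best pizza toppings tonight", known, fake_model))
```

The first call overlaps heavily with a known question and gets passed to the model; the second shares no words with anything known and gets the honest refusal instead.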