How is that a problem? The entire point of training is to memorize and generalize from the training data.
Learning English is not simply memorizing a billion sample sentences.
The problem is that we want the model to learn to compose sentences on its own, not regurgitate word sequences that already appear verbatim in the training set.
This paper tackles the difficult problem of detecting how much of an LLM's success is due to rote memorization.
Maybe more importantly: how much parameter capacity and how many training resources are wasted on it?
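As a toy illustration of what "detecting rote memorization" can mean in its crudest form (this is not the paper's method; the function name and the whitespace tokenization are assumptions for the sketch), one can measure what fraction of the n-grams in a generated text also appear word for word in the training corpus:

```python
def memorized_ngram_fraction(corpus, generated, n=3):
    """Fraction of n-grams in `generated` found verbatim in `corpus`.

    Both arguments are plain strings. Tokenization here is naive
    whitespace splitting; a real evaluation would use the model's
    own tokenizer and a corpus-scale index instead of a Python set.
    """
    corpus_toks = corpus.split()
    gen_toks = generated.split()
    # All n-grams that occur anywhere in the training corpus.
    corpus_ngrams = {
        tuple(corpus_toks[i:i + n])
        for i in range(len(corpus_toks) - n + 1)
    }
    # All n-grams in the generated text, in order.
    gen_ngrams = [
        tuple(gen_toks[i:i + n])
        for i in range(len(gen_toks) - n + 1)
    ]
    if not gen_ngrams:
        return 0.0
    hits = sum(g in corpus_ngrams for g in gen_ngrams)
    return hits / len(gen_ngrams)


corpus = "the cat sat on the mat"
# Pure regurgitation: every trigram is in the corpus.
print(memorized_ngram_fraction(corpus, "the cat sat on the mat"))      # → 1.0
# Partly novel: only the opening trigram matches.
print(memorized_ngram_fraction(corpus, "the cat sat somewhere else"))
```

A high score flags regurgitation rather than novel composition; the hard part the paper addresses is distinguishing this from the overlap any fluent English text unavoidably has with a billion-sentence training set.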