[R] "It's not just memorizing the training data" they said: Scalable Extraction of Training Data from (Production) Language Models

wojcech@alien.top · 2 years ago

UnknownEssence@alien.top · 2 years ago

If it is truly memorizing the ENTIRE set of training data, then isn't it a lossless compression scheme that is far more efficient than any known compression algorithm?

It has to be lossy compression, i.e. it doesn’t remember its ENTIRE set of training data word for word.
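To put rough numbers on that, here is a back-of-the-envelope sketch. The function name and every figure in it are hypothetical placeholders picked only for illustration; none of them come from the paper or from any real model. It just divides the size of the training set by the size of the weights, which is the lossless compression ratio the weights would have to hit to hold everything verbatim.

```python
def implied_compression_ratio(training_bytes: float,
                              n_params: float,
                              bytes_per_param: float = 2.0) -> float:
    """Training-set size divided by model size.

    This is the lossless compression ratio the weights would need to
    achieve if they stored the whole training set word for word.
    """
    model_bytes = n_params * bytes_per_param
    return training_bytes / model_bytes


# Hypothetical, illustrative numbers only (assumptions, not real figures):
# 1 TB of training text, 7 billion parameters, 2 bytes per parameter.
ratio = implied_compression_ratio(training_bytes=1e12,
                                  n_params=7e9,
                                  bytes_per_param=2.0)
print(f"weights would need roughly {ratio:.0f}:1 lossless compression")
```

Plug in your own estimates and compare the result with the ratio you would expect from an ordinary text compressor. If the required ratio is far beyond that, verbatim storage of everything is not plausible and the model can only be memorizing a subset.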