minus-squareseraphius@alien.topBtoMachine Learning@academy.garden•[R] "It's not just memorizing the training data" they said: Scalable Extraction of Training Data from (Production) Language ModelslinkfedilinkEnglisharrow-up1·11 months agoOn most tasks, memorization would be overfitting, but I think one would see that “overfitting” is task/generalization dependent. As long as accurate predictions are being made for new data, it doesn’t matter that it can cough up the old. linkfedilink
On most tasks, memorization would be overfitting, but I think one would see that “overfitting” is task/generalization dependent. As long as accurate predictions are being made for new data, it doesn’t matter that it can cough up the old.