minus-squareMandelmus100@alien.topBtoMachine Learning@academy.garden•[R] "It's not just memorizing the training data" they said: Scalable Extraction of Training Data from (Production) Language ModelslinkfedilinkEnglisharrow-up1·1 year ago Another big takeaway is that training for more epochs leads to more memorization. Should be expected. It’s overfitting. linkfedilink
Should be expected. It’s overfitting.