lol we just unlocked a new paradigm, so i’d bet we don’t hit a plateau for at least another two years. and since it looks like we’re already on the verge of one or two more paradigm shifts on top of that, there’s no real reason to anticipate a plateau in the immediate future regardless.
it’s possible to “overfit” to a subset of the data. rising generalization error is the symptom of overfitting to the entire dataset; memorization is functionally equivalent to local overfitting, i.e. generalization error rising in a specific neighborhood of the data. so you can get a global reduction in generalization error while specific neighborhoods simultaneously get worse.
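here’s a toy sketch of that last point (everything in it — the sin(6x) signal, the noise levels, the k-NN models, the (4.5, 5.5) neighborhood — is an illustrative assumption, not anything from the thread). the labels are noisy only inside one neighborhood, so the memorizing model (k=1) fits the wiggly signal closely on most of the domain but memorizes noise in that region:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

def f(x):
    # ground-truth signal; wiggly enough that heavy smoothing underfits it
    return np.sin(6 * x)

# training data: heavy label noise only inside the neighborhood (4.5, 5.5)
x_train = rng.uniform(0, 10, 400)
noise_sd = np.where((x_train > 4.5) & (x_train < 5.5), 0.8, 0.05)
y_train = f(x_train) + rng.normal(0.0, noise_sd)

# score against the clean signal on a dense grid, so the comparison
# measures generalization error rather than test-set noise
x_test = np.linspace(0, 10, 5000)
y_test = f(x_test)
local = (x_test > 4.5) & (x_test < 5.5)

# k=25 smooths (underfits the wiggly signal everywhere);
# k=1 memorizes training labels, noise included
for k in (25, 1):
    model = KNeighborsRegressor(n_neighbors=k).fit(x_train[:, None], y_train)
    sq_err = (model.predict(x_test[:, None]) - y_test) ** 2
    print(f"k={k:>2}  global MSE={sq_err.mean():.3f}  "
          f"local MSE on (4.5, 5.5)={sq_err[local].mean():.3f}")
```

the expected pattern (exact numbers vary with the seed): going from k=25 to k=1 drops global MSE, since the signal gets fit closely on ~90% of the domain, while MSE inside the noisy neighborhood rises toward the noise variance — a global improvement alongside a local regression.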