• On_Mt_Vesuvius@alien.topB

    From statistical learning theory, there is always some adversarial distribution where the model will fail to generalize… (no free lunch). And isn’t generalization about extrapolation beyond the training distribution? So learning the training distribution itself is not generalization.
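
    For reference, a sketch of the result I have in mind, roughly the textbook no-free-lunch statement (constants as in Shalev-Shwartz & Ben-David, Thm. 5.1; the exact numbers aren’t essential to the point):

```latex
% No-free-lunch, sketched: for ANY learner A and any sample size m
% (m < |X|/2, so any m when the domain X is infinite), there is a
% distribution D over X x {0,1} that A handles badly, even though a
% perfect classifier for D exists.
\exists\, \mathcal{D}:\quad
\min_{f:\,\mathcal{X}\to\{0,1\}} L_{\mathcal{D}}(f) = 0
\quad\text{and}\quad
\Pr_{S\sim\mathcal{D}^{m}}\!\Big[\, L_{\mathcal{D}}\big(A(S)\big) \ge \tfrac{1}{8} \,\Big] \;\ge\; \tfrac{1}{7}
```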

    • dragosconst@alien.topB

      The no-free-lunch theorem in machine learning refers to the case in which the hypothesis class contains all possible classifiers over your domain (and your training set is too small relative to the domain, or the domain is infinite). In that setting no learner can be guaranteed to generalize, i.e. you have no useful bounds on generalization. When you restrict your class to something like linear classifiers, for example, you can reason about generalization again. For finite domains you can even reason about the class containing every hypothesis, but that’s not very useful in practice.
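
      To make the contrast concrete, here is the standard uniform-convergence bound for a finite hypothesis class under 0-1 loss (just a sketch, Hoeffding plus a union bound, nothing specific to this thread): with probability at least 1 − δ over an i.i.d. sample S of size m,

```latex
% Every hypothesis in a finite class H generalizes up to an additive term
% that shrinks as the sample size m grows; for restricted infinite classes
% (e.g. linear classifiers) ln|H| is replaced by a VC-dimension term.
\forall h \in \mathcal{H}:\quad
L_{\mathcal{D}}(h) \;\le\; L_{S}(h) \;+\; \sqrt{\frac{\ln|\mathcal{H}| + \ln(2/\delta)}{2m}}
```

      No bound of this form survives once the class is allowed to contain every classifier over an infinite domain, which is exactly the no-free-lunch regime above.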

      I’m not sure about your point on the training distribution. In general, you are interested in generalization with respect to your training distribution, since that is where your train/test/validation data is sampled from. Note that overfitting your training set is not the same thing as learning your training distribution (see the sketch below). You can think about stuff like domain adaptation, where you reason about performance on “similar” distributions and how you might improve it, but that’s already something very different.
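
      A minimal toy sketch of that distinction (hypothetical setup, all names mine): both models only ever see samples from one fixed distribution; the over-parameterized fit memorizes the training set, while the restricted fit learns something that holds up on fresh samples from that same distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Draw n i.i.d. points from the (fixed) training distribution."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(3.0 * x) + rng.normal(scale=0.2, size=n)  # noisy target
    return x, y

x_train, y_train = sample(20)     # small training set
x_test, y_test = sample(10_000)   # fresh draws from the SAME distribution

for degree in (3, 15):
    coeffs = np.polyfit(x_train, y_train, deg=degree)  # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")

# Typically the degree-15 fit drives train MSE toward zero (overfitting the
# training *set*) while doing worse on fresh samples from the same
# distribution than the degree-3 fit, which actually learned something
# about the distribution.
```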