Joined 10 months ago
Cake day: November 10th, 2023


  • Among equally performing models, the simplest one is the best.

    If you want more theory, look at statistical learning, e.g. “Understanding Machine Learning” by Shai Shalev-Shwartz and Shai Ben-David. There the idea is that we have data {(x_1, y_1), …, (x_n, y_n)}, where y_i is given by h(x_i), and we don’t know h, so we want to approximate it using the data. The approximation is selected from a family of functions (the hypothesis class) H using a learning algorithm, typically empirical risk minimization (ERM).

    Given infinite data, perhaps the best hypothesis class is the one which has the smallest VC dimension and contains the true function h. Then, you can estimate h pretty much perfectly.

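    Given finite data, the best hypothesis class is perhaps the one whose complexity is just right for the amount of data and the complexity of the true function h: rich enough to contain a good approximation of h, but not so rich that ERM overfits the sample.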
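
To make this concrete, here is a minimal sketch in Python of ERM over two hypothesis classes on the same small sample. Everything in it is an assumption chosen for illustration and not something from the book: the true function `h` is a threshold at 0.5, the small class is thresholds on a grid, and the rich class is "any lookup table" (a memorizer that predicts 0 on unseen points). Both reach zero training error, but only the small class should carry over to fresh data.

```python
import random

def h(x):
    """Hypothetical true labelling function (unknown to the learner)."""
    return 1 if x >= 0.5 else 0

def empirical_risk(predict, data):
    """Fraction of examples in `data` that `predict` mislabels (0-1 loss)."""
    return sum(predict(x) != y for x, y in data) / len(data)

def erm_threshold(data, grid_size=100):
    """ERM over the small class {x -> [x >= t] : t on a uniform grid in [0, 1]}."""
    candidates = [i / grid_size for i in range(grid_size + 1)]
    best_t = min(
        candidates,
        key=lambda t: empirical_risk(lambda x: int(x >= t), data),
    )
    return lambda x: int(x >= best_t)

def erm_memorizer(data):
    """ERM over a very rich class: any lookup table, defaulting to label 0."""
    table = dict(data)
    return lambda x: table.get(x, 0)

# Fixed seed so the sketch is reproducible; sample sizes are arbitrary.
rng = random.Random(0)
train = [(x, h(x)) for x in (rng.random() for _ in range(20))]
test = [(x, h(x)) for x in (rng.random() for _ in range(2000))]

simple = erm_threshold(train)
rich = erm_memorizer(train)

for name, model in [("threshold", simple), ("memorizer", rich)]:
    print(
        name,
        "train error:", empirical_risk(model, train),
        "test error:", empirical_risk(model, test),
    )
```

With only 20 points, the training error cannot tell the two models apart. The gap shows up on the held-out sample, which is the finite-data argument for preferring the simpler class when both fit equally well.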