Often when I read ML papers, the authors compare their results against a benchmark (e.g. using RMSE, accuracy, …) and say “our new method improved results by X%”. Nobody runs a significance test to check whether the new method Y actually outperforms benchmark Z. Is there a reason why? Especially when you break results down further, e.g. to the analysis of certain classes in object classification, this seems important to me. Or am I overlooking something?
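
For concreteness, here is the kind of check I have in mind: a minimal sketch of a paired bootstrap test over per-example correctness on a shared test set. Everything below (the array names `correct_a`/`correct_b` and the toy numbers) is hypothetical, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def paired_bootstrap_pvalue(correct_a, correct_b, n_resamples=10_000):
    """Approximate one-sided p-value: how often does model A fail to beat
    model B when the shared test set is resampled with replacement?"""
    correct_a = np.asarray(correct_a, dtype=float)
    correct_b = np.asarray(correct_b, dtype=float)
    n = len(correct_a)
    observed_gap = correct_a.mean() - correct_b.mean()
    failures = 0
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # resample the same examples for both models
        if correct_a[idx].mean() - correct_b[idx].mean() <= 0:
            failures += 1                 # A did not outperform B on this resample
    return observed_gap, failures / n_resamples

# Toy usage with made-up per-example correctness (0/1) for two models.
a = (rng.random(1000) < 0.82).astype(float)
b = (rng.random(1000) < 0.78).astype(float)
gap, p = paired_bootstrap_pvalue(a, b)
print(f"accuracy gap = {gap:.3f}, approx p = {p:.3f}")
```

Something like this seems cheap enough to run, which is why I'm puzzled it's so rarely reported.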

  • Current_Ferret_4981@alien.top
    1 year ago

    Are you going to make assumptions on the statistical distributions in order to make such tests accurate? Part of the reason nobody does this is because it’s arbitrary and irrelevant in many cases due to incorrect application of standardized methods. Combine that with the fact that it’s expensive to perform and has no real value for researchers it doesn’t really make sense.