Often when I read ML papers, the authors compare their results against a benchmark (e.g. using RMSE, accuracy, …) and claim "our new method improves results by X%". Nobody runs a significance test to check whether the new method Y actually outperforms benchmark Z. Is there a reason why? Especially when you break your results down further, e.g. to the analysis of individual classes in object classification, this seems important to me. Or am I overlooking something?
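
For concreteness, here is a minimal sketch of what such a test could look like: a paired bootstrap on per-example correctness of the two models over the same test set. The function name and setup are my own illustration, not taken from any particular paper, and it assumes you have boolean per-example correctness arrays for both methods:

```python
import numpy as np

def paired_bootstrap_test(correct_a, correct_b, n_boot=10_000, seed=0):
    """Paired bootstrap comparing two classifiers on the same test set.

    correct_a, correct_b: boolean arrays of per-example correctness for
    method A (new method) and method B (baseline), aligned by example.
    Returns the observed accuracy gap (A - B) and an approximate
    one-sided p-value for "A is not better than B".
    """
    rng = np.random.default_rng(seed)
    correct_a = np.asarray(correct_a, dtype=float)
    correct_b = np.asarray(correct_b, dtype=float)
    n = len(correct_a)
    observed_gap = correct_a.mean() - correct_b.mean()
    # Resample test examples with replacement, keeping the pairing
    # between the two models intact on each resample.
    idx = rng.integers(0, n, size=(n_boot, n))
    gaps = correct_a[idx].mean(axis=1) - correct_b[idx].mean(axis=1)
    # Fraction of resamples in which A fails to beat B.
    p_value = np.mean(gaps <= 0.0)
    return observed_gap, p_value
```

The pairing matters: resampling the two models' predictions on the same drawn examples accounts for the fact that both are evaluated on an identical test set, which an unpaired test would ignore. A per-class breakdown would just run the same test restricted to each class's examples, where the smaller sample sizes make significance testing even more relevant.
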

  • neo_255_0_0@alien.top · 1 year ago

    You'd be surprised to learn that much of academia is so focused on publishing that rigor isn't even a priority; forget reproducibility. This is partly because repeated experiments would indeed require more time and resources, both of which are constrained.

    That is why most of the good tech that can actually be validated comes out of industry.