Often when I read ML papers the authors compare their results against a benchmark (e.g. using RMSE, accuracy, …) and say “our new method improved results by X%”. Nobody performs a significance test to check whether the new method Y actually outperforms benchmark Z. Is there a reason why? Especially when you break your results down, e.g. to the analysis of certain classes in object classification, this seems important to me. Or am I overlooking something? (A rough sketch of what such a test could look like is below.)
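As a minimal sketch of the kind of test I mean: a paired bootstrap over the per-example results of both methods on the same test set. This is just one possible approach, and the arrays `correct_new` / `correct_base` below are hypothetical 0/1 correctness vectors, not from any real paper.

```python
import numpy as np

def paired_bootstrap_test(correct_new, correct_base, n_boot=10_000, seed=0):
    """Paired bootstrap: resample test examples with replacement and count how
    often the new method fails to beat the baseline on the resampled set."""
    correct_new = np.asarray(correct_new, dtype=float)
    correct_base = np.asarray(correct_base, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(correct_new)
    observed_gain = correct_new.mean() - correct_base.mean()
    worse_or_equal = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample the same examples for both methods
        gain = correct_new[idx].mean() - correct_base[idx].mean()
        if gain <= 0:
            worse_or_equal += 1
    p_value = worse_or_equal / n_boot  # one-sided p-value for "new method > baseline"
    return observed_gain, p_value

# Hypothetical per-example correctness on a shared test set
correct_new  = np.array([1, 1, 1, 0, 1, 1, 0, 1, 1, 1])
correct_base = np.array([1, 0, 1, 0, 1, 0, 0, 1, 1, 0])
gain, p = paired_bootstrap_test(correct_new, correct_base)
print(f"accuracy gain = {gain:.3f}, one-sided bootstrap p = {p:.3f}")
```

The same idea works per class: restrict the correctness vectors to examples of one class before resampling, which is exactly the breakdown where a single averaged number can hide insignificant (or even negative) differences.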

  • __Maximum__@alien.topB
    10 months ago

    They used to. I remember papers from around 2015 where the performance analysis was very comprehensive; they even provided useful statistics about the datasets they used. Now it’s “we used COCO and evaluated on 2017val. Here is the final number.” Unless the paper is specifically about being better on certain classes, they only report the averaged percentage.