I’m trying to teach a lesson on gradient descent from a more statistical and theoretical perspective, and need a good example to show its usefulness.

What is the simplest possible algebraic function that would be impossible or rather difficult to optimize for, by setting its 1st derivative to 0, but easily doable with gradient descent? I preferably want to demonstrate this in context linear regression or some extremely simple machine learning model.

  • Hothapeleno@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    1 year ago

    Find the minimum or maximum of an n-dimensional matrix of data instead of a function. Generate the data with any nice to visualise function. The move on from gradient descent to simulated annealing.