A simple example: you have a set of n points, and you want to find a local maximum of “distance to the closest point in the set”. The gradient is easy to approximate numerically, and you can show students visually what’s happening. There are algorithms that solve this exactly, but they are much more complicated than gradient descent. A sketch is below.
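A minimal sketch of that first example, assuming a 2-D point set: the objective is the distance to the nearest point, the gradient is approximated by central finite differences, and we take ascent steps (maximizing). The point set, step size, and iteration count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(0, 1, size=(20, 2))   # the fixed set of n points

def f(x):
    """Distance from x to the closest point in the set."""
    return np.min(np.linalg.norm(points - x, axis=1))

def approx_grad(x, h=1e-5):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

x = np.array([0.5, 0.5])                   # starting guess
for _ in range(200):
    x = x + 0.01 * approx_grad(x)          # ascent step: maximize clearance

print("local maximum near", x, "with clearance", f(x))
```

Plotting the point set and the trajectory of x after each step makes the “walking uphill” intuition very concrete.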
For one step more complexity, you could do an SVM. I.e., divide the points into red and blue, and find the line that separates the two groups while maximizing the margin to the closest point. This is a real problem for which gradient descent (well, some variant of it) is the best solution. It’s also nicely visualizable; see the sketch below.
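A minimal sketch of the SVM example, assuming a linear soft-margin SVM trained by (sub)gradient descent on the hinge loss with an L2 penalty. The synthetic red/blue clusters, learning rate, and regularization strength are arbitrary choices, not part of the original description.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = np.vstack([rng.normal(-2, 1, size=(n // 2, 2)),   # "red" cluster
               rng.normal(+2, 1, size=(n // 2, 2))])  # "blue" cluster
y = np.hstack([-np.ones(n // 2), np.ones(n // 2)])    # labels in {-1, +1}

w = np.zeros(2)
b = 0.0
lam = 0.01   # regularization strength (trades margin width vs. violations)
lr = 0.1     # learning rate

for _ in range(500):
    margins = y * (X @ w + b)
    viol = margins < 1                     # points inside or past the margin
    # Subgradient of (lam/2)*||w||^2 + mean hinge loss
    grad_w = lam * w - (y[viol][:, None] * X[viol]).sum(axis=0) / n
    grad_b = -y[viol].sum() / n
    w -= lr * grad_w
    b -= lr * grad_b

print("separating line: w =", w, "b =", b)
```

Drawing the line w·x + b = 0 together with the margin lines w·x + b = ±1 over the colored points shows the margin widening as training proceeds.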