I’m trying to teach a lesson on gradient descent from a more statistical and theoretical perspective, and need a good example to show its usefulness.
What is the simplest possible algebraic function that would be impossible, or at least rather difficult, to optimize by setting its first derivative to 0, but easy to optimize with gradient descent? I would preferably like to demonstrate this in the context of linear regression or some other extremely simple machine learning model.
Curious about the answer myself, I often use AI to help me understand questions like these, but since this isn't my area of expertise, I'm always suspicious of the answers I get. So I sent your question to Phind:
A good example of a function that is difficult to optimize by setting its first derivative to zero, but can be easily optimized using gradient descent, is the function f(x) = x*ln(x) - sqrt(x).
The derivative of this function is f’(x) = ln(x) + 1 - 1/(2*sqrt(x)). Setting this equal to zero does not yield a solution that can be expressed in terms of elementary functions. However, we can easily apply the gradient descent method to find a numerical solution.
*********************************
It also generated a bunch of example code in Python. I'm curious about the validity of this response.
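For what it's worth, the derivative quoted above is correct, and setting ln(x) + 1 - 1/(2*sqrt(x)) = 0 does not appear to have an elementary closed-form solution, so a numerical method is needed either way. I haven't seen the exact code Phind produced, but a minimal sketch of what the gradient-descent part might look like is below; the starting point, step size, and iteration count are my own choices, not part of Phind's output.

import numpy as np

# f(x) = x*ln(x) - sqrt(x), defined for x > 0
def f(x):
    return x * np.log(x) - np.sqrt(x)

# f'(x) = ln(x) + 1 - 1/(2*sqrt(x))
def f_prime(x):
    return np.log(x) + 1 - 1 / (2 * np.sqrt(x))

x = 1.0        # starting point, must stay in x > 0
lr = 0.1       # step size (learning rate)
for _ in range(1000):
    x -= lr * f_prime(x)   # basic gradient descent update

print(f"approximate minimizer: x = {x:.6f}")
print(f"f(x) = {f(x):.6f}, f'(x) = {f_prime(x):.2e}")

Since f''(x) = 1/x + 1/(4*x^(3/2)) is positive for all x > 0, the function is convex on its domain, so plain gradient descent from a reasonable starting point should converge to the unique minimum, numerically around x ≈ 0.68.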