
- 0 Posts
- 4 Comments
jamochasnake@alien.top to Machine Learning@academy.garden • [N] Sam Altman and Greg Brockman, together with colleagues, will join Microsoft to lead new advanced AI research team • English • 1 • 2 years ago
jamochasnake@alien.top to Machine Learning@academy.garden • [D] Simplest mathematical example of a function that can only be solved by gradient descent • English • 1 • 2 years ago

Ignoring the fact that you can't prove anything about gradient descent for nonsmooth functions (you need subgradients), a constant step size will NOT work. You can end up at the minimizer, compute a subgradient there that is different from zero, and move AWAY from the minimizer. You need to decay your step size to ensure that you end up at the minimizer (asymptotically).
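
A minimal sketch of that point, assuming f(x) = |x| as the nonsmooth example (the subdifferential at 0 is [-1, 1]; the start 0.3, step 0.3, and the 0.3/(k+1) schedule are just illustrative choices): with a constant step size the iterate reaches the minimizer, picks a nonzero subgradient there, and bounces away forever, while diminishing step sizes converge.

```python
import numpy as np

def subgrad_abs(x):
    # A valid subgradient of f(x) = |x|: sign(x) when x != 0;
    # at x = 0 the subdifferential is [-1, 1], and we pick 1.0.
    return float(np.sign(x)) if x != 0 else 1.0

# Constant step size: the iterate hits the minimizer x* = 0, the chosen
# subgradient there is nonzero, so the next step moves AWAY from 0,
# after which the iterates oscillate forever.
x, step = 0.3, 0.3
for k in range(6):
    x = x - step * subgrad_abs(x)
    print(f"constant step, iter {k}: x = {x:+.3f}")

# Diminishing step sizes (not summable, square-summable), e.g. 0.3/(k+1):
# the standard setting in which the subgradient method converges to the
# minimizer asymptotically.
x = 0.3
for k in range(50):
    x = x - (0.3 / (k + 1)) * subgrad_abs(x)
print(f"diminishing steps, final x = {x:+.6f}")
```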
jamochasnake@alien.top to Machine Learning@academy.garden • [N] Sam Altman and Greg Brockman, together with colleagues, will join Microsoft to lead new advanced AI research team • English • 1 • 2 years ago

It's not just Sam, he's bringing talent with him. People like Jakub Pachocki.
jamochasnake@alien.top to Machine Learning@academy.garden • [D] Simplest mathematical example of a function that can only be solved by gradient descent • English • 1 • 2 years ago

Of course, every locally Lipschitz function is differentiable almost everywhere, i.e., the set of points where it is not differentiable has measure zero. So if you drew a point randomly (with distribution absolutely continuous w.r.t. the Lebesgue measure), you would avoid such points with probability one. However, this has NOTHING to do with actually performing an algorithm like gradient descent - in these algorithms you are NOT simply drawing points randomly, and you cannot guarantee a priori that you avoid these points.
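
A minimal sketch of that last point, again assuming f(x) = |x| (the start 0.75 and step 0.25 are hypothetical, chosen as exactly representable floats): the kink at x = 0 has measure zero, so a random start avoids it almost surely, yet the deterministic iterates of gradient descent can still land on it exactly.

```python
# f(x) = |x| is locally Lipschitz and differentiable everywhere except
# at x = 0, a measure-zero set.
def grad(x):
    # the true gradient wherever it exists (x != 0)
    return 1.0 if x > 0 else -1.0

# A randomly drawn start avoids x = 0 with probability one, but the
# iterates are deterministic functions of the start and step size, so
# nothing prevents them from hitting the kink exactly.
x, step = 0.75, 0.25
for k in range(1, 10):
    x = x - step * grad(x)
    if x == 0.0:
        print(f"iter {k}: landed exactly on the nondifferentiable point x = 0")
        break
```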