Newton's Method for Constrained Optimization | Notes from YYYe's ORML Intensive Lectures
I used to think Newton's method was a bit old-school. Turns out I was the one playing catch-up. These notes are about Newton's method and a clever way to make it work for constrained optimization.

Consider optimizing an unconstrained function $f(\cdot)$ whose Hessian exists. Newton's method works on the gradient: it seeks a point $x^\star$ such that $\nabla f(x^\star) = 0$. Iteratively, at each point $x^k$, it approximates the gradient $\nabla f(\cdot)$ around $x^k$ using second-order derivative information,
$$
\nabla \tilde f(x^{k+1}) \approx \nabla f(x^k) + \nabla^2 f(x^k)(x^{k+1} - x^k),
$$
and solves for $x^{k+1}$ such that this approximation equals zero, which gives the classic update:
$$
x^{k+1} = x^k - \left(\nabla^2 f(x^k)\right)^{-1} \nabla f(x^k).
$$
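To make the update concrete, here is a minimal NumPy sketch of the unconstrained Newton iteration above. The function names, the stopping rule, and the toy objective are my own illustration (not from the lectures); in practice one would also add a line search or other globalization safeguard.

```python
import numpy as np

def newton_unconstrained(grad, hess, x0, tol=1e-10, max_iter=50):
    """Plain Newton iteration: x_{k+1} = x_k - [∇²f(x_k)]^{-1} ∇f(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:      # stop once ∇f(x) ≈ 0
            break
        # Solve ∇²f(x_k) d = -∇f(x_k) rather than forming the inverse explicitly.
        d = np.linalg.solve(hess(x), -g)
        x = x + d
    return x

# Toy example (my own choice): f(x) = (x1 - 1)^2 + exp(x2) - x2, minimized at (1, 0).
grad = lambda x: np.array([2.0 * (x[0] - 1.0), np.exp(x[1]) - 1.0])
hess = lambda x: np.diag([2.0, np.exp(x[1])])
print(newton_unconstrained(grad, hess, x0=[3.0, 1.5]))   # ≈ [1., 0.]
```

Solving the linear system with the Hessian, instead of inverting it, is the standard way to implement the update; it is cheaper and numerically more stable.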