I am working through Stephen Boyd's book on convex optimization.
As an example, support vector machines are mentioned as an application of the separating hyperplane theorem. I am wondering whether there are other applications of convex optimization to artificial intelligence.
I know very little about AI methods, but as far as I know they involve minimizing functions. Do any of them involve minimizing a convex (or quasiconvex, or log-convex) function?
Best Answer
Many models used in machine learning are either continuous optimization problems (linear/logistic regression, simple neural networks, SVMs) or have pieces that are continuous (more complicated neural networks, regression trees, etc.). A huge part of convex optimization is learning the foundational methods for solving continuous, convex problems.
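To make that first category concrete, here is a minimal sketch of logistic regression as a convex problem; since the negative log-likelihood is convex in the weights, plain gradient descent reaches the global minimum. The data, step size, and iteration count below are illustrative assumptions, not anything from the book:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                     # 200 samples, 3 features
w_true = np.array([1.5, -2.0, 0.5])
# noisy labels, so the minimizer is finite and well-defined
y = (X @ w_true + rng.normal(scale=0.5, size=200) > 0).astype(float)

def loss_and_grad(w):
    """Negative log-likelihood of logistic regression: convex in w."""
    p = 1.0 / (1.0 + np.exp(-X @ w))              # predicted probabilities
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

w = np.zeros(3)
for _ in range(500):
    _, g = loss_and_grad(w)
    w -= 0.5 * g        # fixed step size; convexity guarantees convergence
print(w)                # recovers the direction of w_true (up to scale)
```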
One common complaint is that many ML problems are nonconvex (neural networks being the prime example). However, many of the tools currently used to solve nonconvex problems are the same as, or small variants of, the traditional tools developed for continuous convex problems, such as gradient descent, its stochastic variants, and splitting methods like ADMM (see the sketch below).
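To illustrate how small the variant can be: stochastic gradient descent is just gradient descent with the full gradient replaced by a minibatch estimate. A hedged sketch (the function name and defaults are my own, for illustration):

```python
import numpy as np

def sgd(grad_fn, w0, n_samples, steps=2000, batch=32, lr=0.05, seed=0):
    """Gradient descent, except the gradient is estimated on a random
    minibatch each step -- the only change from the deterministic method."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(steps):
        idx = rng.integers(0, n_samples, size=batch)  # sample a minibatch
        w -= lr * grad_fn(w, idx)                     # same update rule as GD
    return w
```

Here `grad_fn(w, idx)` would return the gradient of the loss restricted to the rows `idx`, e.g. the logistic loss from the previous snippet evaluated on `X[idx], y[idx]`.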
In fact, methods like stochastic gradient descent, and even its asynchronous versions (all the rage in ML today), appeared as early as Bertsekas' Nonlinear Programming book in the 1990s, which was aimed primarily at convex optimization. Methods like ADMM were proposed and analyzed in the 1970s; I don't see them appearing much in industrial ML, but in theoretical ML they are favored for the ease with which they decompose large problems and exploit sparsity. A key concept from convex optimization that you will get from Boyd/Vandenberghe, and might not get from ML blogs, is duality and dual forms, which matter in ML mostly because of SVMs. (For kernel SVMs, we always solve the dual form; see the worked example below.)
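To see why duality matters for SVMs specifically, here is the textbook hard-margin primal and its Lagrange dual (standard material, covered in Boyd/Vandenberghe's chapter on duality):

$$
\begin{aligned}
\text{primal:}\quad &\min_{w,\,b}\ \tfrac12\|w\|_2^2
\quad\text{s.t.}\ \ y_i\,(w^\top x_i + b) \ge 1,\quad i=1,\dots,n,\\[4pt]
\text{dual:}\quad &\max_{\alpha \ge 0}\ \sum_{i=1}^n \alpha_i
 - \tfrac12 \sum_{i=1}^n\sum_{j=1}^n \alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j
\quad\text{s.t.}\ \ \sum_{i=1}^n \alpha_i y_i = 0.
\end{aligned}
$$

The training data enter the dual only through the inner products $x_i^\top x_j$, so they can be replaced by any kernel $k(x_i, x_j)$. That substitution is the kernel trick, and it is exactly why kernel SVMs are solved in the dual.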
Therefore, learning about convex optimization, even though it is restricted to convex problems, can be extremely helpful as a foundation for understanding the newer, "hotter" methods of today.
That being said...
There are several key differences between learning ML and learning optimization (convex or otherwise), and I think it is important to be aware of them, so you don't put too much emphasis on skills that may have less relevance to ML. To me, they are:
I'm guessing I wrote a lot of controversial things; I have worked on convex optimization for 7 years now and ML for the past 3 years. I suppose someone with a different background may have an entirely different perspective.