[Math] Applications of algebraic geometry to machine learning

ag.algebraic-geometry, st.statistics

I am interested in applications of algebraic geometry to machine learning. I have found some papers and books, mainly by Bernd Sturmfels, on algebraic statistics and machine learning. However, all of this seems to be applicable only to rather low-dimensional toy problems. Is this impression correct? Is there something like computational algebraic machine learning that has practical value for real-world problems, maybe even very high-dimensional problems, like computer vision?

Best Answer

One useful remark is that dimension reduction is a critical problem in data science, with a variety of well-established approaches. It is important because many good machine learning algorithms have complexity that depends on the number of parameters used to describe the data (sometimes exponentially!), so reducing the dimension can turn an impractical algorithm into a practical one.

This has two implications for your question. First, if you invent a cool new algorithm, then don't worry too much about the dimension of the data at the outset: practitioners already have a bag of tricks for dealing with it (e.g. Johnson-Lindenstrauss embeddings, principal component analysis, various sorts of regularization), as in the sketch below. Second, it seems to me that dimension reduction is itself an area where more sophisticated geometric techniques could be brought to bear; many of the existing algorithms already have a geometric flavor.
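To make the first point concrete, here is a minimal sketch (in Python with NumPy; the data sizes and variable names are hypothetical, chosen only for illustration) of a Johnson-Lindenstrauss style random projection: points in 10,000 dimensions are mapped to 200 dimensions while approximately preserving pairwise distances, which is often all a downstream learning algorithm needs.

```python
import numpy as np

# Johnson-Lindenstrauss style dimension reduction (sketch):
# project high-dimensional points onto a random Gaussian subspace.
# Pairwise distances are approximately preserved with high probability
# when the target dimension k is on the order of log(n) / eps^2.

rng = np.random.default_rng(0)

n, d = 1000, 10_000   # hypothetical: 1000 points in 10,000 dimensions
k = 200               # target dimension

X = rng.normal(size=(n, d))               # stand-in for real data
R = rng.normal(size=(d, k)) / np.sqrt(k)  # random projection matrix
X_low = X @ R                             # reduced data, shape (n, k)

# Spot-check that a pairwise distance is roughly preserved.
orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(X_low[0] - X_low[1])
print(f"original distance {orig:.2f}, projected distance {proj:.2f}")
```

The point of the sketch is only that the projection is cheap, oblivious to the data, and independent of whatever algorithm you run afterwards, which is why practitioners treat high ambient dimension as a solvable preprocessing issue rather than a deal-breaker.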

That said, there are a lot of barriers to entry for new machine learning algorithms at this point. One problem is that the marginal benefit of a great answer over a good answer is not very high (and sometimes even negative!) unless the data has been studied to death. The marginal gains are much higher for feature extraction, which is really more of a domain-specific problem than a math problem. Another problem is that many new algorithms which draw upon sophisticated mathematics wind up answering questions that nobody was really asking, often because they were developed by mathematicians who tend to focus more on techniques than applications. Topological data analysis is a typical example: it has generated a lot of excitement among mathematicians, but most data scientists have never heard of it, and the reason is simply that higher-order topological structures have not so far been found to be relevant to practical inference and classification problems.