Mathematics research relating to machine learning

computer science, machine learning, pr.probability, reference-request, st.statistics

Which branches of math are most relevant to improving machine learning (mostly in terms of practical use, as opposed to theoretical or possible use)? Specifically, I want to know about math research that is used to improve machine learning.

It would be very helpful if you could direct me to some books/blogs/survey papers.

Best Answer

Machine learning is a huge area, and so draws from many different parts of math. Hence, you might get multiple answers emphasizing different things.

First, the linked thread "Mathematics for machine learning" is about what math someone should learn before diving into machine learning. That's different from what research might be most impactful. Still, that thread mentions optimization, probability and statistics, linear algebra, harmonic/Fourier analysis, approximation theory, topology, embedding theory, functional analysis, and control theory. Another recent thread asked about connections between higher categories and machine learning.

Another resource that might be of interest is Data Science for Mathematicians, edited by Nathan Carter. It assumes the reader is a mathematician (at, say, the graduate-student level), then gives high-level treatments of:

  • programming with data,
  • linear algebra (and its applications to data analytics),
  • basic statistics,
  • clustering,
  • operations research,
  • dimensionality reduction,
  • machine learning,
  • deep learning, and
  • topological data analysis

I should disclose that I wrote one of the chapters, but don't have any financial stake in the book. I recommend it because I think it's great, and will help mathematicians who want to embrace data science in their research, teaching, or as an alternative career.

In terms of "enhancing machine learning," there are several directions, listed below in no particular order:

  1. Finding faster algorithms for basic operations such as matrix multiplication and the SVD. Also, designing and analyzing algorithms for big data, including parallel computing and streaming computation. What's the "right" notion of computational complexity here, and how can you analyze such algorithms? Comparatively little has been done.
  2. Proving impossibility results to prevent people from going down bad avenues.
  3. Improving the foundations of network theory, databases, etc. Some of this is discussed in my book (joint with Tom Bressoud) Introduction to Data Systems.
  4. Improving the mathematical foundations underlying causality theory. Start with causal networks, then look at how people try to infer causal relationships from observational data, and do the math necessary to see that such an argument actually works and doesn't miss confounding sources, given the probability distributions involved.
  5. Differential privacy.
  6. Coming up with a better, richer mathematical model to explain what is going on inside a neural network. Right now, one of the biggest issues is that machine learning algorithms output an answer and we can't figure out why. This is a worthy research area for mathematicians to get involved in.
  7. There's still plenty of research to do in probability and statistics that is relevant to machine learning, for example developing new statistical tests for significance, measures of effect size, etc.
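To make item 4 a bit more concrete, here is a minimal sketch of how a confounder can reverse a naive conclusion drawn from observational data (Simpson's paradox). The counts are illustrative, patterned on the classic kidney-stone example, and the function names are hypothetical. The adjustment step assumes that stone size is the only confounder, which is exactly the kind of assumption the math needs to justify.

```python
from fractions import Fraction

# Illustrative (successes, patients) counts per treatment and stratum.
# Assumption: patterned on the classic kidney-stone example; treat the
# numbers as an illustration, not as data from this thread.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}
strata = ("small", "large")


def stratum_rate(treatment, stratum):
    """Success rate of a treatment within one stratum of the confounder."""
    successes, total = outcomes[treatment][stratum]
    return Fraction(successes, total)


def pooled_rate(treatment):
    """Naive success rate that ignores the confounder entirely."""
    successes = sum(outcomes[treatment][s][0] for s in strata)
    total = sum(outcomes[treatment][s][1] for s in strata)
    return Fraction(successes, total)


def adjusted_rate(treatment):
    """Stratum rates averaged with the population-wide stratum weights.

    This is the standard adjustment formula; it is only valid under the
    assumption that the stratum variable is the sole confounder.
    """
    sizes = {s: sum(outcomes[t][s][1] for t in outcomes) for s in strata}
    n = sum(sizes.values())
    return sum(Fraction(sizes[s], n) * stratum_rate(treatment, s) for s in strata)


for t in outcomes:
    per_stratum = {s: round(float(stratum_rate(t, s)), 3) for s in strata}
    print(t, per_stratum,
          "pooled:", round(float(pooled_rate(t)), 3),
          "adjusted:", round(float(adjusted_rate(t)), 3))
```

With these counts, treatment A does better within each stratum, yet B looks better once the strata are pooled, because A was given mostly to the harder cases. Reweighting by the population-wide stratum sizes restores the within-stratum ordering.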

It might help to poke around on arXiv and find papers doing the kind of thing you're interested in, then use Google Scholar to look up other papers by the same authors, or look up their webpages and research groups. There are also folks working on these kinds of questions from outside of a university setting, like the Topos Institute.

Because the number of ways to do great research that enhances machine learning is vast, it's best to pick something concrete and get to work, instead of trying to understand every possible avenue before starting. That said, one very valuable thing academic mathematicians can bring to the world of machine learning is a "big picture" view. So even as you're working on concrete problems, stop every so often to ponder the big questions: what are the major issues with machine learning today, and how could math help model, streamline, explain, validate, and make predictions related to those issues?