[Math] What are possible applications of deep learning to research mathematics

big-list · experimental-mathematics · gm.general-mathematics · machine-learning · soft-question

No doubt everyone here has heard of deep learning, even if they don't know what it is or what it is good for. I myself am a former mathematician turned data scientist who is quite interested in deep learning and its applications to mathematics and symbolic reasoning. There has been a lot of progress recently, and while it is exciting to machine learning experts, the results so far are probably not yet useful to research mathematicians. My question is simple:

Are there areas of research mathematics where access to a fully trained, state-of-the-art machine learning model (like the ones I describe below) would make a positive impact on the field?

While math is mostly about proofs, there is also a need sometimes for computation, intuition, and exploration. These are the things that deep learning is particularly good at. Let me provide some examples:

Good intuition or guessing

Lample and Charton showed that Transformers, a now very standard type of neural network, are good at solving symbolic problems of the form
$$ \mathit{expr}_1 \mapsto \mathit{expr}_2 $$
where $\mathit{expr}_1$ and $\mathit{expr}_2$ are both symbolic expressions; in their paper, for example, $\mathit{expr}_1$ was an expression to integrate and $\mathit{expr}_2$ was its integral. The model wasn't perfect: it got the right answer about 93–98% of the time and did best on the types of problems it was trained on. Also, integration is a 200-year-old problem, so it is hard to outcompete a state-of-the-art CAS.

However, there are some things which make this interesting. Symbolic integration is important, difficult, and (somewhat) easy to check: one can compute the derivative of the proposed solution and verify symbolically that it is equivalent to the starting integrand. It is also an area where “intuition” and “experience” definitely help, since a trained human integral solver can quickly guess at the right solution. Lastly, it is (relatively) easy to generate an unlimited supply of training examples through differentiation. (The paper also uses other tricks to diversify the training set.)
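The last point, generating training data by running the easy direction, can be made concrete. Here is a minimal toy sketch (not the paper's actual pipeline): a tiny hypothetical expression grammar with a symbolic differentiator, used to produce (derivative, expression) pairs on which a model could learn the reverse map.

```python
import random

# Tiny toy expression grammar: ('x',), ('const', c), ('add', a, b),
# ('mul', a, b), ('sin', a).  A stand-in for a real symbolic grammar.

def diff(e):
    """Symbolic derivative of a toy expression with respect to x."""
    op = e[0]
    if op == 'x':
        return ('const', 1)
    if op == 'const':
        return ('const', 0)
    if op == 'add':
        return ('add', diff(e[1]), diff(e[2]))
    if op == 'mul':  # product rule
        return ('add', ('mul', diff(e[1]), e[2]), ('mul', e[1], diff(e[2])))
    if op == 'sin':  # chain rule
        return ('mul', ('cos', e[1]), diff(e[1]))
    raise ValueError(op)

def random_expr(depth, rng):
    """Sample a random expression from the toy grammar."""
    if depth == 0:
        return rng.choice([('x',), ('const', rng.randint(1, 5))])
    op = rng.choice(['add', 'mul', 'sin'])
    if op == 'sin':
        return ('sin', random_expr(depth - 1, rng))
    return (op, random_expr(depth - 1, rng), random_expr(depth - 1, rng))

def make_training_pairs(n, seed=0):
    """Each pair maps a derivative back to an antiderivative:
    a model would be trained on the input diff(f) with target f."""
    rng = random.Random(seed)
    return [(diff(f), f) for f in (random_expr(2, rng) for _ in range(n))]

pairs = make_training_pairs(3)
```

The forward direction (differentiation) is mechanical; the model's job is the hard reverse direction, which is exactly the "easy one way, hard the other" structure asked about below.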

Are there similar problems in cutting edge mathematics, possibly in algebra or combinatorics or logic where one would like to reverse a symbolic operation that is easy to compute in one direction but not the other?

Neural guided search

Some problems are just some sort of tree or graph search, such as solving a Rubik's cube. A neural network can, given a scrambled cube, suggest the next action toward solving it. A good neural network would be able to provide a heuristic for a tree or graph search and would prevent exponential blow-up compared to a naive brute-force search. Indeed, a paper in Nature demonstrated training a neural network to solve the Rubik's cube from scratch this way with no mathematical knowledge. Their neural network-guided tree search, once trained, can perfectly solve a scrambled cube. This is also similar to the idea behind AlphaGo and its variants, as well as the idea behind neural formal theorem proving—which is really exciting, but also not up to proving anything useful for research math.

Puzzle cubes and board games are not cutting-edge math, but one can imagine more interesting domains where, just as with a Rubik's cube, one has to manipulate one expression into another form through a series of simple actions, and the ability to do that reliably would be of great interest. (Note: the neural-guided tree search I've described is still a search algorithm, and much slower than a typical cube-solving algorithm, but the emerging field of program synthesis may one day learn from scratch a computer program which solves the Rubik's cube, as well as programs that solve more interesting math problems.)
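The shape of a heuristic-guided search is worth sketching. Below is a minimal best-first search on a toy puzzle (reach a target integer from 1 using the actions +1 and *2); the `heuristic` function is a hand-written stand-in for the value estimate a trained network would supply, and the whole setup is hypothetical.

```python
import heapq

def heuristic(state, target):
    # Stand-in for a trained neural network's value estimate;
    # here, simply the distance to the target.
    return abs(target - state)

def guided_search(start, target, max_expansions=10_000):
    """Best-first search over states, expanding the most promising
    state first according to the heuristic.  Actions: add 1 or double.
    Returns the action sequence reaching the target, or None."""
    frontier = [(heuristic(start, target), start, [])]
    seen = {start}
    while frontier and max_expansions > 0:
        max_expansions -= 1
        _, state, path = heapq.heappop(frontier)
        if state == target:
            return path
        for action, nxt in (('+1', state + 1), ('*2', state * 2)):
            if nxt not in seen and nxt <= 2 * target:
                seen.add(nxt)
                heapq.heappush(
                    frontier,
                    (heuristic(nxt, target), nxt, path + [action]))
    return None

plan = guided_search(1, 10)
```

A good learned heuristic prunes the exponential tree the same way this hand-written one prunes the toy problem; the search wrapper is what turns the heuristic's fallible guesses into a verified solution.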

Neural information retrieval

Suppose you have a database of mathematical objects (say, integer sequences, finite groups, elliptic curves, or homotopy groups) and you want a user to be able to look up an object in this database. If that object is in the database, you would like the user to find it. If it is not, you would like the user to find similar objects. The catch is that "similar" is hard to define. Deep learning provides a good solution. Each object can be associated with an embedding, which is just a vector in, say, $\mathbb{R}^{128}$. The measure of similarity of two objects is just the inner product of their embeddings. Moreover, there are a number of self-supervised machine learning techniques for constructing these embeddings so that semantically similar objects have similar embeddings. This has already shown a lot of promise in formal theorem proving as premise selection, where one wants to pick the most relevant theorems from a library of theorems to use in the proof of another theorem.
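The retrieval step itself is simple once embeddings exist. A minimal sketch, assuming the embeddings have already been produced by some self-supervised model (the 3-dimensional vectors and object names below are entirely made up for illustration; a real system might use $\mathbb{R}^{128}$):

```python
# Nearest-neighbour lookup over embeddings: similarity between two
# objects is the inner product of their (normalized) embedding vectors.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = sum(a * a for a in v) ** 0.5
    return [a / n for a in v]

def most_similar(query, database, k=2):
    """Return the k database keys whose embeddings have the largest
    inner product with the (normalized) query embedding."""
    q = normalize(query)
    scored = sorted(database.items(),
                    key=lambda kv: dot(q, normalize(kv[1])),
                    reverse=True)
    return [name for name, _ in scored[:k]]

# Hypothetical toy embeddings for three integer-sequence "objects".
db = {
    'squares': [0.9, 0.1, 0.0],
    'cubes':   [0.8, 0.2, 0.1],
    'primes':  [0.0, 0.1, 0.9],
}
hits = most_similar([1.0, 0.0, 0.0], db, k=2)
```

All the machine learning effort goes into making the embedding map meaningful; the lookup itself is just linear algebra (and scales with standard approximate-nearest-neighbour indexes).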

For example, I think such a neural database search could reasonably work for the OEIS: one can train a neural network to perform various prediction tasks on an integer sequence, and the inner layers of the trained network then yield a vector embedding for each sequence which can be used to search the database for related sequences.

Geometric intuition

Neural networks are pretty good at image recognition and other image tasks (like segmenting an image into parts). Are there geometry tasks that it would be useful for a deep learning agent to perform, possibly in dimensions 4 or 5, where human geometric intuition starts to fail us since we can't see in those dimensions? (It would be hard to make, say, a convolutional neural network work directly on a 4-dimensional image, but I could imagine representing a 3D surface embedded in 4D as a point cloud of coordinates. This could work well with Transformer neural networks.)
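The point-cloud representation is easy to illustrate. As a toy example (my choice of surface, not anything from a particular paper), here is a sampler for the Clifford torus, a 2-dimensional surface embedded in $\mathbb{R}^4$ that lies on the unit 3-sphere; a set of such coordinate tuples is exactly the kind of input a Transformer could consume:

```python
import math
import random

def clifford_torus_cloud(n, seed=0):
    """Sample n points from the Clifford torus
    {(cos a, sin a, cos b, sin b) / sqrt(2)} in R^4.
    Every point has unit norm, so the surface sits on the 3-sphere."""
    rng = random.Random(seed)
    s = 1 / math.sqrt(2)
    cloud = []
    for _ in range(n):
        a = rng.uniform(0, 2 * math.pi)
        b = rng.uniform(0, 2 * math.pi)
        cloud.append((s * math.cos(a), s * math.sin(a),
                      s * math.cos(b), s * math.sin(b)))
    return cloud

cloud = clifford_torus_cloud(256)
```

Since a point cloud is an unordered set of coordinate tuples, it sidesteps the need for a 4D grid that a convolutional network would require.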

Build your own task

Neural networks are very flexible when it comes to the choice of input and output, as well as the architecture, so don’t let the specific examples above constrain your thinking too much. All you need are the following things:

  1. A type of mathematical object. One that you care about. It should be representable in some finite way (as a formula, an initial segment of a sequence, an image, a graph, a computer program, a movie, a list of properties).
  2. A task to perform on your object. It can be well specified or fuzzy. It can be solvable, or (like integration) only partially solvable. It can be classifying your objects into finitely many buckets. It could be computing some other related object or property of the object. It could be finding the next element in a sequence. It could be coming up with a prediction or conjecture of some sort. It could even be turning that object into a 2D cartoon image, just to think outside the box.
  3. Lots of training data. Either you need to be able to synthetically generate many training examples, as in the integration example above, or, like the OEIS, have a large dataset of tens of thousands of examples (more examples is always better). Clean data is preferred, but neural networks handle messy data very well. Another “data-free” option is reinforcement learning, as in the Rubik’s Cube example or AlphaGo Zero, where the agent learns to explore the problem on its own (and generates data through that exploration).
  4. Patterns in the data, even if you can’t see what they are. Your task should be one where there are patterns which help the machine learning agent solve the problem. (For example, I’m not convinced that factoring large integers would be a good task.)
  5. Motivation. Why would this be useful to the field? What purpose would having this trained model serve? Would it make it easier to conjecture facts, explore new areas of math, or wrap one's head around a bunch of confusing formulas? Or do you have a way to turn a learned heuristic into a proof, such as with a search algorithm (as in the Rubik’s cube example above) or by checking the solution (as in the integration example above)?
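The "checking the solution" route in point 5 deserves a concrete sketch. A model's guessed antiderivative can be filtered cheaply before any symbolic verification, for example by comparing a finite-difference derivative of the guess against the integrand at random points (the functions and tolerances below are illustrative choices, not a standard recipe):

```python
import random

def looks_like_antiderivative(F, f, trials=100, h=1e-6, tol=1e-3, seed=0):
    """Cheap numerical filter for the claim F' = f: compare a
    central-difference derivative of F against f at random sample
    points.  A guess that passes could then be verified symbolically."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(-2.0, 2.0)
        approx = (F(x + h) - F(x - h)) / (2 * h)
        if abs(approx - f(x)) > tol:
            return False
    return True

# A correct guess passes; a wrong one is rejected.
good = looks_like_antiderivative(lambda x: x**3 / 3, lambda x: x**2)
bad = looks_like_antiderivative(lambda x: x**3, lambda x: x**2)
```

This is exactly the asymmetry that makes learned heuristics safe to use: the model may guess unreliably, but a fast checker ensures only correct guesses survive.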

Best Answer

In the context of algebraic geometry, neural networks have become useful tools in the study of Calabi-Yau manifolds. The computation of their topological invariants, metrics and volumes is of particular interest for applications in physics (string theory). Recent contributions include


For a more general overview, see "Machine-Learning Mathematical Structures":

We review, for a general audience, a variety of recent experiments on extracting structure from machine-learning mathematical data that have been compiled over the years. Focusing on supervised machine-learning on labeled data from different fields ranging from geometry to representation theory, from combinatorics to number theory, we present a comparative study of the accuracies on different problems. The paradigm should be useful for conjecture formulation, finding more efficient methods of computation, as well as probing into certain hierarchy of structures in mathematics.