Geometric formulation of the subject of machine learning

machine-learning, soft-question

Question:

What is the geometric interpretation of the subject of machine learning and/or deep learning?

Being "forced" to have a closer look at the subject, I have the impression that it all boils down to approximately reconstruct the characteristic function that decides about set-membership of the points of the feature space.

To me the essential "innovation" seems to be that the function must be calculated on the basis of the signum of its values at a certain sample of coordinates, which represents the "training" set.

Say we are given finite sets $C\subset\mathcal{C}$ and $D\subset\bar{\mathcal{C}}$ (samples from a class and from its complement). My impression is that the goal is to find a function whose zero set partitions the feature space so that the signum of the function values agrees with the characteristic function of $\mathcal{C}$ with high probability; is that impression right?
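
To state this as a formula (one possible formalization of my impression, not necessarily the standard one): writing $\chi_{\mathcal{C}}$ for the characteristic function of $\mathcal{C}$ and $\mu$ for the distribution of feature points, the goal would be
$$\min_{f}\ \Pr_{x\sim\mu}\Big[\operatorname{sgn}\big(f(x)\big)\neq 2\,\chi_{\mathcal{C}}(x)-1\Big],$$
where $2\chi_{\mathcal{C}}-1$ merely maps $\{0,1\}$ to $\{-1,+1\}$ so that the comparison with the signum makes sense, and the finite samples $C$ and $D$ are the only access one has to $\chi_{\mathcal{C}}$.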


**Addendum:**
my final "working assumption" about what Machine Learning, resp. Deep Learning is actually about, is to play a sophisticated version of [battleship](https://en.wikipedia.org/wiki/Battleship_(game)) in disguise.
The analogues of the ship geometries are more complicated shapes also in higher dimensions and spaces with non-euclidean distance metrics.
The whole subject of research is to devise methods to sink the opponent's ships faster.

Best Answer

There are several geometric aspects of machine learning.

  1. You can think of the goal of ML as function approximation. Your question mentions reconstructing a characteristic function, but people often look at approximating other types of functions as well ("regression" as opposed to "classification"). The geometry here comes from defining what approximation means, i.e., the "loss function" (see the first sketch after this list).

  2. Many models (i.e., members of a parameterized family of functions) themselves have a natural kind of geometry: that is, you can look at distances between models based on their parameters rather than the functions they compute. A Riemannian metric on the parameters is useful in defining gradients for the learning process (second sketch below).

  3. In deep learning, one often looks at representations of data in a geometric way. A classic example is word embeddings, which associate words with points in a high-dimensional space, so that geometric operations (say, addition and subtraction) correspond to meaningful linguistic operations (say, analogical reasoning); see the third sketch below.
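
To illustrate point 1, here is a minimal sketch contrasting the two notions of approximation; the toy data and the linear model family `f(x) = w*x` are assumptions made purely for illustration, not anyone's actual pipeline:

```python
import numpy as np

# Toy 1-D data: real-valued targets (regression) and +/-1 labels
# (classification, i.e., the characteristic-function view from the question).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y_reg = 2 * x + rng.normal(0, 0.1, size=100)  # noisy regression targets
y_cls = np.sign(x)                            # class labels in {-1, +1}

def f(x, w):
    """A hypothetical model family: linear functions f(x) = w * x."""
    return w * x

def squared_loss(w):
    # "Approximation" in the regression sense: average squared distance.
    return np.mean((f(x, w) - y_reg) ** 2)

def zero_one_loss(w):
    # "Approximation" in the classification sense: how often the signum
    # of f disagrees with the label.
    return np.mean(np.sign(f(x, w)) != y_cls)

print(squared_loss(2.0), zero_one_loss(2.0))  # small squared loss, zero misclassification
```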
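
For point 2, a minimal sketch of a Riemannian metric on parameters, assuming the two-parameter Gaussian family $N(\mu,\sigma^2)$ (chosen because its Fisher information matrix has a closed form); preconditioning the ordinary gradient with the inverse Fisher matrix gives the natural gradient. The step size and iteration count are arbitrary illustration choices:

```python
import numpy as np

# Fit (mu, sigma) of a Gaussian by natural-gradient descent on the
# average negative log-likelihood. The Fisher information matrix acts
# as a Riemannian metric on the parameter space.
rng = np.random.default_rng(0)
data = rng.normal(3.0, 2.0, size=1000)

def neg_log_lik_grad(mu, sigma):
    # Gradient of the average negative log-likelihood w.r.t. (mu, sigma).
    d_mu = -np.mean(data - mu) / sigma**2
    d_sigma = 1.0 / sigma - np.mean((data - mu) ** 2) / sigma**3
    return np.array([d_mu, d_sigma])

def fisher(mu, sigma):
    # Fisher information of N(mu, sigma^2) in (mu, sigma) coordinates.
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

mu, sigma = 0.0, 1.0
for _ in range(100):
    g = neg_log_lik_grad(mu, sigma)
    step = np.linalg.solve(fisher(mu, sigma), g)  # natural gradient step
    mu, sigma = mu - 0.1 * step[0], sigma - 0.1 * step[1]

print(mu, sigma)  # should approach the sample mean and standard deviation
```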
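
And for point 3, a toy version of the word-embedding arithmetic. The 2-D vectors below are hand-made (not trained embeddings) and deliberately constructed so that the offset king − man equals queen − woman:

```python
import numpy as np

# Hand-made "word vectors" (purely hypothetical) in which vector
# arithmetic answers the analogy "man : king :: woman : ?".
vectors = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([0.0, 1.0]),
    "king":  np.array([1.0, 2.0]),   # man + a "royalty" offset of (0, 2)
    "queen": np.array([0.0, 3.0]),   # woman + the same offset
}

def nearest(target, exclude):
    # Cosine similarity picks the closest remaining vocabulary word.
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return max((w for w in vectors if w not in exclude),
               key=lambda w: cos(vectors[w], target))

query = vectors["king"] - vectors["man"] + vectors["woman"]  # = (0, 3)
print(nearest(query, exclude={"king", "man", "woman"}))      # -> "queen"
```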

Taking a geometric point of view on ML is incredibly powerful! That said, it's also just one part of the subject, and thinking of ML only in terms of pure geometry is like thinking of probability only in terms of measure theory: not technically wrong, but not always the most helpful viewpoint.
