[Math] Group theory in machine learning

Tags: harmonic-analysis, lie-groups, machine-learning, reference-request, st.statistics

I'm a Machine Learning researcher who would like to research applications of group theory in ML.

There is a term "partially observed groups" in machine-learning theory, popularized by recent work on understanding deep learning. The idea is simple: instead of learning a recognition function (image → object class), the brain learns orbits (image → object orbit under the action of a group).

For example, all images of a bottle are 2-d projections (hence "partially observed") of the 3-d bottle as it moves through its orbit under the action of a group (rotations or translations).
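As a toy sketch of this picture (the point-cloud stand-in for the bottle and the function names here are my own illustration, not from any particular paper): a group element acts on the 3-d object, and the "image" is the 2-d projection of the result, so distinct group elements give distinct partial views of one orbit.

```python
import numpy as np

# Toy 3-d "object": a point cloud standing in for a bottle.
rng = np.random.default_rng(0)
obj = rng.normal(size=(100, 3))

def rotation_z(theta):
    """A group element of SO(3): rotation by theta about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def observe(points, g):
    """Partial observation: act with g, then project to 2-d by dropping z."""
    return (points @ g.T)[:, :2]

# Two "images" in the same orbit: one object, two group elements.
img_a = observe(obj, rotation_z(0.3))
img_b = observe(obj, rotation_z(1.2))
```

The two images come from the same object, yet neither observation alone pins down which group element produced it.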

I'm having difficulty finding relevant literature on probability distributions over group elements, or even on the concept of partially observed groups. I was hoping this might be a well-developed concept (perhaps in physics) under another name. Any suggestions for relevant work, or other literature that could guide me?

Thank you for your time.

Clarification:

The hypothesis I'm exploring is that the representations learned by neural networks via gradient descent work as well as they do because they are group-invariant. So literature on Lie groups is very relevant.
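A minimal concrete instance of a group-invariant representation, for the rotation group (the helper name `invariant_rep` is my own): the sorted multiset of pairwise distances of a point cloud is unchanged by any rotation, so it is constant along the whole orbit, which is exactly the property the hypothesis attributes to learned representations.

```python
import numpy as np

def invariant_rep(points):
    """Sorted pairwise distances: invariant under rotations (and translations)."""
    diffs = points[:, None, :] - points[None, :, :]
    d = np.linalg.norm(diffs, axis=-1)
    return np.sort(d[np.triu_indices(len(points), k=1)])

rng = np.random.default_rng(1)
obj = rng.normal(size=(50, 3))

# A group element of SO(3): rotation about the z-axis.
theta = 0.8
c, s = np.cos(theta), np.sin(theta)
g = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# The representation is constant along the orbit of obj.
assert np.allclose(invariant_rep(obj), invariant_rep(obj @ g.T))
```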

But the distinguishing feature of this setting is that the group elements act as transformations on 3-d objects, while we only observe 2-d projections, so we must define probability distributions over the product (object × group element).

The problem is therefore under-determined, and distributions over group elements become necessary to describe what generated the observed image: many objects can be transformed to produce very similar images. I thought analogous work might have been done in physics. I hope this clarifies things.
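One way to write the under-determined generative model just described (my notation, not from a specific reference): with object $x$, group element $g$ drawn from a prior on $G$, and $P$ the 2-d projection,

```latex
p(y) \;=\; \sum_{x} \int_{G} p(x)\, p(g)\, \delta\!\big(y - P(g \cdot x)\big)\, \mathrm{d}g .
```

Inference then targets the posterior $p(x, g \mid y)$, which is not identifiable pointwise, since many pairs $(x, g)$ can produce the same $y$; only distributions over (object × group element) are meaningful.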

Best Answer

I suggest taking a look at the work of Ulf Grenander. His 1963 book *Probabilities on Algebraic Structures* laid out the basis for applying probability theory to groups (Chapter 4 is on stochastic Lie groups) and other algebraic structures. He continued to develop these ideas in the context of pattern recognition (see his later book *General Pattern Theory*).

There is definitely newer work in this area (some of which is mentioned in other answers), but Grenander has been investigating these ideas for 40+ years and is worth looking into.

As an aside, Grenander's approach is rather more formal than much contemporary machine-learning research, which is a strength or a weakness depending on the reader's taste.