How do SOMs reduce the dimensionality of data?

dimensionality reduction, self-organizing maps

This is a problem I have been grappling with for days. From my research on self-organizing maps, I know that a common use of SOMs is to reduce the dimensionality of data. For example, if you had a 3×3 SOM and an input space consisting of 50 10-dimensional vectors, the SOM would reduce this to 50 2-dimensional vectors. If I am creating my own SOM, where is this data? Please excuse me if my question is too vague or broad.

The reference vectors attached to each neuron in the SOM have the same dimension as the input space, and the input space itself does not get reduced in dimensionality. So where is the reduced-dimension data? In other words, which data structure in the self-organizing map contains it? My only guess is that it could be found in the location of each node in the map.

Thanks!

Best Answer

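The SOM grid is a 2-d manifold or topological space onto which each observation in the 10-d space is mapped. An observation is assigned to the grid cell whose prototype (codebook vector) is most similar to it, so its reduced representation is the 2-d grid location of that cell. Your guess is right: the low-dimensional data is found in the node locations, not in the prototypes themselves.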

The SOM grid is non-linear in the full-dimensional space; the "grid" is warped to fit the input data more closely during training. However, the key point in terms of dimension reduction is that distances can be measured in the topological space of the grid, i.e. in 2 dimensions, instead of in the full $m$ dimensions (where $m$ is the number of variables).
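To make the two spaces concrete, here is a minimal sketch of the standard online SOM update in NumPy. It is only an illustration, not a tuned implementation: the function name `train_som`, the linear decay schedules, and the hyperparameter defaults (`n_epochs`, `lr0`, `sigma0`) are my own assumptions, and the data is synthetic. Note that the best matching unit is found in the $m$-dimensional input space, while the neighbourhood that decides how strongly each prototype moves is measured on the 2-d grid.

```python
import numpy as np

def train_som(X, rows=3, cols=3, n_epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Minimal online SOM training sketch (hyperparameters are illustrative)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape

    # Fixed 2-d grid coordinates of each cell, shape (rows*cols, 2).
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)

    # Prototypes live in the input space, shape (rows*cols, m).
    # Here they start as randomly chosen inputs (requires n >= rows*cols).
    codebook = X[rng.choice(n, size=rows * cols, replace=False)].copy()

    n_steps = n_epochs * n
    step = 0
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                # assumed linear decay
            sigma = sigma0 * (1.0 - frac) + 0.1    # assumed linear decay with a floor

            x = X[i]
            # Best matching unit: nearest prototype in the m-dimensional space.
            bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))

            # Neighbourhood weights: squared distance measured on the 2-d grid.
            g2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-g2 / (2.0 * sigma ** 2))

            # Pull each prototype toward x, weighted by grid proximity to the BMU.
            codebook += lr * h[:, None] * (x - codebook)
            step += 1
    return grid, codebook

# Synthetic example: 50 observations with m = 10 variables.
X = np.random.default_rng(1).normal(size=(50, 10))
grid, codebook = train_som(X)
print(grid.shape, codebook.shape)
```

The `grid` array never changes during training; only `codebook` moves. That is the sense in which the grid is "warped" in the input space while remaining a regular 2-d lattice in its own coordinates.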

Put simply, the SOM is a mapping from the $m$-dimensional input space onto the 2-d SOM grid.
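As a rough illustration of where the reduced data lives in a hand-written SOM, the sketch below maps 50 synthetic 10-dimensional vectors onto a 3×3 grid. The helper name `map_to_grid` is hypothetical, and the codebook here is random rather than trained, since the point is only the data structure: the output is an array with one (row, column) grid coordinate per input.

```python
import numpy as np

rng = np.random.default_rng(0)

rows, cols, m = 3, 3, 10            # 3x3 grid, 10-dimensional inputs
X = rng.normal(size=(50, m))        # 50 input vectors (synthetic, for illustration)

# Codebook: one m-dimensional prototype per grid cell.
# Untrained random values here; a real SOM would use trained prototypes.
codebook = rng.normal(size=(rows, cols, m))

def map_to_grid(X, codebook):
    """Return the (row, col) grid coordinate of the best matching unit per input."""
    rows, cols, m = codebook.shape
    flat = codebook.reshape(rows * cols, m)
    # Squared Euclidean distance from every input to every prototype.
    d2 = ((X[:, None, :] - flat[None, :, :]) ** 2).sum(axis=2)
    bmu = d2.argmin(axis=1)          # flat index of the nearest prototype
    # Convert flat indices back to 2-d grid coordinates.
    return np.column_stack(np.unravel_index(bmu, (rows, cols)))

coords = map_to_grid(X, codebook)
print(X.shape, "->", coords.shape)
```

So the prototypes stay 10-dimensional and the inputs are untouched; the 2-dimensional representation is the `coords` array, which you compute from the node locations after training.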
