Solved – Is a Gaussian-Gaussian RBM just a linear model?

deep learning, deep-belief-networks, machine learning, restricted-boltzmann-machine

The 'conventional' configurations of RBMs are Binary-Binary and Gaussian-Binary (and sometimes Binary-Gaussian) units.

Although it is possible for both the visible and hidden units to be Gaussian, wouldn't a Gaussian-Gaussian RBM just resemble a linear model, since there is no non-linearity in the network's units anymore? Thus, stacking them would not have the same benefit as for, say, Binary-Binary RBMs.
And when using them for dimensionality reduction, wouldn't a simple PCA achieve better results?
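
(To make the "no non-linearity" point concrete: assuming unit variances and the usual bipartite energy, the conditional distribution of a hidden unit given the visibles is Gaussian with a mean that is affine in $v$,

$$p(h_j \mid v) = \mathcal{N}\Big(c_j + \sum_i w_{ij} v_i,\; 1\Big),$$

with no sigmoid or other squashing function, unlike the Binary-Binary case.)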

Am I missing any significant points in the training of RBMs, or are Gaussian-Gaussian RBMs just that limited?

Best Answer

First, notice that you can fix the variance of the hidden units to 1, since the weight matrix can scale them arbitrarily.
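
To see why, here is a sketch with one common parameterization of the Gaussian-Gaussian energy (conventions vary across papers):

$$E(v, h) = \sum_i \frac{(v_i - b_i)^2}{2\sigma_i^2} + \sum_j \frac{(h_j - c_j)^2}{2\tau_j^2} - \sum_{i,j} v_i w_{ij} h_j.$$

Substituting $h_j = \tau_j \tilde{h}_j$ turns the hidden quadratic term into $(\tilde{h}_j - c_j/\tau_j)^2/2$ and the interaction term into $v_i (\tau_j w_{ij}) \tilde{h}_j$, so the hidden variances $\tau_j^2$ are absorbed into rescaled weights $\tilde{w}_{ij} = \tau_j w_{ij}$ and biases $\tilde{c}_j = c_j/\tau_j$, leaving unit-variance hidden units.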

Then:

  • If you learn the variance of each visible unit, you get factor analysis.
  • If the variance of the visible units is tied (shared across units), you get PPCA (of the demeaned data).
  • If you fix the variance of the visible units to a small value (in the limit as it goes to zero), you get pure PCA.

In the last two cases, the weight matrix $W$ will correspond to the leading eigenvectors of the data covariance matrix, up to a rotation.
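
As a numerical sanity check, here is a minimal sketch of that claim. It fits PPCA with the standard EM updates (Bishop, *Pattern Recognition and Machine Learning*, eqs. 12.56-12.57) on synthetic data (the data-generating process and all variable names are just for illustration), then verifies that the learned $W$ spans the same subspace as the leading eigenvectors of the sample covariance:

```python
import numpy as np
from scipy.linalg import subspace_angles

rng = np.random.default_rng(0)

# Synthetic data with a clear low-dimensional structure (for illustration).
n, d, q = 2000, 10, 3
true_W = rng.normal(size=(d, q))
X = rng.normal(size=(n, q)) @ true_W.T + 0.1 * rng.normal(size=(n, d))
X -= X.mean(axis=0)                      # demean, as assumed above
S = X.T @ X / n                          # sample covariance

# EM for PPCA (Bishop PRML eqs. 12.56-12.57).
W = rng.normal(size=(d, q))
sigma2 = 1.0
for _ in range(200):
    M = W.T @ W + sigma2 * np.eye(q)
    Minv = np.linalg.inv(M)
    SW = S @ W
    W_new = SW @ np.linalg.inv(sigma2 * np.eye(q) + Minv @ W.T @ SW)
    sigma2 = np.trace(S - SW @ Minv @ W_new.T) / d
    W = W_new

# Leading eigenvectors of the sample covariance.
eigvals, eigvecs = np.linalg.eigh(S)
U = eigvecs[:, -q:]                      # top-q eigenvectors

# Principal angles between span(W) and span(U): ~0 means same subspace.
print(subspace_angles(W, U))
```

The principal angles come out close to zero, i.e. $\mathrm{span}(W)$ matches the principal subspace even though the columns of $W$ themselves are only determined up to a rotation.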

Stacking several layers is not equivalent to having a single larger layer. The distribution over all the units is still jointly Gaussian, but the layer-to-layer connectivity restricts the covariance matrix to a particular structure.
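
Here is a small sketch of that restriction (all unit conditional variances, chosen purely for illustration). With no within-layer connections, the joint precision matrix over $(v, h_1, h_2)$ is block-tridiagonal, and the marginal visible covariance for a single layer comes out as $(I - WW^\top)^{-1}$, a constrained family rather than an arbitrary positive-definite matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
dv, dh1, dh2 = 4, 3, 2

# Small couplings keep the joint precision positive definite.
W1 = 0.1 * rng.normal(size=(dv, dh1))   # visible <-> first hidden layer
W2 = 0.1 * rng.normal(size=(dh1, dh2))  # first <-> second hidden layer

# Joint precision over (v, h1, h2): diagonal within layers (no lateral
# connections), couplings only between adjacent layers, zeros elsewhere.
P = np.block([
    [np.eye(dv),          -W1,          np.zeros((dv, dh2))],
    [-W1.T,               np.eye(dh1),  -W2                ],
    [np.zeros((dh2, dv)), -W2.T,        np.eye(dh2)        ],
])
assert np.all(np.linalg.eigvalsh(P) > 0)  # a valid joint Gaussian

# The joint is Gaussian, so the marginal over v is too; its covariance
# is the top-left block of P^{-1} -- a structured family, not an
# arbitrary SPD matrix.
Sigma_v = np.linalg.inv(P)[:dv, :dv]

# For a single layer the same computation gives (I - W1 W1^T)^{-1}.
P1 = np.block([[np.eye(dv), -W1], [-W1.T, np.eye(dh1)]])
Sigma_v_single = np.linalg.inv(P1)[:dv, :dv]
print(np.allclose(Sigma_v_single, np.linalg.inv(np.eye(dv) - W1 @ W1.T)))
```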