Solved – How is a Sparse RBM different from a Gaussian-Bernoulli RBM

machine-learning, restricted-boltzmann-machine

Sparse RBMs are described in this paper.

Gaussian-Bernoulli RBMs are described (kinda poorly) in this paper as Gaussian Units,
and more clearly in this master's thesis.

Sparse RBM: (Quoting direct from the paper, Page 3)
$$P(v_{i}|\mathbf{h})=\mathcal{N}(c_{i}+\sum_{j}w_{ij}h_{j},\,\sigma^{2})$$
$$P(h_{j}|\mathbf{v})=logistic\left(\frac{1}{\sigma^{2}}\left(b_{j}+\sum_{i}w_{ij}v_{i}\right)\right)$$

Gaussian-Bernoulli RBMs: (Quoting less directly from the master's thesis, pages 38 and 36. Some substitutions/expansions made)
$$v_{i}|\mathbf{h}\sim\mathcal{N}(c_{i}+\sum_{j}w_{ij}h_{j},\,\sigma_{i}^{2})$$
$$P(h_{j}=1|\mathbf{v})=logistic\left(b_{j}+\sum_{i}w_{ij}\frac{v_{i}}{\sigma_{i}^{2}}\right)$$

So the difference would be a scaling factor of $\frac{1}{\sigma^{2}}$ on $b_{j}$. And since $b_{j}$ is a learned bias controlling the threshold, it will absorb a scaling factor during training.
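That bias-absorption argument can be checked numerically. Below is a minimal numpy sketch (all names and shapes are illustrative) assuming a single shared $\sigma$ for every visible unit: with the hidden bias rescaled by $\frac{1}{\sigma^{2}}$, the two conditionals coincide exactly.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
sigma2 = 0.5                      # shared variance (sparse RBM uses one sigma)
W = rng.normal(size=(4, 3))       # weights w_ij (4 visible, 3 hidden units)
b = rng.normal(size=3)            # hidden biases b_j
v = rng.normal(size=4)            # an arbitrary visible vector

# Sparse-RBM hidden conditional: logistic((b_j + sum_i w_ij v_i) / sigma^2)
h_sparse = logistic((b + v @ W) / sigma2)

# Gaussian-Bernoulli conditional with rescaled bias b_j' = b_j / sigma^2:
# logistic(b_j' + sum_i w_ij v_i / sigma^2)
h_gb = logistic(b / sigma2 + (v / sigma2) @ W)

assert np.allclose(h_sparse, h_gb)   # identical: the bias absorbs 1/sigma^2
```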

They seem to have the same conditional distributions; the formulation of the energy function is superficially different, but I think it simplifies to be the same.

I think the difference is that $\sigma$ in a sparse RBM is a user-definable parameter that controls sparseness,
whereas $\sigma_{i}$ in a Gaussian-Bernoulli RBM is the variance of feature $i$.

They are both trained the same way with Contrastive Divergence.
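For concreteness, here is a rough CD-1 update for a Gaussian-Bernoulli RBM with a fixed shared $\sigma$, in the parametrization of the equations above. This is my own sketch, not code from either paper; the function name, learning rate, and single-sample updates are all illustrative.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, sigma, lr=1e-3, rng=None):
    """One CD-1 update for a Gaussian-Bernoulli RBM (shared, fixed sigma)."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Positive phase: hidden probabilities given the data vector
    ph0 = logistic(b + (v0 / sigma**2) @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)   # sample h ~ Bernoulli
    # Negative phase: one Gibbs step back to the visibles, then the hiddens
    v1 = c + h0 @ W.T + sigma * rng.normal(size=v0.shape)  # v ~ N(mean, sigma^2)
    ph1 = logistic(b + (v1 / sigma**2) @ W)
    # Approximate gradient: <v h>_data - <v h>_reconstruction
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    b += lr * (ph0 - ph1)
    c += lr * (v0 - v1)
    return W, b, c
```

The sparse variant would add a sparsity-gradient term to the `b` (and `W`) updates; the Gibbs sampling itself is unchanged.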

Is this right?
Are there more connections between them?
Have I got it all wrong?

Best Answer

The models in both papers are Gaussian-Bernoulli RBMs. The difference is that the sparse variant includes a term in the objective which penalizes hidden units whose average conditional expectation deviates from a fixed constant, $p$. See equation (4) in section 3.1 of the sparse RBM paper you link.
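That penalty has the form $\lambda\sum_{j}\left(p-\frac{1}{m}\sum_{l}\mathbb{E}[h_{j}\mid\mathbf{v}^{(l)}]\right)^{2}$ over a batch of $m$ examples. A small numpy sketch of computing it (function name and shapes are illustrative), using the sparse-RBM conditional from the question:

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def sparsity_penalty(V, W, b, sigma2, p, lam):
    """lam * sum_j (p - mean over examples of E[h_j | v])^2."""
    # E[h_j | v] for each row of V, sparse-RBM parametrization
    q = logistic((b + V @ W) / sigma2)              # shape (m, n_hidden)
    return lam * np.sum((p - q.mean(axis=0)) ** 2)  # 0 iff mean activation == p
```

During training the (negative) gradient of this term is added to the contrastive-divergence updates for `b` and `W`, pushing each hidden unit's average activation toward the target $p$.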
