Machine Learning – Understanding Bias Units in a Restricted Boltzmann Machine

Tags: bias, machine-learning, neural-networks, restricted-boltzmann-machine

I'm reading up on RBMs, and this point is not obvious to me. I'm imagining RBMs being used for something like the Netflix Prize (since that was one of the papers I read on them).

So you have a bunch of movies in the visible nodes and the genres of those movies in the latent variables, but in the energy function there are bias units for all of the nodes. What do they represent?
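For reference, the standard energy of a joint configuration (v, h) in a binary RBM, with a_i the visible biases and b_j the hidden biases, is:

```latex
% a_i: bias of visible unit i, b_j: bias of hidden unit j,
% w_{ij}: weight between visible unit i and hidden unit j.
E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j
```

The bias terms are the first two sums: each involves a single unit on its own, with no interaction.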

Best Answer

Let's say there are a hundred input nodes that always follow each other. How should the network encode that?

What happens is that some hidden node develops strong weights to these correlated inputs. When most of those inputs are on, the hidden node turns on, which creates a low-energy state in which all of the inputs are on.
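Here is a minimal numeric sketch of that (all numbers made up): with strong weights from a group of correlated inputs to one hidden unit, the all-on configuration gets much lower energy once the hidden "feature" also turns on.

```python
import numpy as np

# Toy sketch, hypothetical numbers: one hidden unit with strong
# weights to a group of correlated visible units.
# Energy of a joint configuration: E(v, h) = -a.v - b.h - v.W.h
def energy(v, h, W, a, b):
    return -a @ v - b @ h - v @ W @ h

n_visible = 5                      # stand-in for the "hundred inputs"
W = np.full((n_visible, 1), 2.0)   # strong weights to a single hidden unit
a = np.zeros(n_visible)            # no visible biases in this sketch
b = np.array([-4.0])               # hidden bias keeps the feature off by default

v_all_on = np.ones(n_visible)
print(energy(v_all_on, np.array([1.0]), W, a, b))  # -6.0: inputs on, feature on
print(energy(v_all_on, np.array([0.0]), W, a, b))  #  0.0: inputs on, feature off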

Conversely, let's say that a single input node is always on. How should the network encode that?

In that case, no hidden node is enlisted; instead, the bias term acts as a hidden node that is always on. It lowers the energy of the state in which this input node is on. Similarly, if an input node is usually off, the bias can encode that.
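As a rough sketch of that (again with made-up numbers), when every hidden unit is off, the energy reduces to the bias terms alone, so the biases by themselves decide which visible states are cheap:

```python
import numpy as np

# A visible bias acts like the weight from a hidden unit that is
# always on. With all hidden units off, E(v) = -a . v.
a = np.array([3.0, -3.0])            # unit 0 usually on, unit 1 usually off

def energy(v):
    return -a @ v

print(energy(np.array([1.0, 0.0])))  # -3.0: matches the biases, low energy
print(energy(np.array([0.0, 1.0])))  #  3.0: fights the biases, high energy
```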

Finally, back to our first example with the hundred inputs that always follow each other: there is no need to create a second feature that turns on when these inputs are all off, because the biases already lower the energy of the all-off state whenever the feature that detects them all being on is itself off.
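Putting the two mechanisms together in one last sketch (hypothetical numbers): negative visible biases make the all-off state cheap on their own, while the single "all on" feature makes the all-on state cheap, so the two frequent configurations both end up with low energy without any extra feature.

```python
import numpy as np

# Same toy energy as before: E(v, h) = -a.v - b.h - v.W.h
def energy(v, h, W, a, b):
    return -a @ v - b @ h - v @ W @ h

n = 5
W = np.full((n, 1), 3.0)   # hidden feature that detects "all inputs on"
a = np.full(n, -1.0)       # negative biases: each input prefers "off"
b = np.array([-6.0])       # the feature itself is off by default

off, on = np.zeros(n), np.ones(n)
h_off, h_on = np.array([0.0]), np.array([1.0])

print(energy(off, h_off, W, a, b))  #  0.0: all off, the biases handle it
print(energy(on, h_on, W, a, b))    # -4.0: all on, the feature pays off
print(energy(on, h_off, W, a, b))   #  5.0: all on without the feature is costly
```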