The restricted Boltzmann machine is a generative learning model, but is it also unsupervised?
A generative model learns the joint probability $P(X,Y)$, then uses Bayes' theorem to compute the conditional probability $P(Y|X)$. However, the RBM is an unsupervised feature extractor. There is no $Y$!
How can the RBM be said to be an unsupervised algorithm, but also generative?
Best Answer
The definition of generative model as learning the joint probability $P(X,Y)$ is given in the context of supervised learning.
In a more general setting, learning the joint probability is "generative" because knowing the joint probability makes it possible to generate new data. In the supervised context, having $P(X,Y)$ lets you generate new $(x,y)$ pairs.
Now, what does generative mean in the unsupervised learning context? It means sampling. And sampling is something an RBM can do very conveniently, because the lack of intra-layer connections (no visible-visible or hidden-hidden links) makes Gibbs sampling particularly easy.
Leaving the details of Gibbs sampling aside, it is worth noting that in the case of an RBM we do in fact have a joint distribution, $P(v,h)$, where $v$ is the visible layer and $h$ is the hidden layer.
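For readers who do want a feel for the details, here is a minimal sketch of block Gibbs sampling in a binary RBM. It is an illustration under stated assumptions, not a reference implementation: the names `W`, `b`, `c` and `gibbs_sample` are hypothetical, the weights are random rather than trained, and the units are assumed to be binary with sigmoid conditionals, i.e. $P(h_j=1|v)=\sigma(c_j+\sum_i v_i W_{ij})$ and $P(v_i=1|h)=\sigma(b_i+\sum_j W_{ij} h_j)$.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gibbs_sample(W, b, c, v0, n_steps, rng):
    """Run block Gibbs sampling in a binary RBM and return the last (v, h) state.

    W: (n_visible, n_hidden) weights; b: visible biases; c: hidden biases.
    All of these names are assumptions made for this sketch.
    """
    v = v0.copy()
    h = None
    for _ in range(n_steps):
        # No hidden-hidden connections: given v, every hidden unit is
        # independent, so the whole layer is sampled in one vectorised step.
        p_h = sigmoid(c + v @ W)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # No visible-visible connections: the same holds for v given h.
        p_v = sigmoid(b + h @ W.T)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    return v, h


rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # untrained, illustrative only
b = np.zeros(n_visible)
c = np.zeros(n_hidden)
v0 = rng.integers(0, 2, size=n_visible).astype(float)

v, h = gibbs_sample(W, b, c, v0, n_steps=100, rng=rng)
print("generated visible vector:", v)
```

With trained weights, the visible vector left at the end of a long enough chain is an approximate sample from the model's distribution over $v$, which is the sense in which the RBM generates data without any labels.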
Clarification
"Sampling" is used in different contexts to refer to different ideas, and I am often somewhat sloppy in my use of the term myself.
In the context of this answer, by sampling I mean generating new samples, as opposed to sampling from an available set of elements (just picking a bunch of them, with or without replacement, for example).
In order to be able to generate new $(x,y)$ pairs, you need to model the joint distribution and then sample from it (as done here in a more trivial example where the distribution is just a Gaussian).
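To make the "model the joint, then sample from it" idea concrete, here is a small sketch for the Gaussian case. Everything specific in it is an assumption made for illustration: the "observed" data are synthetic, and `true_mean` and `true_cov` are made-up values used only to produce them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "observed" (x, y) pairs; these parameters are hypothetical.
true_mean = np.array([1.0, -2.0])
true_cov = np.array([[1.0, 0.8],
                     [0.8, 2.0]])
data = rng.multivariate_normal(true_mean, true_cov, size=5000)

# Step 1: model the joint distribution P(x, y), here by fitting a 2-D Gaussian.
mean_hat = data.mean(axis=0)
cov_hat = np.cov(data, rowvar=False)

# Step 2: sample from the fitted model to generate NEW pairs.
new_pairs = rng.multivariate_normal(mean_hat, cov_hat, size=10)

# For contrast: picking existing rows is sampling from the available set,
# not generating new data.
resampled = data[rng.choice(len(data), size=10, replace=True)]

print(new_pairs)
```

The rows of `new_pairs` are almost surely not present in `data`, while every row of `resampled` is. That is the distinction drawn above between the two meanings of sampling.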