Solved – Restricted Boltzmann Machines for regression

classification, machine learning, neural networks, regression

I'm following up on the question I'd asked earlier on RBMs. I see a lot of literature describing them, but none that actually discusses regression (or even classification with labelled data). I get the feeling that they are used for unlabelled data only. Are there any resources on handling regression? Or is it as simple as adding another layer on top of the hidden layer and running the CD algorithm up and down? Thanks much in advance.

Best Answer

You are right about unlabeled data. RBMs are generative models and are most commonly used as unsupervised learners.

When RBMs are used to construct a Deep Belief Network, the most typical procedure is simply to train each new RBM one at a time as they are stacked on top of each other. So contrastive divergence isn't going up and down in the sense that I think you mean. It only works with one RBM at a time, using the hidden layer of the previous topmost RBM as the input for the new topmost RBM. After all this, you can either treat the stack of RBM weights as the initial weights for a standard feedforward neural network and train it using your labeled data and backpropagation, or do something more exotic like use the wake-sleep algorithm. Notice that we haven't used any labeled data up until this last step; that is one of the benefits of these types of models. We can learn a good generative model using lots of unlabeled data, and even if our ultimate goal is good discriminative performance, it should help.
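The greedy layer-by-layer procedure can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not a reference implementation: the `RBM` class, the `train_stack` helper, the learning rate, the epoch count, and the toy data are all made up for the example. It uses one-step contrastive divergence (CD-1) on binary units and passes hidden probabilities, rather than sampled states, up to the next RBM.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class RBM:
    """Tiny binary RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # W[i][j] connects visible unit i to hidden unit j.
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden probabilities given the data.
        h0 = self.hidden_probs(v0)
        # Negative phase: reconstruct the visibles from sampled hidden states,
        # then recompute the hidden probabilities from the reconstruction.
        v1 = self.visible_probs(self.sample(h0))
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.c[j] += lr * (h0[j] - h1[j])

def train_stack(data, layer_sizes, epochs=20, seed=0):
    """Train one RBM at a time; each new RBM sees the previous hidden layer."""
    rng = random.Random(seed)
    stack, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v)
        stack.append(rbm)
        # The trained RBM is now frozen; its hidden activations become
        # the "data" for the next RBM in the stack.
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return stack, inputs

# Toy unlabeled data: two repeated binary patterns (no labels are used).
data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]] * 10
stack, features = train_stack(data, layer_sizes=[3, 2])
```

Here `features` holds the topmost hidden activations for each input. The weights in `stack` are what you would then copy into a feedforward network for supervised fine-tuning with backpropagation, which for regression would typically mean adding a linear output unit on top.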

On the other hand, there are several ways you can use RBMs for classification.

  • Train an RBM or a stack of several RBMs. Use the topmost hidden layer as input to some other supervised learner.
  • Train an RBM for each class and use the unnormalized energies as input to a discriminative classifier.
  • Train the RBM as a joint density model of P(X, Y). Then, given some input x, pick the class y that minimizes the energy function (normalization isn't a problem here, as it is in the previous approach, because the constant Z is the same for all classes).
  • Train a discriminative RBM.
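As a rough illustration of the joint-density option, the sketch below appends a one-hot label to the input, scores each candidate label, and picks the lowest score. It assumes the quantity being compared is the free energy, which sums out the binary hidden units; that is one common reading of "energy" here. The function names and the weights are hypothetical and hand-set for the example, not learned.

```python
import math

def free_energy(v, W, b, c):
    """F(v) = -sum_i b_i v_i - sum_j log(1 + exp(c_j + sum_i v_i W_ij))."""
    visible_term = sum(bi * vi for bi, vi in zip(b, v))
    hidden_term = 0.0
    for j in range(len(c)):
        act = c[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        hidden_term += math.log1p(math.exp(act))
    return -visible_term - hidden_term

def predict_class(x, n_classes, W, b, c):
    """Try every label as a one-hot suffix and return the lowest-energy one."""
    energies = []
    for y in range(n_classes):
        one_hot = [1.0 if k == y else 0.0 for k in range(n_classes)]
        energies.append(free_energy(list(x) + one_hot, W, b, c))
    # The partition function Z is shared by all labels, so comparing
    # unnormalized free energies is enough.
    return min(range(n_classes), key=lambda y: energies[y])

# Hand-set (not trained) weights: visible units are [x0, x1, y0, y1].
# Hidden unit 0 ties x0 to label 0; hidden unit 1 ties x1 to label 1.
W = [[4.0, 0.0],
     [0.0, 4.0],
     [4.0, 0.0],
     [0.0, 4.0]]
b = [0.0, 0.0, 0.0, 0.0]
c = [-6.0, -6.0]

print(predict_class([1.0, 0.0], 2, W, b, c))  # 0
print(predict_class([0.0, 1.0], 2, W, b, c))  # 1
```

Enumerating labels like this only works for a small discrete set of classes. For a continuous regression target, the first option in the list (feeding the topmost hidden layer to a separate supervised learner) is the more direct route.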

I would highly suggest you read through the technical report A Practical Guide to Training Restricted Boltzmann Machines by Geoff Hinton. It discusses several of these issues in much greater detail, provides invaluable tips, cites lots of relevant papers, and may help to clear up any other confusion you might have.
