Solved – How to do a one vs all classification (binary classifier) with a neural network

classification, deep learning, image processing, machine learning, neural networks

I have a set of images that belong to a particular class, and another set of images that contains no images of that class.

So, effectively I have two sets of images: one that belongs to the class and one that does not.

I need to design a neural network that can model the positive class. What I am doing is the following using Theano in python:

  1. A multiple layers neural network with convolutional layers at the beginning.
  2. Use a mean squared error as loss function.
  3. Use a single sigmoid neuron at the output.
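The output-layer setup described in steps 2 and 3 can be sketched in plain NumPy (not the asker's actual Theano code; the feature matrix, weights, and labels below are toy values for illustration):

```python
import numpy as np

def sigmoid(z):
    """Logistic function: squashes scores into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def mse_loss(y_pred, y_true):
    """Mean squared error between predictions and 0/1 labels."""
    return np.mean((y_pred - y_true) ** 2)

# Hypothetical activations from the last hidden layer for 4 images,
# plus a weight vector and bias for the single output neuron.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))   # 4 examples, 8 hidden units
w = rng.normal(0, 0.1, size=8)       # small random weights
b = 0.0

y_pred = sigmoid(features @ w + b)   # one probability per image
y_true = np.array([1, 0, 1, 0])      # 1 = positive class, 0 = negative
print(mse_loss(y_pred, y_true))
```

Each image gets a single number in (0, 1), which is what the question's single-output design produces.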

My question is whether using a single sigmoid neuron is the right approach, or whether I should instead use two separate sigmoid neurons at the output for this scenario.

Any recommendations are appreciated.

Thanks

Best Answer

Easy question: yes, you are correct. A single sigmoid output neuron is exactly what you want for this. You could also use any other bounded activation, e.g. tanh (whose (-1, 1) output can be rescaled to (0, 1)).
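With a single bounded output, turning the network's score into a class decision is just a threshold on the probability. A minimal sketch (the scores are hypothetical pre-activation values, and 0.5 is the conventional cutoff):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The single output neuron gives one probability per image;
# thresholding at 0.5 turns it into a binary class decision.
scores = np.array([-2.0, 0.3, 4.1])   # hypothetical pre-activation scores
probs = sigmoid(scores)
labels = (probs >= 0.5).astype(int)
print(labels)                          # → [0 1 1]
```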

Like your choice of sigmoid vs. tanh, your loss (i.e. error, i.e. cost) function doesn't really matter. I'd use cross-entropy, but mean squared error would work fine -- any differences are going to vanish over a few epochs of training.
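For comparison, the two losses mentioned above can be computed side by side (the predicted probabilities here are made up; both losses are minimised by the same predictions, but cross-entropy penalises confident mistakes more heavily):

```python
import numpy as np

def mse(p, y):
    """Mean squared error between probabilities p and 0/1 labels y."""
    return np.mean((p - y) ** 2)

def binary_cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy; clip p to avoid log(0) on saturated outputs."""
    p = np.clip(p, eps, 1 - eps)
    return np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p)))

p = np.array([0.9, 0.2, 0.6])   # hypothetical predicted probabilities
y = np.array([1.0, 0.0, 1.0])   # true labels
print(mse(p, y))                 # → 0.07
print(binary_cross_entropy(p, y))
```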

I like to initialise with small normally distributed values, generally mean zero and variance 0.01 (it's often worth testing the effect of dropping or raising that variance by an order of magnitude). But once again, small uniformly distributed values also work fine.
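Both initialisation schemes above can be written in a few lines of NumPy (note that variance 0.01 corresponds to standard deviation 0.1; the layer shape is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

# Normal init: mean 0, variance 0.01 -> standard deviation sqrt(0.01) = 0.1.
W_normal = rng.normal(loc=0.0, scale=0.1, size=(128, 64))

# Uniform init with the same variance:
# Var(U(-a, a)) = a**2 / 3, so a = 0.1 * sqrt(3) gives variance 0.01.
a = 0.1 * np.sqrt(3)
W_uniform = rng.uniform(-a, a, size=(128, 64))

print(W_normal.var(), W_uniform.var())  # both close to 0.01
```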

A further recommendation is that you balance the number of positive and negative examples. Otherwise your network will learn the prior (e.g. that an image is twice as likely to belong to the class as not).
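One simple way to get that balance is to undersample the majority class before training. A sketch, assuming labels are a 0/1 NumPy array (the helper name and toy data are made up for illustration):

```python
import numpy as np

def undersample_balanced(X, y, seed=0):
    """Drop excess majority-class examples so both classes are equal size."""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    n = min(len(pos), len(neg))
    keep = np.concatenate([rng.choice(pos, n, replace=False),
                           rng.choice(neg, n, replace=False)])
    rng.shuffle(keep)
    return X[keep], y[keep]

# Imbalanced toy set: 6 negatives, 3 positives.
X = np.arange(18).reshape(9, 2)
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1])
Xb, yb = undersample_balanced(X, y)
print(int(yb.sum()), len(yb))  # → 3 6 (3 positives out of 6 kept examples)
```

Oversampling the minority class (or weighting the loss per class) achieves the same effect without discarding data.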

You should take a good look through Yann LeCun's "Efficient BackProp". It's written by the creator of the convolutional neural network.