Solved – LogLoss in neural networks – Binary

conv-neural-network, cross-entropy, entropy, loss-functions, neural-networks

I recently presented my M.D. work, and one question got me into trouble.

Context:

In my neural-network classifier, I use a single sigmoid output neuron.

My problem requires classifying each input into one of two groups: 0 or 1.

For the loss function, I use lasagne.objectives.binary_crossentropy:

Computes the binary cross-entropy between predictions and targets.

$L = -t \log(p) - (1 - t) \log(1 - p)$
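For concreteness, here is a minimal Lasagne sketch of this kind of setup: a single sigmoid output unit trained with `binary_crossentropy`. The feature count, hidden-layer size, and choice of optimizer below are placeholder assumptions, not details from the actual model.

```python
import theano
import theano.tensor as T
import lasagne

# Placeholder: number of input features (assumed for illustration).
n_features = 20

input_var = T.matrix('inputs')
target_var = T.matrix('targets')  # shape (batch_size, 1), values in {0, 1}

# Small network ending in a single sigmoid output neuron.
network = lasagne.layers.InputLayer(shape=(None, n_features), input_var=input_var)
network = lasagne.layers.DenseLayer(network, num_units=32,
                                    nonlinearity=lasagne.nonlinearities.rectify)
network = lasagne.layers.DenseLayer(network, num_units=1,
                                    nonlinearity=lasagne.nonlinearities.sigmoid)

# binary_crossentropy returns an element-wise loss; average it over the batch.
prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.binary_crossentropy(prediction, target_var).mean()

params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.adam(loss, params)  # optimizer choice is an assumption

train_fn = theano.function([input_var, target_var], loss, updates=updates)
```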

Question asked by a teacher during my presentation:

"You cant be using binary cross-entropy because entropy is just used when your data either has noises or some probability, why and how do you use that loss function? You cant use it because for each input the model already classify it to either 1 group, so 'entropy' makes no sense".

I'm confused, especially by the last part, because my model has a sigmoid output that ranges from 0 to 1. I know the output of a sigmoid neuron isn't exactly a probability, but his question still doesn't make sense to me.

Would someone help me come up with an answer for that?

Sources:

http://lasagne.readthedocs.io/en/latest/modules/objectives.html#loss-functions

Best Answer

Your teacher needs to open a basic book on machine learning. Minimizing the cross-entropy loss is equivalent to minimizing the KL divergence between your empirical (target) distribution and your predicted distribution, because the two differ only by the entropy of the empirical distribution, which is a constant (and exactly zero for hard 0/1 labels). While you can interpret the predicted value as a probability, it is often more straightforward to think of it as a "confidence" in your prediction. If your teacher wants you to map the continuous output to hard 0s and 1s, you would choose a decision threshold, for example via ROC analysis, since there is no universal way of mapping continuous values to discrete ones.
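To see the equivalence concretely, here is a small NumPy sketch (the numbers and helper functions are purely illustrative, not from the original post): the per-example cross-entropy decomposes into the entropy of the target distribution plus the KL divergence, so with hard 0/1 labels the two coincide.

```python
import numpy as np
from scipy.special import xlogy  # xlogy(0, y) = 0, handling the 0*log(0) convention

def binary_cross_entropy(t, p):
    """Per-example binary cross-entropy: -t*log(p) - (1-t)*log(1-p)."""
    return -(xlogy(t, p) + xlogy(1 - t, 1 - p))

def binary_entropy(t):
    """Entropy of the target distribution (t, 1-t); zero for hard 0/1 labels."""
    return -(xlogy(t, t) + xlogy(1 - t, 1 - t))

def binary_kl(t, p):
    """KL divergence between the target distribution (t, 1-t) and (p, 1-p)."""
    return binary_cross_entropy(t, p) - binary_entropy(t)

t = np.array([1.0, 0.0, 1.0])   # hard 0/1 targets
p = np.array([0.9, 0.2, 0.4])   # sigmoid outputs, read as confidences

# With hard labels the entropy term is zero, so per-example cross-entropy
# and KL divergence are identical: minimizing one minimizes the other.
print(binary_cross_entropy(t, p))   # approx. [0.105 0.223 0.916]
print(binary_kl(t, p))              # same values

# Mapping continuous outputs to hard classes requires choosing a threshold;
# 0.5 is only a default -- ROC analysis is one way to pick a better one.
print((p >= 0.5).astype(int))       # [1 0 0]
```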