Solved – Setting bias of output layer for imbalanced datasets

machine-learning, neural-networks

From a blog post from Andrej Karpathy on training neural networks:

> Initialize the final layer weights correctly. E.g. if you are regressing some values that have a mean of 50 then initialize the final bias to 50. **If you have an imbalanced dataset of a ratio 1:10 of positives:negatives, set the bias on your logits such that your network predicts probability of 0.1 at initialization.**

I don't understand the bolded sentence at all. What does "predicts probability of 0.1" mean? Is it that all the outputs should be 0.1? How would one go about doing that?

Best Answer

He is referring to a problem where you have a single output, a sigmoid neuron. When you initialize the weights of the network, you can set the bias of that output neuron to approximately $-2.3$.

Why? The last layer's bias depends only on the statistics of the data, since it has no connection to the input. If you have an imbalanced dataset with 10 times as many negative examples as positive, and the network consisted only of this bias, then it should output on average the number $1/11$ (the fraction of positives). An output of $1/11$ corresponds to a pre-sigmoid input of $\log\left(\frac{1/11}{1 - 1/11}\right) = \log(1/10) = -\log(10)$, which is approximately $-2.3$.
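This is easy to check numerically. A minimal sketch (the helper name `initial_bias` is mine, not from the post or from any library):

```python
import math

def initial_bias(num_pos, num_neg):
    # Solve sigmoid(b) = p for b, where p = num_pos / (num_pos + num_neg).
    # Inverting the sigmoid gives b = log(p / (1 - p)) = log(num_pos / num_neg).
    return math.log(num_pos / num_neg)

b = initial_bias(1, 10)        # 1:10 positives:negatives
p = 1 / (1 + math.exp(-b))     # push the bias back through the sigmoid
print(b)                       # ≈ -2.3026, i.e. -log(10)
print(p)                       # ≈ 0.0909, i.e. 1/11
```

With the bias set this way, the untrained network already predicts the base rate of the positive class, so the initial loss starts near its expected value instead of spending the first training steps just learning the class ratio.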

And as a comment, the unbolded sentence is bad advice. Rather than initializing the final bias to the target mean, it's much better to normalize the targets in the data. Don't take someone's word for it just because they are a big name in the field.
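For the regression case, target normalization is a one-liner. A sketch with hypothetical target values (standardization to zero mean and unit variance; at inference you undo the transform on the predictions):

```python
import numpy as np

y = np.array([48.0, 50.0, 52.0, 49.0, 51.0])  # hypothetical regression targets, mean ≈ 50
mu, sigma = y.mean(), y.std()

# Train the network on the normalized targets instead of tweaking the output bias.
y_norm = (y - mu) / sigma                      # zero mean, unit variance

# At inference, map a normalized prediction back to the original scale:
# y_pred = mu + sigma * network_output
```

This way the network's output distribution is well-scaled at initialization regardless of the bias, and the gradients of the loss are not dominated by a large constant offset in the targets.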