Solved – Binary classification vs. continuous output with neural networks

classification, continuous data, neural networks

Wikipedia's article on binary classification says:

Tests whose results are of continuous values, such as most blood
values, can artificially be made binary by defining a cutoff value,
with test results being designated as positive or negative depending
on whether the resultant value is higher or lower than the cutoff.

Is there some guidance as to whether this is a desirable thing to do or not? I have data where the output value is continuous in the training set and I'm interested to know how strong the output variable is. Ideally an accurate continuous value would be the best, but I also would be satisfied with binary classification. My layman's assumption is that the binary classification task would be a little simpler. Is there any guidance as to whether to prefer continuous output vs binary classification?

Best Answer

Dichotomizing a continuous outcome is a bad idea. It throws away information, which increases both type I and type II error. It also invokes "magical thinking": the assumption that something magical happens at the cutoff value. For example, with newborns, it is common to say babies under 2.5 kg are "low birth weight" and those above 2.5 kg are not. This treats a baby of 2.49 kg as being the same as one of 1.4 kg, but vastly different from a baby of 2.51 kg. Similarly, the 2.51 kg baby is treated just like a baby of 4.5 kg.
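A minimal sketch of that information loss, using simulated data rather than anything from the question. The predictor `x`, the noise level, and the median cutoff are all assumptions chosen for illustration. It compares how strongly `x` correlates with a continuous outcome and with the same outcome cut into 0/1.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: the outcome y is the predictor x plus Gaussian noise.
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
y = [xi + rng.gauss(0.0, 1.0) for xi in x]

# Dichotomize y at its median (an arbitrary cutoff, like the 2.5 kg line).
cutoff = sorted(y)[len(y) // 2]
y_binary = [1 if yi > cutoff else 0 for yi in y]

r_continuous = pearson(x, y)
r_binary = pearson(x, y_binary)

print(f"correlation with continuous outcome: {r_continuous:.3f}")
print(f"correlation with binary outcome:     {r_binary:.3f}")
```

Under these assumptions the binary version shows a weaker correlation with the predictor, because every outcome on the same side of the cutoff is treated as identical.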

It is true that people sometimes need to make "yes/no" decisions based on the output of a statistical model. But the statistical model and its results should be a guide and a tool, not a straitjacket.

Related Solutions

Neural Networks – Binary vs Discrete/Continuous Input in Neural Networks

Whether to convert input variables to binary depends on the input variable. You could think of neural network inputs as representing a kind of "intensity": larger values of the input variable represent greater intensity of that input. After all, assuming the network has only one input, a given hidden node of the network is going to learn some function $f(wx + b)$, where $f$ is the transfer function (e.g. the sigmoid), $x$ is the input variable, and $w$ and $b$ are the node's weight and bias.
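As a rough illustration of that function, here is a single hidden node with a sigmoid transfer function. The weight and bias values are made up for the example; in a real network they are learned.

```python
import math

def sigmoid(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_node(x, w, b):
    """Output f(w*x + b) of one hidden node with a single input x."""
    return sigmoid(w * x + b)

# Hypothetical weight and bias, chosen only for illustration.
w, b = 2.0, -1.0
xs = (0.0, 0.5, 1.0, 2.0)

# With a positive weight, a larger input gives a larger activation.
activations = [hidden_node(x, w, b) for x in xs]
for x, a in zip(xs, activations):
    print(f"x = {x:.1f} -> f(wx + b) = {a:.4f}")
```

The node's output changes monotonically with `x`, which is why the input is best read as an intensity.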

This setup does not make sense for categorical variables. If categories are represented by numbers, it makes no sense to apply the function $f(wx + b)$ to them. E.g. imagine your input variable represents an animal, and sheep=1 and cow=2. It makes no sense to multiply sheep by $w$ and add $b$ to it, nor does it make sense for cow to be always greater in magnitude than sheep. In this case, you should convert the discrete encoding to a binary, 1-of-$k$ encoding.

For real-valued variables, just leave them real-valued (but normalize inputs). E.g. say you have two input variables, one the animal and one the animal's temperature. You'd convert animal to 1-of-$k$, where $k$=number of animals, and you'd leave temperature as-is.
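A small sketch of that preprocessing. The animal list, the temperature readings, and the helper names are hypothetical, and standardizing to zero mean and unit variance is just one common way to normalize.

```python
def one_hot(value, categories):
    """Return a 1-of-k list with a 1.0 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]

# Hypothetical records: (animal, body temperature in degrees Celsius).
animals = ["sheep", "cow", "pig"]  # k = 3 categories
records = [("sheep", 39.1), ("cow", 38.6), ("pig", 39.2), ("cow", 38.4)]

temperatures = standardize([t for _, t in records])

# Each network input row: k one-hot entries followed by the normalized temperature.
inputs = [one_hot(animal, animals) + [temp]
          for (animal, _), temp in zip(records, temperatures)]

for row in inputs:
    print(row)
```

Each row has $k + 1$ entries, so the network sees no artificial ordering among the animals, while temperature keeps its real-valued scale.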

Classification – Rescaling Neural Network Sigmoid Output for Binary Classification Probability

model.predict will output a matrix in which each row holds the predicted probability that the corresponding input belongs to class 1.

If you print it, it should look like this:

[[ 0.7310586 ]
 [ 0.26896983]]

You just need to loop through those values.

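# predictions is the matrix returned by model.predict (one row per input)
threshold = 0.25
for i, predicted in enumerate(predictions):
    if predicted[0] > threshold:
        print("bigger than 0.25")
        # assign input i to class 1
    else:
        print("smaller than or equal to 0.25")
        # assign input i to class 0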

EDIT: It might be worth experimenting with the class weights. If you weight class 1 three times more heavily than class 0, you might get something close to what you want in a more elegant way.
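One way to see how a weight of 3 relates to a 0.25 cutoff, sketched under the assumption that the model is trained with a class-weighted cross-entropy loss and outputs well-calibrated probabilities (the function name below is hypothetical). If class 1 is weighted $w$ times more than class 0, the output that minimizes the expected loss for a true probability $p$ is $q = wp / (wp + 1 - p)$, and $q > 0.5$ exactly when $p > 1/(1 + w)$.

```python
def reweighted_output(p, w):
    """Output q minimizing -w*p*log(q) - (1-p)*log(1-q) for true probability p."""
    return w * p / (w * p + (1.0 - p))

w = 3.0                      # class 1 weighted three times more than class 0
threshold = 1.0 / (1.0 + w)  # equivalent cutoff on the unweighted probability

for p in (0.10, 0.25, 0.40, 0.7310586):
    q = reweighted_output(p, w)
    # Thresholding q at 0.5 agrees with thresholding p at 1 / (1 + w).
    print(f"p = {p:.4f} -> q = {q:.4f}, class {int(q > 0.5)}")
```

A trained network only approximates this ideal, so the match is approximate. If the library is Keras (an assumption; the answer does not name it), such weights are passed to `model.fit` through its `class_weight` argument, e.g. `class_weight={0: 1.0, 1: 3.0}`.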

