Regression – How to Use TensorFlow Cross Entropy for Regression?

Tags: cross-entropy, entropy, regression, tensorflow

Does cross-entropy cost make sense in the context of regression (as opposed to classification)? If so, could you give a toy example using TensorFlow? If not, why not?

I was reading about cross entropy in Neural Networks and Deep Learning by Michael Nielsen, and it seems like something that could naturally be used for regression as well as classification. However, I don't understand how you'd apply it efficiently in TensorFlow, since the loss functions take logits (which I don't really understand either) and are listed under Classification in the TensorFlow documentation.

Best Answer

No, it doesn't make sense to use TensorFlow functions like tf.nn.sigmoid_cross_entropy_with_logits for a regression task. In TensorFlow, “cross-entropy” is shorthand (or jargon) for “categorical cross entropy.” Categorical cross entropy is an operation on probabilities. A regression problem attempts to predict continuous outcomes, rather than classifications.
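As a minimal sketch of the contrast (with made-up numbers, assuming TensorFlow 2.x eager execution): the cross-entropy function expects 0/1 class labels together with raw logits, whereas a regression target calls for a distance-based loss such as mean squared error.

```python
import tensorflow as tf

# Logits are the raw, pre-sigmoid outputs of a network.
logits = tf.constant([[2.0], [-1.0], [0.5]])

# Classification: targets are 0/1 class labels, so cross-entropy applies.
# sigmoid_cross_entropy_with_logits applies the sigmoid internally and
# compares the resulting probabilities against the labels.
class_labels = tf.constant([[1.0], [0.0], [1.0]])
ce_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=class_labels,
                                                  logits=logits)

# Regression: targets are unbounded real numbers, so compare distances
# instead, e.g. with mean squared error.
regression_targets = tf.constant([[2.3], [-0.7], [0.4]])
predictions = logits  # here the raw outputs are used directly as predictions
mse_loss = tf.reduce_mean(tf.square(predictions - regression_targets))
```

The point is that the cross-entropy loss is only meaningful when the targets can be read as probabilities of class membership; applying it to unbounded real-valued targets would be ill-defined.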

The jargon "cross-entropy" is a little misleading, because there are any number of cross-entropy loss functions; however, it's a convention in machine learning to refer to this particular loss as "cross-entropy" loss.

If we look beyond the TensorFlow functions that you link to, then of course there are any number of possible cross-entropy functions. This is because the general concept of cross-entropy is about the comparison of two probability distributions. Depending on which two probability distributions you wish to compare, you may arrive at a different loss than the typical categorical cross-entropy loss. For example, the cross-entropy of a Gaussian target with some varying mean but fixed diagonal covariance reduces to mean-squared error. The general concept of cross-entropy is outlined in more detail in related questions on this site.
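To spell out the Gaussian example above (a standard derivation, not anything TensorFlow-specific): if the model predicts the mean $\mu$ of a Gaussian with fixed variance $\sigma^2$, the per-example cross-entropy (negative log-likelihood) of a target $y$ is

$$-\log \mathcal{N}(y \mid \mu, \sigma^2) = \frac{(y - \mu)^2}{2\sigma^2} + \frac{1}{2}\log\left(2\pi\sigma^2\right).$$

The second term does not depend on $\mu$, so minimizing this cross-entropy over $\mu$ is equivalent to minimizing the squared error $(y - \mu)^2$, i.e. mean-squared error when averaged over a dataset.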