Cross-Entropy – Understanding the Difference in Log Base for Cross Entropy Calculation in TensorFlow

Tags: cross-entropy, tensorflow

I have started going through the TensorFlow tutorials here and I have a small question about the cross-entropy calculations. In most places I have seen cross-entropy calculated using a base-2 log, but tf.log uses base e. Under what conditions would using one log base be preferred over the other?

Best Answer

Log base e and log base 2 are only a constant factor off from each other, by the change-of-base formula:

$$\frac{\log_e{n}}{\log_2{n}} = \frac{\log_e{n}}{\log_e{n} / \log_e{2}} = \log_e 2$$

Therefore, using one rather than the other scales the cross-entropy by a constant factor. When using log base 2, the unit of entropy is bits, whereas with the natural log, the unit is nats.
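
A quick numerical sketch of this (using NumPy rather than tf.log for brevity, with made-up distributions p and q) shows that the two versions differ only by that constant factor of $\log_e 2 \approx 0.693$:

```python
import numpy as np

# Hypothetical true distribution p and predicted distribution q
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.6, 0.3, 0.1])

# Cross-entropy in nats (natural log) and in bits (log base 2)
ce_nats = -np.sum(p * np.log(q))
ce_bits = -np.sum(p * np.log2(q))

# The ratio is always ln(2) ~ 0.693, regardless of p and q
print(ce_nats, ce_bits, ce_nats / ce_bits)
```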

One isn't better than the other. It's kind of like the difference between using km/hour and m/s.

It is possible that log base 2 is faster to compute than the natural logarithm. However, in practice, computing the cross-entropy is pretty much never the most costly part of the algorithm, so it's not something to be overly concerned with.