Solved – Cross-entropy for comparing images

cross entropy, entropy, image processing, machine learning

Suppose we have two greyscale images which are flattened to 1d arrays: $y=(y_1, y_2, \ldots, y_n)$ and $\hat{y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n)$ with pixel values in $[0,1]$. How exactly do we use cross-entropy to compare these images?

The definition of cross-entropy leads me to believe that we should compute $$-\sum_{i} y_i \log \hat{y}_i,$$ but in the machine learning context I usually see loss functions using "binary" cross-entropy, which I believe is $$ -\sum_i y_i \log \hat{y}_i - \sum_i (1-y_i) \log (1-\hat{y}_i).$$

Can someone please clarify this for me?

Best Answer

The cross-entropy between a single label and prediction would be

$$L = -\sum_{c \in C} y_{c} \log \hat y_{c}$$

where $C$ is the set of all classes. This has the same form as the first expression in your post, except that the sum runs over classes rather than pixels. To apply it to an image, we also need to sum over all pixels:

$$L = -\sum_{i \in I} \sum_{c \in C} y_{i,c} \log \hat y_{i,c}$$

where $I$ is the set of pixels in an image and $y_{i,c}$ is an indicator variable for whether the $i$th pixel is in class $c$.
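
As a quick illustration, here is a minimal NumPy sketch of this double sum over pixels and classes; the array shapes, the function name, and the small epsilon clamp are my own assumptions, not part of the answer:

```python
import numpy as np

def pixelwise_cross_entropy(y, y_hat, eps=1e-12):
    """Cross-entropy summed over pixels and classes.

    y, y_hat: arrays of shape (n_pixels, n_classes), where y is one-hot
    (y[i, c] = 1 iff pixel i is in class c) and y_hat holds the predicted
    class probabilities for each pixel.
    """
    y_hat = np.clip(y_hat, eps, 1.0)   # avoid log(0)
    return -np.sum(y * np.log(y_hat))  # sum over i in I and c in C
```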

In the binary case, we only have two classes: $0$ and $1$.

$$L = -\sum_{i \in I} \left( y_{i,0} \log \hat y_{i,0} + y_{i,1} \log \hat y_{i,1} \right)$$

Since $y_{i,0} + y_{i,1} = 1$ (and likewise $\hat{y}_{i,0} + \hat{y}_{i,1} = 1$), we can drop the class indices and write $y_i = y_{i,0}$ and $1-y_i = y_{i,1}$, and similarly for the predictions.

$$L = -\sum_{i \in I} \left( y_i \log \hat{y}_i + (1-y_i) \log (1-\hat{y}_i) \right)$$

This is where the second equation in your post comes from.
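
For concreteness, here is a minimal NumPy sketch of this per-pixel binary cross-entropy applied to two flattened greyscale images, as in the original question; the function name, the toy arrays, and the epsilon clamp are illustrative assumptions:

```python
import numpy as np

def binary_cross_entropy(y, y_hat, eps=1e-12):
    """y, y_hat: 1-D arrays of pixel values in [0, 1]."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)  # keep both log terms finite
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Toy example: a target image y and a prediction y_hat, each flattened to 1-D.
y     = np.array([0.0, 1.0, 0.25, 0.9])
y_hat = np.array([0.1, 0.8, 0.30, 0.7])
print(binary_cross_entropy(y, y_hat))
```

Note that with non-binary pixel values in $[0,1]$, as in the question, the sum is still well defined; the pixel values are simply treated as soft targets rather than hard $0/1$ indicators.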