There is something I don't understand in the PyTorch implementation of Cross Entropy Loss.

As far as I understand, the theoretical cross entropy loss takes log-softmax probabilities and outputs a real number that gets closer to zero the closer the output is to the target (see https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html#cross-entropy for reference).

Yet the following puzzles me:

```
>>> output=torch.tensor([[0.0,1.0,0.0]]) #Activation is only on the correct class
>>> target=torch.tensor([1])
>>> loss=torch.nn.CrossEntropyLoss()
>>> loss(output,target)
tensor(0.5514)
```

From my understanding, `loss(output,target)` should yield `0.0`, since this is the textbook example of a 100% confident neural network.

The formula given in https://pytorch.org/docs/stable/nn.html#crossentropyloss does not convince me that it is strictly equivalent to the theoretical definition of cross entropy loss.

Is it a problem that my loss function is not equal to 0 when my model's outputs show 100% confidence?

## Best Answer

The documentation says that this loss function is computed using the log loss of the softmax of $x$ (`output` in your code). For your example, we have $$ \begin{align} -\log\left(\frac{\exp(x_j)}{\sum_i \exp (x_i)}\right)&= -x_j+\log\left(\sum_i \exp(x_i)\right) \\ &= -1 + \log\left( \exp(0) + \exp(1) + \exp(0) \right) \\ &\approx 0.5514. \end{align} $$

To achieve the desired result, you could either have your network output scores as described in the documentation, or else use a loss function that works directly with probabilities.
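As a sketch (assuming a standard PyTorch install), you can check the derivation above numerically and try both workarounds:

```python
import torch
import torch.nn.functional as F

target = torch.tensor([1])

# Reproduce the derivation by hand: -x_j + log(sum_i exp(x_i)) for x = [0, 1, 0]
output = torch.tensor([[0.0, 1.0, 0.0]])
manual = -output[0, 1] + torch.logsumexp(output[0], dim=0)
builtin = F.cross_entropy(output, target)
# manual and builtin both come out to about 0.5514

# Option 1: treat the network output as unnormalized scores (logits).
# The loss approaches 0 as the margin for the correct class grows.
logits = torch.tensor([[0.0, 100.0, 0.0]])
low_loss = F.cross_entropy(logits, target)  # effectively 0

# Option 2: use NLLLoss, which expects log-probabilities, so a
# probability of 1 on the correct class gives exactly -log(1) = 0.
probs = torch.tensor([[0.0, 1.0, 0.0]])
zero_loss = F.nll_loss(torch.log(probs), target)
```

Note that with logits, the loss can get arbitrarily close to but never exactly reach 0, since the softmax never assigns a probability of exactly 1.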