Multiclass Classification – Softmax in Multiclass Classification

deep learning, machine learning, neural networks, probability, softmax

How do we associate a class with every output unit in a multilayer neural network architecture? I mean, we assign the input to the class with the maximum predicted probability, but how do we decide which neuron corresponds to which class?

Best Answer

You can use any one-to-one mapping between output neurons and classes you like. This mapping is defined in the loss function.

For example, say the training set is $\{(x_1, y_1), ..., (x_n, y_n)\}$, where $x_i$ is the input, and $y_i$ is the corresponding class label (represented as an integer from 1 to $k$). The network has $k$ softmax outputs (one to represent the predicted probability of each class). Let $o_j(x)$ denote the value of the $j$th output unit, given input $x$ and network parameters $\theta$. The cross-entropy loss is:

$$L(\theta) = -\frac{1}{n} \sum_{i=1}^n \log o_{y_i}(x_i)$$

Here, the $j$th output unit represents the predicted probability of the $j$th class. For each data point, we look at the true label $y_i$ (an integer between 1 and $k$), take the value of the corresponding output unit $o_{y_i}(x_i)$, and add its log to the sum. The mapping between classes and output units is therefore fixed entirely by which unit's output the loss reads for each label.
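As a minimal numeric sketch of this loss (the logits and labels below are made up for illustration), note the only place the class-to-unit mapping appears is the index used to pick an output per row:

```python
import numpy as np

# Hypothetical example: n = 3 data points, k = 4 classes.
# Each row of `logits` is the network's raw output for one input x_i.
logits = np.array([[2.0, 0.5, 0.1, -1.0],
                   [0.3, 1.8, 0.2,  0.0],
                   [-0.5, 0.1, 0.4, 2.2]])

# Softmax turns each row into predicted class probabilities o_j(x).
# Subtracting the row max first is the usual numerical-stability trick.
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
o = exp / exp.sum(axis=1, keepdims=True)

# True labels y_i as integers from 1 to k, as in the answer above.
y = np.array([1, 2, 4])

# Cross-entropy loss: the average of -log o_{y_i}(x_i).
# `y - 1` converts the 1-based labels to 0-based array indices.
L = -np.mean(np.log(o[np.arange(len(y)), y - 1]))
```

Each row of `o` sums to 1, and the loss reads exactly one probability per row, namely the one belonging to the true class under the chosen mapping.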

Let's say for some reason you wanted to map output units to classes in reverse order; that is, class $j$ corresponds to output unit $k - j + 1$. You'd define the cross-entropy loss as:

$$L(\theta) = -\frac{1}{n} \sum_{i=1}^n \log o_{(k - y_i + 1)}(x_i)$$
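A short sketch of the reversed mapping, using made-up softmax outputs (in 0-based indexing, output unit $k - y_i + 1$ becomes index $k - y_i$):

```python
import numpy as np

k = 4  # number of classes (hypothetical)
# Hypothetical softmax outputs for n = 2 inputs; each row sums to 1.
o = np.array([[0.70, 0.16, 0.10, 0.04],
              [0.14, 0.63, 0.13, 0.10]])
y = np.array([1, 2])  # 1-based class labels

idx = np.arange(len(y))

# Standard mapping: class j -> output unit j (0-based index y - 1).
L_std = -np.mean(np.log(o[idx, y - 1]))

# Reversed mapping: class j -> output unit k - j + 1
# (0-based index (k - y + 1) - 1 = k - y).
L_rev = -np.mean(np.log(o[idx, k - y]))
```

Either loss is a valid cross-entropy; training against `L_rev` simply teaches the network to put the probability mass for class $j$ on unit $k - j + 1$. What matters is that the same mapping is used consistently at training and prediction time.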