The Correct Loss Function for Logistic Regression

logistic, loss-functions, machine-learning

I've been using logistic regression for a specific problem, and the loss function the paper used is the following:
$$ L(Y,\hat{Y})=\sum_{i=1}^{N} \log(1+\exp(-y_i\hat{y}_{i}))$$
Yesterday, I came across Andrew Ng's course (the Stanford notes), where he gives another loss function that he describes as more intuitive. The function is:
$$J(\theta)=-\frac{1}{N}\sum_{i=1}^{N}\left[y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right]$$
Now I know there isn't only ONE loss function per model and that both could be used.

My question is more about what separates these two functions. Is there any advantage to working with one instead of the other? Are they equivalent in any way?
Thanks!

Best Answer

With the sigmoid function in logistic regression, these two loss functions are essentially the same; the only difference is the label encoding:

  • $y_i\in\{-1,1\}$ is used in the first loss function;
  • $y_i\in\{0,1\}$ is used in the second loss function.
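
To make the equivalence explicit, here is a short sketch, writing $\hat{y}_i=\theta^\top x_i$ for the linear score, $\sigma(z)=\frac{1}{1+e^{-z}}$ for the sigmoid, and $h_\theta(x_i)=\sigma(\hat{y}_i)$. The negative log-likelihood of a single observation under the $\{0,1\}$ encoding is

$$-\log P(y_i\mid x_i)=
\begin{cases}
-\log\sigma(\hat{y}_i)=\log\left(1+e^{-\hat{y}_i}\right), & y_i=1,\\
-\log\bigl(1-\sigma(\hat{y}_i)\bigr)=\log\left(1+e^{\hat{y}_i}\right), & y_i=0.
\end{cases}$$

Since $1-\sigma(z)=\sigma(-z)$, re-encoding the labels as $y_i\in\{-1,1\}$ collapses both cases into the single term $\log(1+e^{-y_i\hat{y}_i})$, which is exactly what the first loss sums over.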

Both loss functions can be derived by maximizing the likelihood function; the $\frac{1}{N}$ factor in the second one only rescales the objective and does not change the minimizer.
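
As a quick numerical sanity check, here is a minimal sketch (assuming NumPy; the random scores stand in for the linear predictions $\hat{y}_i=\theta^\top x_i$, and the labels are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.normal(size=10)       # hypothetical linear scores theta^T x_i
y01 = rng.integers(0, 2, size=10)  # labels encoded in {0, 1}
ypm = 2 * y01 - 1                  # the same labels encoded in {-1, 1}

# First loss: sum of log(1 + exp(-y * score)) with y in {-1, 1}
loss_pm = np.sum(np.log1p(np.exp(-ypm * scores)))

# Second loss: mean cross-entropy with y in {0, 1} and h = sigmoid(score)
h = 1.0 / (1.0 + np.exp(-scores))
loss_ce = -np.mean(y01 * np.log(h) + (1 - y01) * np.log(1 - h))

# The two agree up to the 1/N averaging factor
print(np.isclose(loss_pm / len(scores), loss_ce))  # True
```

Up to the $\frac{1}{N}$ averaging, the two values coincide, so an optimizer will find the same $\theta$ with either formulation.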
