Solved – How to interpret smooth l1 loss

deep learning, optimization, regularization

I was hoping to understand what the smooth $L_1$ loss does, but I'm not able to find any good explanation online. I know that $L_1$ loss calculates the absolute error, but what is the use of smooth $L_1$? Any answers would be helpful.

Best Answer

Smooth L1-loss can be interpreted as a combination of L1-loss and L2-loss. It behaves like L1-loss when the absolute value of the argument is high, and like L2-loss when the absolute value of the argument is close to zero. The equation is:

$L_{1;\text{smooth}}(x) = \begin{cases} |x| & \text{if } |x| > \alpha, \\ \frac{1}{\alpha}x^2 & \text{if } |x| \leq \alpha. \end{cases}$

$\alpha$ is a hyper-parameter here and is usually taken as 1. The factor $\frac{1}{\alpha}$ on the $x^2$ term makes the two pieces meet at $|x| = \alpha$, so the function is continuous there.
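As a quick sanity check, here is a minimal NumPy sketch of the piecewise definition above (the function name `smooth_l1` and the default $\alpha = 1$ are just illustrative choices):

```python
import numpy as np

def smooth_l1(x, alpha=1.0):
    """Piecewise smooth L1 from the formula above:
    x^2 / alpha for |x| <= alpha, |x| otherwise (alpha > 0)."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= alpha, x**2 / alpha, np.abs(x))

# Both pieces evaluate to alpha at |x| = alpha, so the loss is continuous:
print(smooth_l1(np.array([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])))
# -> [2.    1.    0.25  0.    0.25  1.    2.  ]
```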

Smooth L1-loss combines the advantages of L1-loss (steady gradients for large values of $x$) and L2-loss (fewer oscillations during updates when $x$ is small).
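To make the gradient claim concrete, differentiate the piecewise definition: the gradient is $2x/\alpha$ in the quadratic region and $\operatorname{sign}(x)$ in the linear region. A small sketch (again just illustrative, with a hypothetical `smooth_l1_grad` name):

```python
import numpy as np

def smooth_l1_grad(x, alpha=1.0):
    """Gradient of the piecewise loss above: 2x/alpha in the
    quadratic region, sign(x) in the linear region."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= alpha, 2.0 * x / alpha, np.sign(x))

# The gradient stays bounded for arbitrarily large |x| (steady L1-style
# updates) and shrinks linearly to 0 near the minimum (gentle L2-style
# updates instead of the constant-magnitude +/-1 of pure L1).
print(smooth_l1_grad(np.array([-3.0, -0.25, 0.0, 0.25, 3.0])))
# -> [-1.   -0.5   0.    0.5   1. ]
```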

Another form of smooth L1-loss is the Huber loss. It achieves the same thing: quadratic near zero, linear in the tails. Taken from Wikipedia, the Huber loss is

$ L_\delta (a) = \begin{cases} \frac{1}{2}{a^2} & \text{for } |a| \le \delta, \\ \delta (|a| - \frac{1}{2}\delta), & \text{otherwise.} \end{cases} $
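For completeness, here is a direct sketch of the Wikipedia formula (the `huber` name and NumPy implementation are mine); in practice you would usually reach for a library version instead, e.g. PyTorch's `torch.nn.SmoothL1Loss` or `torch.nn.HuberLoss`:

```python
import numpy as np

def huber(a, delta=1.0):
    """Huber loss as defined above: 0.5*a^2 for |a| <= delta,
    delta * (|a| - 0.5*delta) otherwise."""
    a = np.asarray(a, dtype=float)
    return np.where(np.abs(a) <= delta,
                    0.5 * a**2,
                    delta * (np.abs(a) - 0.5 * delta))

# At |a| = delta both the values and the derivatives of the two pieces
# match, so Huber is not just continuous but continuously differentiable:
print(huber(np.array([-2.0, -1.0, 0.5, 2.0])))
# -> [1.5    0.5    0.125  1.5  ]
```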
