Expected Value – Percentile Loss Functions Explained in Detail

expected-value, loss-functions

The solution to the problem:

$$ \min_{m} \; E[|m-X|] $$

is well known to be the median of $X$, but what does the loss function look like for other percentiles? For example, the 25th percentile of $X$ is the solution to:

$$ \min_{m} \; E[ L(m,X) ] $$

What is $L$ in this case?

Best Answer

Let $I$ be the indicator function: it is equal to $1$ for true arguments and $0$ otherwise. Pick $0\lt\alpha\lt 1$ and set

$$\Lambda_\alpha(x)=\alpha x\, I(x\ge 0) - (1-\alpha)x\, I(x\lt 0).$$

This figure plots $\Lambda_{1/5}$. It uses an accurate aspect ratio to help you gauge the slopes, which equal $-4/5$ on the left side and $+1/5$ on the right. In this case excursions above $0$ are heavily downweighted compared to excursions below $0$.
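As a minimal sketch, the definition of $\Lambda_\alpha$ can be written in Python with NumPy. The function name `pinball` is an illustrative choice and does not appear in the original answer.

```python
import numpy as np

def pinball(x, alpha):
    """Lambda_alpha(x): alpha*x for x >= 0 and -(1 - alpha)*x for x < 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, alpha * x, -(1 - alpha) * x)

# For alpha = 1/5 the slope is -4/5 on the left and +1/5 on the right,
# so a unit step below 0 costs four times as much as a unit step above 0.
values = pinball([-1.0, 0.0, 1.0], 0.2)
print(values)
```

Evaluating at $x=-1$ and $x=1$ reads off the two slopes directly, since $\Lambda_\alpha(0)=0$.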

This is a natural function to try because it weights values $x$ that exceed $0$ differently than $x$ that are less than $0$. Let's compute the associated loss and then optimize it.

Writing $F$ for the distribution function of $X$ and setting $L_\alpha(m,x) = \Lambda_\alpha(x-m)$, compute

$$\eqalign{ \mathbb{E}_F(L_\alpha(m,X))&=\int_\mathbb{R} \Lambda_\alpha(x-m)dF(x)\\ &=\alpha\int_\mathbb{R} I(x\ge m)(x-m) dF(x) - (1-\alpha)\int_\mathbb{R} (x-m)I(x\lt m) dF(x)\\ &=\alpha\int_m^\infty(x-m)dF(x) - (1-\alpha)\int_{-\infty}^m(x-m) dF(x). }$$

As $m$ varies in this illustration with the Standard Normal distribution $F$, the total probability-weighted area of $\Lambda_{1/5}$ is plotted. (The curve is the graph of $\Lambda_{1/5}(x-m)dF(x)$.) The right-hand plot for $m=0$ most clearly shows the effect of downweighting the positive values, for without this downweighting the plot would be symmetric about the origin. The middle plot shows the optimum, where the total amount of blue ink (representing $\mathbb{E}_F(L_{1/5}(m,X))\ $) is as small as possible.

This function is differentiable and so its extrema can be found by inspecting the critical points. Applying the Chain Rule and the Fundamental Theorem of Calculus to obtain the derivative with respect to $m$ gives

$$\eqalign{ \frac{\partial}{\partial m}\mathbb{E}_F(L_\alpha(m,X))&=\alpha\left(0-\int_m^\infty dF(x)\right) - (1-\alpha)\left(0 - \int_{-\infty}^m dF(x)\right)\\ &= F(m) - \alpha. }$$
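The formula $F(m)-\alpha$ can be checked numerically. The sketch below assumes the Standard Normal distribution, approximates the expected loss with a Riemann sum on a fixed grid, and compares a central finite difference in $m$ with $F(m)-\alpha$. The helper names, grid, and step size are illustrative assumptions, not part of the original answer.

```python
import math
import numpy as np

# Integration grid for the Standard Normal density (assumed wide enough).
GRID = np.linspace(-10.0, 10.0, 200_001)
DX = GRID[1] - GRID[0]
DENSITY = np.exp(-0.5 * GRID**2) / math.sqrt(2.0 * math.pi)

def expected_loss(m, alpha):
    """Approximate E[Lambda_alpha(X - m)] for X ~ N(0, 1) by a Riemann sum."""
    d = GRID - m
    loss = np.where(d >= 0, alpha * d, -(1 - alpha) * d)
    return float(np.sum(loss * DENSITY) * DX)

def normal_cdf(m):
    """Standard Normal distribution function F(m)."""
    return 0.5 * (1.0 + math.erf(m / math.sqrt(2.0)))

alpha, h = 0.2, 1e-3
for m in (-1.5, -0.5, 0.0, 1.0):
    # Central finite difference of the expected loss with respect to m.
    numeric = (expected_loss(m + h, alpha) - expected_loss(m - h, alpha)) / (2 * h)
    print(m, numeric, normal_cdf(m) - alpha)
```

The two printed columns should agree to several decimal places, up to discretization error.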

Setting this derivative to zero gives the equation $F(m)=\alpha$. For continuous distributions this always has a solution $m$ which, by definition, is an $\alpha$ quantile of $X$. For non-continuous distributions the equation might not have a solution, but there will be at least one $m$ for which $F(x)-\alpha\lt 0$ for all $x\lt m$ and $F(x)-\alpha\ge 0$ for all $x\ge m$: this also (by definition) is an $\alpha$ quantile of $X$.
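The same conclusion can be illustrated on data. The sketch below assumes a simulated Standard Normal sample with $\alpha = 1/4$, minimizes the average loss by brute force over the sample points, and compares the minimizer with the order statistic of rank $\lceil \alpha n\rceil$. The sample size, seed, and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.25
sample = rng.standard_normal(1001)  # n chosen so that alpha * n is not an integer

def mean_loss(m):
    """Average of Lambda_alpha(x - m) over the sample."""
    d = sample - m
    return np.mean(np.where(d >= 0, alpha * d, -(1 - alpha) * d))

# The average loss is piecewise linear and convex in m, so its minimum
# is attained at one of the sample points; a brute-force search suffices.
losses = np.array([mean_loss(m) for m in sample])
m_star = sample[np.argmin(losses)]

# Sample alpha quantile: the order statistic of rank ceil(alpha * n).
k = int(np.ceil(alpha * sample.size))
order_stat = np.sort(sample)[k - 1]

print(m_star, order_stat)
```

Both printed numbers should coincide and lie near the 25th percentile of the Standard Normal distribution, roughly $-0.67$.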

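Finally, because $\alpha\ne 0$ and $\alpha\ne 1$, it is clear that neither $m\to-\infty$ nor $m\to\infty$ will minimize this loss. Moreover, since $F(m)-\alpha$ is nondecreasing in $m$, the loss decreases up to the quantile and increases beyond it, so the critical point is a minimum rather than a maximum. That exhausts the inspection of the critical points, showing that $\Lambda_\alpha$ fits the bill.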

As a special case, $\mathbb{E}_F(2L_{1/2}(m,X)) = \mathbb{E}_F\left(\left|m-X\right|\right)$ is the loss exhibited in the question.
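To spell out this special case, substituting $\alpha=1/2$ into the definition of $\Lambda_\alpha$ gives, for every $x$ and $m$,

$$2\Lambda_{1/2}(x-m) = (x-m)\,I(x\ge m) - (x-m)\,I(x\lt m) = |x-m| = |m-x|.$$

Taking expectations of both sides yields the identity above, so the median minimizes both losses.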
