MAE (mean absolute error)

machine learning, statistics

For objects $x_1,\dots,x_n$ with correct answers $y_1,\dots,y_n \in \mathbb{R}$, construct a constant model $a(x)=c$ that minimizes the loss function
$$MAE=\frac{1}{n}\sum_{i=1}^{n}|y_i-c|$$

As I understand it, I need to take the derivative with respect to $c$ and find the minimum, but setting the derivative to zero gives an equation that doesn't seem to have a solution:
$$(MAE)'= \frac{1}{n}\sum_{i=1}^{n}\frac{-y_i+c}{|y_i-c|}$$
Instead, I solved the task using the inequality $$\sum_{i=1}^{n}|y_i-c|\geq\left|\sum_{i=1}^{n}y_i-n \cdot c\right|$$ and got the answer $$c=\frac{\sum_{i=1}^{n}y_i}{n}$$

Is this correct?
Thanks for the help.

Best Answer

Relabel your $y_i$ in increasing order: $y_1 \le y_2 \le \dots \le y_n$. When $c < y_1$, every term satisfies $|y_i - c| = y_i - c$, so $$MAE = \frac 1n \sum_{i=1}^n (y_i - c)$$ and $MAE' = -1$. That means the $MAE$ is going down as $c$ increases.

For $y_k < c < y_{k+1}$, $$MAE = \frac 1n \left(\sum_{i=1}^k (c-y_i) + \sum_{i=k+1}^n (y_i - c)\right)$$ and $MAE' = \frac{k - (n-k)}{n} = \frac{2k - n}{n}$. When $2k < n$, this is still negative, meaning that increasing $c$ still decreases $MAE$. But when $2k > n$, the derivative is positive, at which point the $MAE$ is increasing as $c$ increases.

So where do you think the minimum will be?
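
If you want to check your guess numerically, here is a minimal Python sketch. The sample values and the helper names `mae` and `slope` are my own illustrative assumptions, not part of the question. It evaluates the slope $\frac{2k-n}{n}$ at a few values of $c$ lying between the data points, and compares the $MAE$ at the mean with the $MAE$ at the median.

```python
import statistics


def mae(ys, c):
    """Mean absolute error of the constant prediction c on the targets ys."""
    return sum(abs(y - c) for y in ys) / len(ys)


def slope(ys, c):
    """Derivative of the MAE at c, assuming c differs from every y_i.

    k counts the targets strictly below c, so the slope is (2k - n) / n.
    """
    n = len(ys)
    k = sum(1 for y in ys if y < c)
    return (2 * k - n) / n


# Hypothetical sample, chosen so that the mean and the median differ clearly.
ys = [1.0, 2.0, 4.0, 10.0, 50.0]

mean_c = statistics.mean(ys)
median_c = statistics.median(ys)

print("MAE at mean:  ", mae(ys, mean_c))
print("MAE at median:", mae(ys, median_c))

# The sign of the slope shows where the MAE stops decreasing.
for c in [0.0, 1.5, 3.0, 7.0, 30.0, 60.0]:
    print(f"c = {c:5.1f}  slope = {slope(ys, c):+.1f}")
```

On this sample the slope changes sign as $c$ crosses the middle data point, and the $MAE$ at the median is smaller than at the mean, so the triangle-inequality argument in the question does not give the minimizer in general.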
