Regression – Understanding Linear Regression Cost Function

loss-functions, regression

I'm looking at plain linear regression and was wondering about the specifics of the cost function.

The cost function associated with simple linear regression is given by:

$$J(\theta) = \frac{1}{2n}\sum_{i=1}^n(y_i - \theta^T x_i)^2$$

Where does the ($\tfrac{1}{2n}$) term come from? Why not just $\tfrac{1}{n}$ so we achieve the average?

Best Answer

Linear regression minimizes squared error

$$ J(\theta) = \sum_{i=1}^n (y_i - \theta^T x_i)^2 $$

You might want to put $1/n$ in front so that its units don't depend on the sample size $n$. The $1/2$ comes from the derivative

$$ \frac{d}{dx} \big(x^2\big) = 2x $$

whereas

$$ \frac{d}{dx} \Big( \frac{1}{2} x^2 \Big) = x $$

so putting $1/2$ in front makes writing the derivatives simpler, because you don't need to carry a factor of $2$ around. Using $1/(2n)$ accomplishes both.
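Concretely, applying this with the $1/(2n)$ factor and the notation from the question, the gradient comes out without any stray factor of $2$:

$$ \nabla_\theta J(\theta) = -\frac{1}{n}\sum_{i=1}^n \big(y_i - \theta^T x_i\big)\, x_i $$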

No matter which form you choose (sum of squared errors, $1/n$, $1/2$, $1/(2n)$), they all have the same minimizer, because multiplying a function by a positive constant does not change where it attains its minimum, so they are equivalent.
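If you want to convince yourself numerically, here is a minimal sketch (assuming NumPy and SciPy are available; the data and names are made up for illustration) that minimizes all three scalings and recovers the same $\theta$:

```python
import numpy as np
from scipy.optimize import minimize

# Simulated data: intercept plus one feature, known true coefficients.
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
theta_true = np.array([1.0, 2.0])
y = X @ theta_true + rng.normal(scale=0.5, size=n)

def sse(theta):
    # Plain sum of squared errors.
    r = y - X @ theta
    return r @ r

objectives = {
    "SSE":      sse,
    "MSE":      lambda th: sse(th) / n,
    "half-MSE": lambda th: sse(th) / (2 * n),
}

for name, f in objectives.items():
    res = minimize(f, x0=np.zeros(2))
    # All three objectives print (approximately) the same theta;
    # only the objective values differ by the constant factor.
    print(name, res.x)
```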
