Regression – Understanding Linear Regression Cost Function


I'm looking at plain linear regression and was wondering about the specifics of the cost function.

The cost function associated with simple linear regression is given by:

$$J(\theta) = \frac{1}{2n}\sum_{i=1}^n (y_i - \theta^T x_i)^2$$

Where does the $\tfrac{1}{2n}$ term come from? Why not just $\tfrac{1}{n}$, so that we get the average?

Best Answer

Linear regression minimizes the sum of squared errors

$$ J(\theta) = \sum_{i=1}^n (y_i - \theta^T x_i)^2 $$

You might want to put $1/n$ in front, so that its scale doesn't depend on the sample size $n$. The $1/2$ comes from the derivative

$$ \frac{d}{dx} \big(x^2\big) = 2x $$

whereas

$$ \frac{d}{dx} \Big( \frac{1}{2} x^2 \Big) = x $$

so putting $1/2$ in front makes the derivative simpler to write, because the factor of $2$ cancels. Using $1/(2n)$ does both: it averages over the sample and cancels the $2$.
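
As a quick worked step (my own expansion, using the same symbols as the cost function above), differentiating the $1/(2n)$ form with respect to $\theta$ shows the $2$ from the power rule cancelling against the $1/2$:

$$ \nabla_\theta J(\theta) = \frac{1}{2n}\sum_{i=1}^n 2\,(y_i - \theta^T x_i)(-x_i) = -\frac{1}{n}\sum_{i=1}^n (y_i - \theta^T x_i)\,x_i $$

With a plain $1/n$ in front, the same calculation would leave a factor of $2$ in the gradient.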

No matter which form you choose (the plain sum of squared errors, or that sum scaled by $1/n$, $1/2$, or $1/(2n)$), the minimizing $\theta$ is the same, because multiplying a function by a positive constant changes its value at the minimum but not where the minimum occurs. In that sense the forms are equivalent.
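
If you want to convince yourself numerically, here is a minimal sketch in Python with NumPy. The data are synthetic, and the names `cost`, `gradient`, and `scales` are just illustrative choices of mine, not anything standard. It fits $\theta$ once by least squares and then checks that the gradient of every scaled version of the cost is (numerically) zero at that same point.

```python
import numpy as np

# Synthetic data, made up purely for illustration.
rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
true_theta = np.array([1.5, -2.0, 0.5])
y = X @ true_theta + 0.1 * rng.normal(size=n)


def cost(theta, scale):
    """Scaled sum of squared errors: scale * sum_i (y_i - theta^T x_i)^2."""
    residuals = y - X @ theta
    return scale * np.sum(residuals ** 2)


def gradient(theta, scale):
    """Gradient of the scaled cost with respect to theta."""
    residuals = y - X @ theta
    return -2.0 * scale * (X.T @ residuals)


# Least-squares solution, i.e. the minimizer of the unscaled sum of squares.
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# The four scalings mentioned above.
scales = {"sum": 1.0, "1/n": 1.0 / n, "1/2": 0.5, "1/(2n)": 1.0 / (2 * n)}

# Norm of each scaled cost's gradient, evaluated at the same theta_hat.
grad_norms = {
    name: float(np.linalg.norm(gradient(theta_hat, s)))
    for name, s in scales.items()
}

for name, s in scales.items():
    print(f"{name:>7}: cost = {cost(theta_hat, s):.6f}, |grad| = {grad_norms[name]:.2e}")
```

The printed cost values differ from one scaling to the next, but each gradient norm should be zero up to floating-point error, so all four forms are minimized at the same `theta_hat`.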
