[Math] Understanding the least squares regression formula

least squares, linear regression, statistics


I've watched the following tutorial on it, but the formula itself is not explained: https://www.youtube.com/watch?v=Qa2APhWjQPc

I understand the intuition behind finding a line that "best fits" the data set, i.e. the line for which the error is minimised (image below).

(image omitted)

However, I don't see how the formula relates to that intuition. Could someone explain the formula, as I can't visualise what it's trying to achieve? A simple gradient is $dy/dx$, so wouldn't we just compute $\sum(Y - y) \div \sum(X - x)$, where $Y$ and $X$ are the centroid values (average values)? By my logic, that would be how you calculate the average gradient. Could someone explain this to me?

Best Answer

Our cost function is:

$J(m,c) = \sum (mx_i +c -y_i)^2 $
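In code, this cost is just the sum of squared vertical distances between the line $y = mx + c$ and the data points (a minimal sketch; the `cost` function and the sample data are illustrative, not part of the answer):

```python
# Illustrative sketch of the cost J(m, c) = sum((m*x_i + c - y_i)^2):
# the sum of squared vertical distances between the line and the points.
def cost(m, c, xs, ys):
    return sum((m * x + c - y) ** 2 for x, y in zip(xs, ys))

# Points lying exactly on y = 2x + 1 give zero cost:
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
print(cost(2, 1, xs, ys))  # 0 for a perfect fit
print(cost(2, 0, xs, ys))  # positive for any other line
```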

To minimize it we equate the gradient to zero:

\begin{equation*} \frac{\partial J}{\partial m}=\sum 2x_i(mx_i +c -y_i)=0 \end{equation*}

\begin{equation*} \frac{\partial J}{\partial c}=\sum 2(mx_i +c -y_i)=0 \end{equation*}
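As a numerical sanity check: both gradient equations are linear in $(m, c)$, so they can be solved directly as a $2\times 2$ linear system (a sketch using NumPy; the data values here are made up for illustration):

```python
import numpy as np

# The two gradient equations are linear in (m, c). Dividing out the 2
# and collecting terms gives the "normal equations":
#   m*sum(x_i^2) + c*sum(x_i) = sum(x_i*y_i)
#   m*sum(x_i)   + c*N        = sum(y_i)
def fit_line(xs, ys):
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    A = np.array([[np.sum(xs ** 2), np.sum(xs)],
                  [np.sum(xs),      len(xs)]])
    b = np.array([np.sum(xs * ys), np.sum(ys)])
    m, c = np.linalg.solve(A, b)
    return m, c

# Made-up data; at the solution, both gradients are (numerically) zero.
m, c = fit_line([1, 2, 3, 4], [2, 4, 5, 8])
```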

Now we solve for $c$ and $m$. Let's find $c$ from the second equation above:

\begin{equation*} \sum 2(mx_i +c -y_i)=0 \end{equation*}

\begin{equation*} \sum (mx_i +c -y_i)=cN+\sum(mx_i - y_i)=0 \end{equation*}

\begin{equation*} c = \frac{1}{N}\sum(y_i - mx_i)=\frac{1}{N}\sum y_i-m\frac{1}{N}\sum x_i=\bar{y}-m\bar{x} \end{equation*}

Now substitute the value of $c$ into the first equation:

\begin{equation*} \sum 2x_i(mx_i+c-y_i)=0 \end{equation*}

\begin{equation*} \sum x_i(mx_i+c-y_i) = \sum x_i(mx_i+ \bar{y}-m\bar{x} - y_i)= m\sum x_i(x_i-\bar{x}) - \sum x_i(y_i-\bar{y})=0 \end{equation*}

\begin{equation*} m = \frac{\sum x_i(y_i-\bar{y})}{\sum x_i(x_i-\bar{x})} =\frac{\sum \left((x_i-\bar{x}) + \bar{x}\right)(y_i-\bar{y})}{\sum \left((x_i-\bar{x}) + \bar{x}\right)(x_i-\bar{x})} =\frac{\sum (x_i-\bar{x})(y_i-\bar{y}) + \bar{x}\sum (y_i-\bar{y})}{\sum (x_i-\bar{x})^2 + \bar{x}\sum(x_i-\bar{x})} \end{equation*}

The leftover sums vanish, since $\sum (y_i-\bar{y}) = \sum y_i - N\bar{y} = N\bar{y} - N\bar{y} = 0$ and likewise $\sum (x_i-\bar{x}) = 0$, leaving

\begin{equation*} m = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sum (x_i-\bar{x})^2} \end{equation*}
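The slope and intercept formulas derived above can be checked numerically: `np.polyfit` with degree 1 minimises the same squared error, so it should agree with the closed form (a sketch; the data values are made up):

```python
import numpy as np

# Closed-form least-squares slope and intercept:
#   m = sum((x_i - xbar)*(y_i - ybar)) / sum((x_i - xbar)^2)
#   c = ybar - m*xbar
xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # illustrative data
ys = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
xbar, ybar = xs.mean(), ys.mean()
m = np.sum((xs - xbar) * (ys - ybar)) / np.sum((xs - xbar) ** 2)
c = ybar - m * xbar

# np.polyfit(xs, ys, 1) minimises the same sum of squared residuals,
# so it returns the same slope and intercept (slope first).
m_ref, c_ref = np.polyfit(xs, ys, 1)
```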
