I'm confused about the Least Squares Regression Derivation (Linear Algebra)

least squares, linear algebra, MATLAB, vector-spaces

I am having trouble understanding the Least Squares Regression Derivation (Linear Algebra) well enough to code it in MATLAB. As far as I understand, you take the residual value = (yhat – y)^2, where yhat is a summation of alpha * function(x). So to find alpha, which is the parameter, do I need to sum the residual values? I don't understand this part. The book also says there are more data points than basis functions, which I don't get. I am trying to learn from the SB MATLAB book.

The book also states: "from observation, the vector in the range of A, Ŷ, that is closest to Y is the one that can point perpendicularly to Y. Therefore, we want a vector Y − Ŷ that is perpendicular to the vector Ŷ." What does it mean by "observation"? How is it mathematically true?

Best Answer

You sum the expression involving the residual values. This expression involves the alpha that you want to find the value of.

Taking the derivative of this expression with respect to alpha and setting that to zero gives you an equation that determines alpha.

(In response to the OP's question)

Simple case: Fit $ax+b$ to a set of $y$ values.

Residual at $(x_i, y_i)$ is $ax_i+b - y_i$.

Sum of squares of residuals is $D =\sum_{i=1}^n (ax_i+b-y_i)^2 $.

Take the derivative with respect to each parameter. Note that, by the chain rule, $\frac{\partial f^2(g(x))}{\partial x} =2f(g(x))f'(g(x))g'(x) $.

$D_a =\sum_{i=1}^n 2x_i(ax_i+b-y_i) =2a\sum_{i=1}^n x_i^2+2b\sum_{i=1}^nx_i-2\sum_{i=1}^nx_iy_i $.

$D_b =\sum_{i=1}^n 2(ax_i+b-y_i) =2a\sum_{i=1}^n x_i+2b\sum_{i=1}^n1-2\sum_{i=1}^ny_i $.

Set these equal to zero and solve for $a$ and $b$.
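Since you want to code this, here is a minimal sketch of that step in plain Python (not MATLAB; the same sums carry over directly). Setting $D_a = 0$ and $D_b = 0$ and dividing by 2 gives two linear equations in $a$ and $b$, which the code solves by Cramer's rule. The function name `fit_line` and the sample data are my own assumptions for illustration, not anything from your book.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing sum((a*x + b - y)**2) over the data."""
    n = len(xs)
    sxx = sum(x * x for x in xs)              # sum of x_i^2
    sx = sum(xs)                              # sum of x_i
    sxy = sum(x * y for x, y in zip(xs, ys))  # sum of x_i * y_i
    sy = sum(ys)                              # sum of y_i

    # D_a = 0 and D_b = 0, each divided by 2:
    #   a*sxx + b*sx = sxy
    #   a*sx  + b*n  = sy
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")

    # Cramer's rule for the 2x2 system above
    a = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b


# Hypothetical data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(a, b)
```

Because the sample points lie exactly on a line, the fit should recover that line's slope and intercept. With noisy data the same code returns the parameters that make the sum of squared residuals as small as possible.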

Simpler case: Fit $ax$ to a set of $y$ values. This is a line through the origin.

Residual at $(x_i, y_i)$ is $ax_i - y_i$.

Sum of squares of residuals is $D =\sum_{i=1}^n (ax_i-y_i)^2 $.

Take derivative with respect to the parameter.

$D_a =\sum_{i=1}^n 2x_i(ax_i-y_i) =2a\sum_{i=1}^n x_i^2-2\sum_{i=1}^nx_iy_i $.

Equating this to zero, $a =\dfrac{\sum_{i=1}^nx_iy_i}{\sum_{i=1}^n x_i^2} $
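A short Python sketch of this one-parameter case, in case it helps to see the formula as code. It computes $a$ from the closed form and then evaluates $D_a$ at that value as a sanity check; it should come out at zero up to rounding. The names `fit_through_origin` and `d_a` and the data are assumptions made up for the example.

```python
def fit_through_origin(xs, ys):
    """Return a minimizing sum((a*x - y)**2): a = sum(x*y) / sum(x^2)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x values are zero; a is not determined")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


def d_a(a, xs, ys):
    """Derivative of the sum of squared residuals with respect to a."""
    return sum(2 * x * (a * x - y) for x, y in zip(xs, ys))


# Hypothetical noisy data roughly following y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
a = fit_through_origin(xs, ys)
gradient_at_a = d_a(a, xs, ys)  # expected to be zero up to rounding
print(a, gradient_at_a)
```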
