Derivation of the least square estimator for multiple linear regression

linear algebra, linear regression, matrix equations, optimization

I found a derivation of the least squares estimator for multiple linear regression, but there is a part of it that I do not fully understand. The derivation is as follows:

Starting from $y= Xb +\epsilon $, which really is just the same as

$\begin{bmatrix}
y_{1} \\
y_{2} \\
\vdots \\
y_{N}
\end{bmatrix}
=
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1K} \\
x_{21} & x_{22} & \cdots & x_{2K} \\
\vdots & \vdots & \ddots & \vdots \\
x_{N1} & x_{N2} & \cdots & x_{NK}
\end{bmatrix}
*
\begin{bmatrix}
b_{1} \\
b_{2} \\
\vdots \\
b_{K}
\end{bmatrix}
+
\begin{bmatrix}
\epsilon_{1} \\
\epsilon_{2} \\
\vdots \\
\epsilon_{N}
\end{bmatrix} $
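For concreteness, here is a small numerical sketch of this model in NumPy (the dimensions, coefficients, and random data are only my own illustration, not part of the derivation I found):

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 100, 3                         # illustrative sample size and number of regressors
X = rng.normal(size=(N, K))           # design matrix (N x K)
b_true = np.array([1.5, -2.0, 0.5])   # coefficients chosen for the simulation
eps = rng.normal(scale=0.1, size=N)   # error term

y = X @ b_true + eps                  # the model y = Xb + eps in matrix form
```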

Let $e = y - Xb$; then it all comes down to minimizing $e'e$:

$e'e = \begin{bmatrix}
e_{1} & e_{2} & \cdots & e_{N} \\
\end{bmatrix}
\begin{bmatrix}
e_{1} \\
e_{2} \\
\vdots \\
e_{N}
\end{bmatrix} = \sum_{i=1}^{N}e_{i}^{2}
$
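A quick numerical check of this identity (again using NumPy with an arbitrary residual vector of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(1)
e = rng.normal(size=5)                 # an arbitrary residual vector

quadratic_form = e @ e                 # e'e as an inner product
sum_of_squares = np.sum(e ** 2)        # sum over e_i^2

print(np.isclose(quadratic_form, sum_of_squares))   # True
```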

So minimizing $e'e$ gives us:

$\min_{b} e'e = (y-Xb)'(y-Xb)$

$\min_{b} e'e = y'y - 2b'X'y + b'X'Xb$

$\frac{\partial(e'e)}{\partial b} = -2X'y + 2X'Xb \stackrel{!}{=} 0$

$X'Xb=X'y$

$b=(X'X)^{-1}X'y$
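To convince myself that the final formula works numerically, I can compare the closed-form $b$ against a library least-squares solver (NumPy and the simulated data are my own choices; solving the normal equations is preferred over forming the inverse explicitly):

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 100, 3
X = rng.normal(size=(N, K))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=N)

# closed form b = (X'X)^{-1} X'y, implemented by solving the normal equations
b_normal_eq = np.linalg.solve(X.T @ X, X.T @ y)

# reference solution from a library least-squares solver
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(b_normal_eq, b_lstsq))             # True
print(np.allclose(X.T @ X @ b_normal_eq, X.T @ y))   # normal equations X'Xb = X'y hold
```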

My problem is with the step "$\min_{b} e'e = (y-Xb)'(y-Xb)$". I believe the ' notation here denotes the transpose, but when I expand the brackets in the usual way I do not get the same result as above:
\begin{align*}
min_{b}e'e &= (y-Xb)'(y-Xb)\\
& = y'y - y'Xb - b'X'y + b'X'Xb
\end{align*}
which is not the same as what the derivation gives, "$\min_{b} e'e = y'y - 2b'X'y + b'X'Xb$". Could anyone explain how the derivation arrives at this matrix equation?

Best Answer

This is because $$b'X'y = (y'Xb)'$$ combined with the fact that both expressions are scalars ($1 \times 1$ matrices), and a transposed scalar equals itself. Hence $y'Xb = b'X'y$, so the two middle terms $-y'Xb - b'X'y$ in your expansion combine into $-2b'X'y$.
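A quick numerical illustration of this (NumPy with arbitrary random data, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 10, 3
X = rng.normal(size=(N, K))
y = rng.normal(size=N)
b = rng.normal(size=K)

s1 = b @ X.T @ y            # b'X'y
s2 = y @ X @ b              # y'Xb
print(np.isclose(s1, s2))   # True: both expressions give the same scalar
```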
