Solved – Getting the posterior for Bayesian linear regression with a flat prior

bayesian, regression, self-study

One of my homework problems for an intro Bayesian class is to derive the posterior of $\vec{\beta}$ for a linear regression problem. That is, given:

$Y = X\vec{\beta} + \epsilon$, where:

$Y$ is the response vector ($n \times 1$) for $n$ cases

$X$ is the regressor matrix ($n \times p$) for $n$ cases and $p$ regressors

$\vec{\beta}$ is the coefficient vector ($p \times 1$)

$\epsilon$ is the error term, $\epsilon \sim N(\vec{0}, \sigma^2 I)$

The prior is $P(\vec{\beta}) \propto 1$ (that is, the flat or reference prior).
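
For concreteness, here is a minimal NumPy simulation of this setup (the sample size, number of regressors, $\sigma$, and coefficients are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 50, 3                              # made-up sample size and number of regressors
sigma = 2.0                               # made-up error standard deviation
beta_true = np.array([1.0, -0.5, 3.0])    # made-up coefficient vector (p x 1)

X = rng.normal(size=(n, p))               # regressor matrix (n x p)
eps = rng.normal(scale=sigma, size=n)     # errors ~ N(0, sigma^2 I)
y = X @ beta_true + eps                   # response vector (n x 1)
```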

Using the pdf of a multivariate normal distribution, I started here:

$L(\vec{\beta} \mid \vec{y}, X, \sigma^2) \propto (\sigma^2)^{-n/2}\, e^{-\frac{1}{2\sigma^2}(\vec{y} - X\vec{\beta})'(\vec{y} - X\vec{\beta})}$
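
To convince myself that this kernel really is the normal density up to a constant, I checked it numerically against SciPy (all numbers made up; `multivariate_normal` is only used as a reference implementation):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
n, p, sigma = 20, 3, 1.3                  # made-up dimensions and sigma
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = rng.normal(size=n)

# Log of the expression above: -n*log(sigma) - (y - X beta)'(y - X beta) / (2 sigma^2)
quad = (y - X @ beta) @ (y - X @ beta)
log_kernel = -n * np.log(sigma) - quad / (2 * sigma**2)

# Reference: exact log-density of y ~ N(X beta, sigma^2 I)
ref = multivariate_normal(mean=X @ beta, cov=sigma**2 * np.eye(n)).logpdf(y)

# They should differ only by the constant (n/2) * log(2*pi)
print(np.isclose(log_kernel - ref, (n / 2) * np.log(2 * np.pi)))   # True
```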

Focusing on the matrix algebra in the exponent:

$(\vec{y}' - \vec{\beta}'X')(\vec{y} - X\vec{\beta})$

$\vec{y}'\vec{y} - \vec{\beta}'X'\vec{y} - \vec{y}'X\vec{\beta} + \vec{\beta}'(X'X)\vec{\beta}$
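
To make sure I expanded that correctly, I verified it numerically on arbitrary made-up matrices and vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 3                              # made-up dimensions
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
beta = rng.normal(size=p)

lhs = (y - X @ beta) @ (y - X @ beta)     # (y - X beta)'(y - X beta)
rhs = y @ y - beta @ X.T @ y - y @ X @ beta + beta @ X.T @ X @ beta
print(np.isclose(lhs, rhs))               # True
```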

I'm not sure where to go from here. Instead of completing the square, my professor usually has us get the exponent into the form:

$ax^2 - 2bx + c$, where $x$ is the normally distributed variable and $a$, $b$, $c$ do not contain $x$.

We are told that $a = \text{variance}^{-1}$ and $b = \text{mean}/\text{variance}$, which lets us read off the kernel.
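
To spell out where that rule comes from (one-dimensional case, with $\mu$ and $v$ standing in for the mean and variance): the normal kernel expands as

$e^{-\frac{1}{2v}(x-\mu)^2} = e^{-\frac{1}{2}\left(\frac{1}{v}x^2 - 2\frac{\mu}{v}x + \frac{\mu^2}{v}\right)}$

so matching the exponent against $-\frac{1}{2}(ax^2 - 2bx + c)$ gives $a = 1/v$ and $b = \mu/v$, i.e. variance $= 1/a$ and mean $= b/a$.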

For this problem, I'm unable to get the exponent into this form, and I would appreciate any help in doing so, even if you just point me to the relevant matrix algebra rules. I'm an undergrad who is deathly afraid of matrices and working to overcome that fear. I've seen some related questions that give the answer, but I chose to ask anyway for two reasons: none that I've seen mention a flat prior, and none go over the matrix algebra in sufficient detail.

I'm going to be hanging around here all night, and am more than happy to address questions or concerns.

Thank you for your time and attention.

Best Answer

I'm posting what I now know to be the answer in case anyone ever stumbles upon this.

Notice that the two linear terms above are transposes of one another, and that each has dimension $1 \times 1$, i.e., each is a scalar. Since the transpose of a scalar is itself, the two terms are equal, and we can combine them:

$\vec{\beta}'X'X\vec{\beta} - \vec{\beta}'X'\vec{y} - \vec{y}'X\vec{\beta} = \vec{\beta}'X'X\vec{\beta} - 2\vec{\beta}'X'\vec{y}$
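
A quick numerical illustration of that scalar-transpose step (made-up numbers):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 50, 3                              # made-up dimensions
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
beta = rng.normal(size=p)

# beta'X'y and y'X beta are both 1x1, and each is the transpose of the other,
# so they are equal as scalars:
print(np.isclose(beta @ X.T @ y, y @ X @ beta))   # True
```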

As I stated in my question, the coefficient of the quadratic term is the inverse of the variance, and the coefficient of the $-2$ linear term is the mean divided by the variance. Keeping track of the $1/\sigma^2$ factor in front of the exponent, $a = X'X/\sigma^2$ and $b = X'\vec{y}/\sigma^2$.

So, variance $= \sigma^2(X'X)^{-1}$

mean $= (X'X)^{-1}X'\vec{y}$ (the $\sigma^2$ cancels, and we know to multiply by $(X'X)^{-1}$ on the left because it only makes sense one way dimensionally), as desired. In other words, conditional on $\sigma^2$, the posterior is $\vec{\beta} \mid \vec{y}, \sigma^2 \sim N\big((X'X)^{-1}X'\vec{y},\ \sigma^2(X'X)^{-1}\big)$.
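
As a sanity check on the result (a sketch with made-up data; `np.linalg.lstsq` is only used as a reference): the posterior mean should coincide with the ordinary least-squares estimate, and the posterior covariance is $\sigma^2(X'X)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, sigma = 200, 3, 1.5                 # made-up values
beta_true = np.array([0.5, -1.0, 2.0])
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(scale=sigma, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
post_mean = XtX_inv @ X.T @ y             # (X'X)^{-1} X'y
post_cov = sigma**2 * XtX_inv             # sigma^2 (X'X)^{-1}

# The posterior mean equals the least-squares solution:
ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(post_mean, ols))        # True
```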
