Assume we have data $D = \{(x_1,y_1), (x_2,y_2),\dots,(x_n,y_n)\}$ where $x_i \in \mathbb{R}^p$ and $y_i \in \mathbb{R}$.
$y_i = w^Tx_i+\epsilon_i$
Using OLS, I can estimate the vector $w$ and hence the values of $y$. How does this estimate differ from the one obtained with Bayesian linear regression when an informative prior is used?
Best Answer
There exists a flat, improper prior (often called a non-informative prior) under which the Bayesian analysis yields exactly the same point estimates as classical regression. Using a proper, informative prior gives different estimates: the posterior mean of the slope is a precision-weighted "average" of the OLS estimate and the prior mean.
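This shrinkage toward the prior can be seen directly in a small NumPy sketch. With a conjugate prior $w \sim N(0, \tau^2 I)$ and known noise variance $\sigma^2$, the posterior mean has a closed form; the data sizes, prior scale $\tau$, and true coefficients below are illustrative assumptions, not anything from the question:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: n samples, p features (hypothetical sizes).
n, p = 50, 3
X = rng.normal(size=(n, p))
w_true = np.array([2.0, -1.0, 0.5])
sigma = 1.0                          # noise standard deviation (assumed known)
y = X @ w_true + rng.normal(scale=sigma, size=n)

# OLS estimate: solves (X^T X) w = X^T y.
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Bayesian posterior mean under the conjugate prior w ~ N(0, tau^2 I):
#   w_post = (X^T X / sigma^2 + I / tau^2)^{-1} (X^T y / sigma^2),
# a precision-weighted compromise between the data fit and the prior mean (0).
tau = 0.5                            # fairly tight (informative) prior
A = X.T @ X / sigma**2 + np.eye(p) / tau**2
w_bayes = np.linalg.solve(A, X.T @ y / sigma**2)

print("OLS:     ", w_ols)
print("Bayesian:", w_bayes)          # shrunk toward the prior mean of zero
```

As $\tau \to \infty$ the prior precision $1/\tau^2$ vanishes and the posterior mean converges to the OLS estimate, which is the flat-prior case mentioned above.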
The interpretation also differs: the Bayesian analysis yields a full posterior density, and hence credible intervals, instead of classical confidence intervals and hypothesis tests. The Bayesian approach can also use a likelihood other than the normal distribution, which gives more flexibility.