I've been looking over some regression models lately and came across one that, although similar, differs from the "standard" simple linear model. I was hoping somebody could help me with some properties I'm confused about.
Assuming the regression form:
$y_{i} = \beta_{0} + \beta_{1}(x_{i}-\bar{x}) + \epsilon_{i}$
with fitted values:
$\hat{y}_{i} = \hat{\beta}_{0} + \hat{\beta}_{1}(x_{i}-\bar{x})$
where $\hat{\beta}_{0} = \bar{y}$ and $\hat{\beta}_{1} = \frac{S_{XY}}{S_{XX}}$
and, from what I've worked out:
${\bf E}[\hat{\beta}_{0}] = \beta_{0}$, ${\bf E}[\hat{\beta}_{1}] = \beta_{1}$
and:
$\text{Var}(y_{i}) = \sigma^{2}$, $\text{Var}(\hat{\beta}_{0}) = \frac{\sigma^{2}}{n}$, $\text{Var}(\hat{\beta}_{1}) = \frac{\sigma^{2}}{S_{XX}}$
How can it be shown that:
(a)
$\text{Cov}(y_{i}, \hat{\beta}_{1}) = \frac{\sigma^{2}(x_{i}-\bar{x})}{\sum (x_{i}-\bar{x})^{2}}$
I know that the covariance formula is given by:
$\text{Cov}(y_{i}, \hat{\beta}_{1}) = {\bf E}[(y_{i} - {\bf E}[y_{i}])(\hat{\beta}_{1} - {\bf E}[\hat{\beta}_{1}])]$
I'm guessing that to yield this result, the covariance formula somehow reduces to the form:
$\text{Cov}(y_{i}, \hat{\beta}_{1}) = (x_{i}-\bar{x})\text{Var}(\hat{\beta}_{1})$
since this would give:
$\text{Cov}(y_{i}, \hat{\beta}_{1}) = \sigma^{2} \frac{(x_{i}-\bar{x})}{\sum (x_{i}-\bar{x})^{2}} = \frac{\sigma^{2}(x_{i}-\bar{x})}{\sum (x_{i}-\bar{x})^{2}}$
However, despite trying, I can't see how to manipulate the covariance formula to yield the desired result.
(b)
$\text{Corr}(\hat{\beta}_{0}, \hat{\beta}_{1}) = 0$
Here, I know that if it can be shown that:
$\text{Cov}(\hat{\beta}_{0}, \hat{\beta}_{1}) = 0$
it follows that:
$\text{Corr}(\hat{\beta}_{0}, \hat{\beta}_{1}) = 0$
However, as in part (a), I'm confused about how to develop the covariance formula accordingly.
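In case it's useful, here is a quick Monte Carlo sanity check I put together (a rough sketch only, assuming i.i.d. normal errors and an arbitrary fixed design of my own choosing) that appears consistent with both claimed identities:

```python
import numpy as np

# Monte Carlo sanity check of the two claimed identities
# (hypothetical setup: arbitrary fixed design, i.i.d. N(0, sigma^2) errors)
rng = np.random.default_rng(0)
n = 10
x = np.linspace(0.0, 9.0, n)
xc = x - x.mean()                 # centered covariate, x_i - xbar
Sxx = np.sum(xc ** 2)
beta0, beta1, sigma = 2.0, 0.5, 1.0

reps = 200_000
y = beta0 + beta1 * xc + rng.normal(0.0, sigma, size=(reps, n))

b1 = (y @ xc) / Sxx               # beta1-hat = S_XY / S_XX per replicate
b0 = y.mean(axis=1)               # beta0-hat = ybar per replicate

i = 3                             # arbitrary observation index
cov_emp = np.cov(y[:, i], b1)[0, 1]
cov_theory = sigma ** 2 * xc[i] / Sxx
print(cov_emp, cov_theory)        # (a): these should agree closely
print(np.cov(b0, b1)[0, 1])       # (b): this should be near zero
```

The empirical covariances match $\frac{\sigma^{2}(x_{i}-\bar{x})}{S_{XX}}$ and $0$ respectively, but of course this doesn't tell me *why*.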
Best Answer
Here are some hints:
For (a), try substituting the fitted decomposition $y_i = \hat{\beta}_0 + \hat{\beta}_1(x_i - \bar{x}) + e_i$, where $e_i$ is the residual. By bilinearity of covariance you get: \begin{equation} \text{Cov}(y_i,\hat{\beta}_1) = \text{Cov}(\hat{\beta}_0,\hat{\beta}_1) + (x_i - \bar{x})\text{Cov}(\hat{\beta}_1,\hat{\beta}_1) + \text{Cov}(e_i,\hat{\beta}_1). \end{equation} To reduce this to what you want, you need part (b) to show $\text{Cov}(\hat{\beta}_0,\hat{\beta}_1) = 0$ (and, similarly, that $\text{Cov}(e_i,\hat{\beta}_1) = 0$).

For (b), remember that asymptotically: \begin{equation} (\hat{\beta}_0,\hat{\beta}_1)^T \sim \text{MVN}_2 \big( \boldsymbol{\beta}_0, I_E(\boldsymbol{\beta}_0)^{-1}\big), \end{equation} where $\boldsymbol{\beta}_0 = (\beta_0,\beta_1)^T$. Here $I_E(\boldsymbol{\beta}_0)$ denotes the expected Fisher information matrix, given by the expectation of the negative second derivative of the log-likelihood (if you haven't seen this before, look here: http://en.wikipedia.org/wiki/Fisher_information). If you look at the off-diagonals of $I_E^{-1}$ (or, equivalently, of $I_E$ here, since both are diagonal for the centered design), you should be able to find the covariance between the two parameter estimates, which will let you answer both (a) and (b). (By the way, $I_E(\boldsymbol{\beta}_0) \approx I_E(\hat{\boldsymbol{\beta}})$ provided you have a decent sample size.)
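A direct finite-sample route is also available, without the asymptotic argument. Writing the estimators as linear combinations of the independent responses, $\hat{\beta}_0 = \frac{1}{n}\sum_j y_j$ and $\hat{\beta}_1 = \frac{1}{S_{XX}}\sum_j (x_j - \bar{x}) y_j$, bilinearity of covariance gives (b): \begin{equation} \text{Cov}(\hat{\beta}_0,\hat{\beta}_1) = \frac{1}{n S_{XX}} \sum_j (x_j - \bar{x})\,\text{Var}(y_j) = \frac{\sigma^2}{n S_{XX}} \sum_j (x_j - \bar{x}) = 0, \end{equation} since the centered deviations sum to zero. The same expansion handles (a) directly: \begin{equation} \text{Cov}(y_i,\hat{\beta}_1) = \frac{1}{S_{XX}} \sum_j (x_j - \bar{x})\,\text{Cov}(y_i, y_j) = \frac{\sigma^2 (x_i - \bar{x})}{S_{XX}}, \end{equation} because $\text{Cov}(y_i,y_j) = \sigma^2$ when $j = i$ and $0$ otherwise.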
Hope it helps.