Suppose we have a NO INTERCEPT model, $$y_i=\beta x_i+e_i$$
where $e_i \sim N(0,\sigma^2 x_i^h)$, so $e_i$ is equal in distribution to $e_{0i} x_i^{\frac{h}{2}}$, where $e_{0i} \sim N(0,\sigma^2)$.
I know that, in order to obtain the BLUE (best linear unbiased estimator) of the coefficient $\beta$, one sets $w_i = x_i^{-\frac{h}{2}}$.
$h$ can be estimated from the marginal variances at each level of $x$: regressing the log squared residuals using the model $\log(\hat{e}_i^2)=\log(\sigma^2)+h\log(x_i)$ yields $\hat{h}$, so $w_i = x_i^{-\frac{\hat{h}}{2}}$.
This gives the transformed model $$y_i w_i = \beta x_i w_i + e_{0i}$$ whose least-squares solution is $\hat{\beta }=\frac{\sum_{i=1}^{n}w_i^2 x_i y_i}{\sum_{i=1}^{n}w_i^2 x_i^2}$.
I am looking to find $Var(\hat{\beta })=\frac{\sum_{i=1}^{n}w_i^2 x_i^2 }{(\sum_{i=1}^{n}w_i^2 x_i^2)^2}Var(w_i y_i)=\frac{\sigma^2}{\sum_{i=1}^{n}w_i^2 x_i^2}$ (using $Var(w_i y_i)=\sigma^2$ for every $i$), but I'm not completely sure how to correctly estimate $\sigma^2$.
I am estimating $\sigma^2$ using the usual $s^2$ formula on the transformed data, $s^2=\frac{1}{n-1}\sum_{i=1}^{n}(w_i y_i -\hat{\beta}w_i x_i)^2$. Is this correct, or should I divide by $n-2$ because I've estimated TWO parameters ($\beta$ and $h$)?
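The two-step procedure described above can be sketched as follows (a minimal Python sketch on simulated data; the true values $\beta=2$, $\sigma=0.5$, $h=1.5$ and the use of `numpy.linalg.lstsq` for the log-residual regression are illustrative assumptions, not part of the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data; beta=2, sigma=0.5, h=1.5 are assumptions for illustration.
n, beta, sigma, h = 200, 2.0, 0.5, 1.5
x = rng.uniform(0.5, 3.0, n)           # positive x, so x**h is well defined
y = beta * x + rng.normal(0.0, sigma * x ** (h / 2), n)

# Step 1: OLS through the origin, then regress the log squared residuals on
# log(x); the slope estimates h, the intercept estimates log(sigma^2).
b_ols = np.sum(x * y) / np.sum(x ** 2)
resid = y - b_ols * x
A = np.column_stack([np.ones(n), np.log(x)])
(log_s2, h_hat), *_ = np.linalg.lstsq(A, np.log(resid ** 2), rcond=None)

# Step 2: weights w_i = x_i^(-h_hat/2), then the WLS estimator of beta.
w = x ** (-h_hat / 2)
b_wls = np.sum(w ** 2 * x * y) / np.sum(w ** 2 * x ** 2)

# sigma^2 from the transformed residuals, dividing by n-1 as in the question
# (whether an extra df should be charged for h_hat is exactly the OP's issue).
s2 = np.sum((w * y - b_wls * w * x) ** 2) / (n - 1)
var_b = s2 / np.sum(w ** 2 * x ** 2)
print(b_wls, h_hat, var_b)
```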
Best Answer
First off, I assume you mean $e_i \sim N(0,\sigma^2|x_i|^h)$, or that $x_i$ is always positive; otherwise you will get negative and/or complex-valued variances.
In practice, when we know heteroskedasticity exists but are unaware of its form, we may use the two-step method of feasible generalized least squares (FGLS). However, FGLS may be inconsistent. So, to @whuber's point, given that you know something about the heteroskedasticity structure, you can construct a consistent maximum likelihood estimator (MLE) for your parameters. The likelihood function has the form $$ L=\prod_{i=1}^{n} \frac{1}{\sigma |x_i|^{h/2}}\phi \bigg(\frac{y_i-x_i'\beta}{\sigma |x_i|^{h/2}}\bigg) $$ where $\phi$ is the standard normal pdf.
Your optimal parameter estimates may then be found by maximizing the log likelihood, i.e. $$ (\hat \beta,\hat \sigma, \hat h)=\arg \max_{\beta,\sigma,h}\bigg\{\sum_{i=1}^{n}\bigg[-\ln\big(\sigma |x_i|^{h/2}\big)+\ln\bigg(\phi \bigg(\frac{y_i-x_i'\beta}{\sigma |x_i|^{h/2}}\bigg)\bigg)\bigg] \bigg\} $$ which simplifies to $$ (\hat \beta,\hat \sigma, \hat h)=\arg \max_{\beta,\sigma,h}\bigg\{-\frac{n}{2}\ln(2\pi)-\sum_{i=1}^{n}\bigg[\ln\big(\sigma |x_i|^{h/2}\big)+\frac{(y_i-x_i'\beta)^2}{2\sigma^2|x_i|^h}\bigg]\bigg\} $$
This is extremely useful, as you can then construct a hypothesis test (e.g. a likelihood ratio or Wald test) for homoskedasticity: $$ H_0: h=0 $$ $$ H_a: h\neq 0 $$ where the null is homoskedasticity.
$Var(\hat \beta)$ may be estimated using the Fisher information equality: the inverse of the negative Hessian of the above log likelihood, evaluated at $(\hat \beta, \hat \sigma, \hat h)$. These estimates may be obtained using built-in optimization and maximum likelihood routines in R, Matlab, Python, etc. Below is an example.
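For example, in Python (a minimal sketch on simulated data; the choice of `scipy.optimize.minimize` with BFGS is an assumption, and BFGS's `hess_inv` is only a rough approximation to the inverse observed information, so a carefully computed numerical Hessian would be more reliable in practice):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)

# Simulated data; beta=2, sigma=0.5, h=1.5 are assumptions for illustration.
n, beta, sigma, h = 300, 2.0, 0.5, 1.5
x = rng.uniform(0.5, 3.0, n)
y = beta * x + rng.normal(0.0, sigma * np.abs(x) ** (h / 2), n)

def negloglik(theta):
    b, log_s, hh = theta                # optimize log(sigma) so sigma > 0
    sd = np.exp(log_s) * np.abs(x) ** (hh / 2)
    return -np.sum(norm.logpdf(y, loc=b * x, scale=sd))

res = minimize(negloglik, x0=[1.0, 0.0, 0.0], method="BFGS")
b_hat, log_s_hat, h_hat = res.x

# BFGS returns an approximation to the inverse Hessian of the negative log
# likelihood, i.e. an estimate of the covariance of (beta, log sigma, h).
se = np.sqrt(np.diag(res.hess_inv))
wald_h = h_hat / se[2]                  # Wald statistic for H0: h = 0
print(b_hat, h_hat, se[0], wald_h)
```

The Wald statistic `wald_h` gives the test of $H_0: h=0$ described above; a likelihood ratio test could equally be formed by refitting with $h$ fixed at zero and comparing the two maximized log likelihoods.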