I am studying the factors influencing the annual salary of employees at an undisclosed bank. The regression model that I have decided to employ is as follows:
\begin{equation}
Y_{k}=\beta_{1}+\beta_{2}E_{k}+\beta_{3}D_{gk}+\beta_{4}D_{mk}+\beta_{5}D_{2k}+\beta_{6}D_{3k}+\varepsilon_{k}
\end{equation}
where $Y_{k}$ is the logarithm of annual salary, $E_{k}$ is the number of years of education, $D_{gk}$ is a gender dummy, $D_{mk}$ is a minority dummy, and where
\begin{equation}
D_{2}=\begin{cases} 1 &\text{Custodial job} \\ 0 & \text{Otherwise} \end{cases}
\end{equation}
and
\begin{equation}
D_{3}=\begin{cases} 1 &\text{Management job} \\ 0 & \text{Otherwise} \end{cases}
\end{equation}
As you know, whenever one deals with GLS, $\Omega$ will almost surely be unknown and thus have to be estimated. In general there are $\frac{n(n+1)}{2}$ parameters to be estimated, which makes it infeasible to obtain a viable estimate from only $n$ observations. This is usually counteracted by imposing some structure on $\Omega$.
In my case, I would like to assume that the disturbance terms $\varepsilon_{k}$ in the above regression model have variance $\sigma_{i}^{2}$, $i=1,2,3$, according to whether the $k$-th employee has a job in category 1, 2, or 3, respectively. Now, we may introduce the transformations $\gamma_{1}=\log (\sigma_{1}^{2})$, $\gamma_{2}=\log(\sigma_{2}^{2}/\sigma_{1}^{2})$, and $\gamma_{3}=\log(\sigma_{3}^{2}/\sigma_{1}^{2})$ so as to enable us to formulate the following model for the variance:
\begin{equation}
\sigma_{k}^{2}= \exp \{ \gamma_{1}+\gamma_{2}D_{2k}+\gamma_{3}D_{3k} \}
\end{equation}
Since $\hat{\beta}_{\mathrm{OLS}}$ is a consistent estimator of $\beta$ even under heteroscedasticity, we have that $\hat{\beta}_{\mathrm{OLS}} \xrightarrow{p} \beta$ as the number of observations increases. We may therefore argue that $e_{k}^{2} \approx \sigma_{k}^{2}$, and so we can regress on information that we already possess.
Summary of procedure
(1) Calculate the OLS estimate.
(2) Calculate the OLS residuals $\mathbf{e}=\mathbf{Y}-\mathbf{X}\hat{\beta}$.
(3) Calculate the OLS estimate of $\gamma$ from $e_{k}^{2}=f_{\gamma}(Z_{k})+\overline{\varepsilon}_{k}$.
(4) Calculate the FGLS estimate as the GLS estimate with $\hat{\Omega}=\Omega(\hat{\gamma})$ in place of $\Omega$.
What I would like to know is whether or not one can perform this estimation using a known function in R, say
gls
? If the answer is yes, then how exactly should I write the call to ensure that my heteroscedasticity assumption is taken into account? Thanks for taking the time! Have a great day!
Best Answer
Estimating Regression Models with Multiplicative Heteroscedasticity
The model that you have described is discussed in Harvey (1976).
Let me rewrite the model $$ \begin{align} \mathbb{E}(Y_i \mid \mathbf{X}_i, \mathbf{Z}_i) &= \mathbf{X}_i'\boldsymbol{\beta} \\ \mathbb{V}(Y_i \mid \mathbf{X}_i, \mathbf{Z}_i) &\equiv \sigma^2_i \\ &= \exp(\mathbf{Z}_i'\boldsymbol{\alpha}) \\ \end{align} $$ Note that it is possible that $\mathbf{X}_i$ and $\mathbf{Z}_i$ have common elements.
Rewriting the conditional mean equation equivalently as $$ \begin{align} Y_i &= \mathbf{X}_i'\boldsymbol{\beta} + \varepsilon_i \\ \mathbb{E}(\varepsilon_i \mid \mathbf{X}_i, \mathbf{Z}_i) &= 0 \end{align} $$
Two-step estimation
The main aim of Harvey (1976) is to provide (efficient) estimates of $\boldsymbol{\alpha}$, rather than to use those estimates to compute a WLS estimator. However, once $\hat{\boldsymbol{\alpha}}$ has been computed, it can be used in that way.
The two-step estimator that you have described follows this procedure:
1. Compute the OLS estimator $\hat{\boldsymbol{\beta}}$, and the squared OLS residuals $\hat{\varepsilon}_i^2$,
2. Compute the estimates $\hat{\boldsymbol{\alpha}}$, from the regression
$$ \log \hat{\varepsilon}_i^2 = \mathbf{Z}_i'\boldsymbol{\alpha} + \nu_i $$
Maximum likelihood estimation
The MLE is claimed to be up to 60% more efficient than the two-step estimator above. Other apparent advantages include that the MLE remains a pseudo-maximum-likelihood estimator under distributional misspecification, and that there is no need to recompute the WLS estimator after computing $\hat{\boldsymbol{\alpha}}$. However, this does not seem to be borne out by my calculations below.
Under conditional normality of the response, the likelihood is very simple to write down and optimize
$$ \begin{align} \log L_i(\boldsymbol{\beta}, \boldsymbol{\alpha}) &= -\frac{1}{2}(\log2 \pi + \mathbf{Z}_i'\boldsymbol{\alpha})\\ &\qquad -\frac{1}{2}\left(\dfrac{(Y_i - \mathbf{X}_i'\boldsymbol{\beta})^2}{\exp(\mathbf{Z}_i'\boldsymbol{\alpha})}\right) \end{align} $$
Two-step estimation: R
To implement this, first let us simulate some heteroskedastic data using the model given, and estimate it using OLS.
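A minimal sketch of that simulation (the sample size, coefficient values, and variable names here are my own illustrative choices, not taken from the question):

```r
# Simulate from the model: log salary with multiplicative heteroskedasticity
# driven by job category. All parameter values are illustrative assumptions.
set.seed(1)
n    <- 1000
educ <- rpois(n, 12)                      # years of education E_k
Dg   <- rbinom(n, 1, 0.5)                 # gender dummy
Dm   <- rbinom(n, 1, 0.3)                 # minority dummy
D2   <- rbinom(n, 1, 0.2)                 # custodial job
D3   <- (1 - D2) * rbinom(n, 1, 0.3)      # management job (exclusive of D2)

alpha  <- c(-2.0, 0.5, 1.0)               # (gamma1, gamma2, gamma3)
sigma2 <- exp(alpha[1] + alpha[2] * D2 + alpha[3] * D3)

y <- 1.0 + 0.08 * educ + 0.10 * Dg - 0.05 * Dm - 0.15 * D2 + 0.40 * D3 +
     rnorm(n, sd = sqrt(sigma2))

ols <- lm(y ~ educ + Dg + Dm + D2 + D3)   # step 1: plain OLS
coef(ols)
```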
Next, we can get the results using the two-step procedure:
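A sketch of the two-step estimator on the same illustrative simulated data (the setup is repeated so the block runs on its own). The intercept correction of $1.2704$ compensates for $\mathbb{E}[\log \chi^2_1] \approx -1.2704$ in the auxiliary regression, as discussed in Harvey (1976); note it only shifts the intercept, so the WLS weights are unchanged up to scale:

```r
# Same illustrative simulated data and OLS fit as in the previous block.
set.seed(1)
n    <- 1000
educ <- rpois(n, 12); Dg <- rbinom(n, 1, 0.5); Dm <- rbinom(n, 1, 0.3)
D2   <- rbinom(n, 1, 0.2); D3 <- (1 - D2) * rbinom(n, 1, 0.3)
sigma2 <- exp(-2.0 + 0.5 * D2 + 1.0 * D3)
y <- 1.0 + 0.08 * educ + 0.10 * Dg - 0.05 * Dm - 0.15 * D2 + 0.40 * D3 +
     rnorm(n, sd = sqrt(sigma2))
ols <- lm(y ~ educ + Dg + Dm + D2 + D3)

# Step 2: regress log squared residuals on Z = (1, D2, D3) to estimate alpha.
aux       <- lm(log(residuals(ols)^2) ~ D2 + D3)
alpha_hat <- coef(aux)
alpha_hat[1] <- alpha_hat[1] + 1.2704     # correct for E[log chi^2_1] = -1.2704

# WLS with the fitted variances as weights.
sigma2_hat <- as.vector(exp(cbind(1, D2, D3) %*% alpha_hat))
fgls <- lm(y ~ educ + Dg + Dm + D2 + D3, weights = 1 / sigma2_hat)
rbind(ols = coef(ols), fgls = coef(fgls))
```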
Maximum likelihood estimation: R
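A sketch of direct maximization of the log-likelihood above via `optim`, again on the same illustrative simulated data; the starting values and the choice of `BFGS` are my own:

```r
# Same illustrative simulated data as above.
set.seed(1)
n    <- 1000
educ <- rpois(n, 12); Dg <- rbinom(n, 1, 0.5); Dm <- rbinom(n, 1, 0.3)
D2   <- rbinom(n, 1, 0.2); D3 <- (1 - D2) * rbinom(n, 1, 0.3)
sigma2 <- exp(-2.0 + 0.5 * D2 + 1.0 * D3)
y <- 1.0 + 0.08 * educ + 0.10 * Dg - 0.05 * Dm - 0.15 * D2 + 0.40 * D3 +
     rnorm(n, sd = sqrt(sigma2))

X <- cbind(1, educ, Dg, Dm, D2, D3)       # mean regressors
Z <- cbind(1, D2, D3)                     # variance regressors

# Negative of the log-likelihood given in the text, summed over i.
negll <- function(theta) {
  b  <- theta[1:6]; a <- theta[7:9]
  lv <- as.vector(Z %*% a)                # log sigma_i^2 = Z_i' alpha
  0.5 * sum(log(2 * pi) + lv + (y - as.vector(X %*% b))^2 / exp(lv))
}

ols <- lm(y ~ educ + Dg + Dm + D2 + D3)
ml  <- optim(c(coef(ols), 0, 0, 0), negll, method = "BFGS")
ml$par[1:6]                               # beta_hat
ml$par[7:9]                               # alpha_hat
```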
I am not entirely sure why the ML results are somewhat farther off than even the OLS results, and I am not ruling out a coding mistake. But as you can see, the two-step estimator does seem to do better than the OLS estimator.