Solved – parameter identification in the context of OLS

econometrics, identifiability, least squares, method of moments, regression

Can someone explain what identification means in the context of an OLS model? I have a fair grasp of the derivation of the estimator, either via the method of moments or by minimizing the sum of squared residuals, but I am failing to grasp which part of this process corresponds to identification. Also, how does identification differ from estimation of the parameters?

Best Answer

Thank you for all the responses. It has been more than a year since I asked this question, and I am now able to provide an answer. The answer below illustrates the issue of identification in the context of treatment evaluation, where the parameter of interest is the causal effect of treatment receipt on some outcome of interest. Such evaluation problems arise frequently when you are trying to estimate the efficacy of a drug by comparing health outcomes of those who received treatment against a control group. They are also frequently encountered in the social sciences, where you might be interested in estimating the treatment effect (causal effect) of some policy intervention (e.g., subsidizing healthcare for a certain group of people) on outcomes such as income, mortality, etc.

Setup:

For simplicity, let the true underlying process be given by the following linear relationship: $$y=\alpha_0+\alpha_1 t + \mathbf{Z}\boldsymbol{\beta}+\varepsilon$$ where $y$ is the outcome, $t$ is a binary indicator for receipt of some treatment of interest, and $\mathbf{Z}$ is a vector of all relevant factors that affect the outcome $y$. Further, let $\varepsilon$ be a normally distributed mean-zero noise term (this noise is truly non-deterministic because $\mathbf{Z}$ is assumed to contain every relevant determinant of $y$, so $E[\varepsilon \mid t, \mathbf{Z}]=0$). Taking conditional expectations of the model at $t=1$ and $t=0$ and differencing then gives $$\alpha_1=E[y \mid t=1, \mathbf{Z}]-E[y \mid t=0, \mathbf{Z}]$$ Since $\mathbf{Z}$ includes all relevant factors that determine the outcome $y$, we interpret $\alpha_1$ as the causal effect of treatment receipt ($t=1$) on the outcome $y$.
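To make this setup concrete, here is a minimal simulation sketch. The parameter values, the single-covariate $\mathbf{Z}$, and the way $t$ depends on $\mathbf{Z}$ are all illustrative assumptions, not part of the original derivation; the point is only that when the full $\mathbf{Z}$ is observed, the OLS coefficient on $t$ recovers $\alpha_1$.

```python
# Minimal sketch (hypothetical numbers): when every relevant determinant Z of y
# is observed, OLS of y on (1, t, Z) recovers the causal coefficient alpha_1.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

z = rng.normal(size=n)                      # the single relevant covariate Z
t = (0.8 * z + rng.normal(size=n) > 0) * 1  # treatment take-up depends on Z
eps = rng.normal(size=n)                    # pure noise, independent of t and Z

alpha0, alpha1, beta = 1.0, 2.0, 1.5        # illustrative true parameters
y = alpha0 + alpha1 * t + beta * z + eps

# OLS of y on [1, t, Z]: the coefficient on t estimates alpha_1
X = np.column_stack([np.ones(n), t, z])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef[1])   # ~2.0, close to the true alpha_1
```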

Empirical Application:

Suppose we are given data on $y$, $t$, and $\mathbf{Z}'$, where $\mathbf{Z}'$ is a subset of $\mathbf{Z}$. In other words, $\mathbf{Z}'$ contains only some of the relevant variables that determine the outcome $y$. Let $\tilde{\mathbf{Z}}$ denote the unobserved components of $\mathbf{Z}$, with corresponding coefficients $\tilde{\boldsymbol{\beta}}$. We need to estimate the causal effect of receiving treatment, i.e., $\alpha_1$, in our empirical exercise. However, given our inability to observe the full vector $\mathbf{Z}$, the best we can do with OLS is to estimate the following: \begin{align} y&=\hat{\alpha}_0+\hat{\alpha}_1 t + \mathbf{Z}'\hat{\boldsymbol{\beta}}+\nu \end{align} where the error term $\nu = \varepsilon+\tilde{\mathbf{Z}}\tilde{\boldsymbol{\beta}}$ absorbs the effect of $\tilde{\mathbf{Z}}$ and is thus no longer pure random noise.

Now notice that \begin{align} E[y \mid t=1, \mathbf{Z}']-E[y \mid t=0, \mathbf{Z}'] &=\hat{\alpha}_1 + \left(E[\nu \mid t=1, \mathbf{Z}']-E[\nu \mid t=0, \mathbf{Z}']\right) \\ & = \hat{\alpha}_1 + \underbrace{\left(E[\tilde{\mathbf{Z}} \mid t=1, \mathbf{Z}']-E[\tilde{\mathbf{Z}} \mid t=0, \mathbf{Z}']\right)\tilde{\boldsymbol{\beta}}}_{\text{bias}} \\ \implies \hat{\alpha}_1 &= \left(E[y \mid t=1, \mathbf{Z}']-E[y \mid t=0, \mathbf{Z}']\right) - \text{bias} \end{align} where the second equality uses $\nu=\varepsilon+\tilde{\mathbf{Z}}\tilde{\boldsymbol{\beta}}$ and $E[\varepsilon \mid t, \mathbf{Z}']=0$. Therefore, the mean difference in outcomes across treatment status, which is what OLS on the observed data recovers, combines the coefficient of interest with a bias stemming from heterogeneity in $\tilde{\mathbf{Z}}$ with respect to treatment status.
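A short simulation sketch of this decomposition (again with made-up numbers and an assumed data-generating process): the unobserved component $\tilde{\mathbf{Z}}$ affects both treatment take-up and the outcome, so the regression that omits it yields a coefficient on $t$ that differs from $\alpha_1$ by roughly the bias term above.

```python
# Sketch of the omitted-variable bias (hypothetical numbers): dropping the
# unobserved component Z~ from the regression biases the coefficient on t.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

z_obs = rng.normal(size=n)                    # observed part Z'
z_tilde = rng.normal(size=n)                  # unobserved part Z~
t = (0.5 * z_obs + 0.8 * z_tilde + rng.normal(size=n) > 0) * 1
eps = rng.normal(size=n)

alpha1, beta_obs, beta_tilde = 2.0, 1.5, 1.0  # illustrative true parameters
y = 1.0 + alpha1 * t + beta_obs * z_obs + beta_tilde * z_tilde + eps

def ols_coef_on_t(*covariates):
    """OLS coefficient on t from regressing y on a constant, t, and covariates."""
    X = np.column_stack([np.ones(n), t, *covariates])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

print(ols_coef_on_t(z_obs, z_tilde))  # ~2.0: the full regression identifies alpha_1
print(ols_coef_on_t(z_obs))           # noticeably above 2.0: the short regression is biased
```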

A simple OLS regression does not allow us to adjust for such biases. Therefore, $\hat{\alpha}_1$ does not provide the causal effect of $t$ on $y$: we have failed to identify our parameter of interest, namely the causal effect of $t$ on $y$. The OLS estimate $\hat{\alpha}_1$ can only be interpreted as an estimate of the association between $y$ and $t$ after adjusting for $\mathbf{Z}'$.

Conclusion:

The above explanation began by defining the parameter of interest as the causal effect of $t$ on $y$. It then illustrated how identification of this parameter can be compromised by omitted variable bias. There are a number of empirical strategies that can be employed to correct for such biases. The most obvious would be to collect data on the missing variables $\tilde{\mathbf{Z}}$; however, this is often infeasible. Another option is to randomize the assignment of $t$, which makes $t$ independent of $\tilde{\mathbf{Z}}$ by construction (a sketch of this case follows below). This works well if you have the time and resources to design your own intervention $t$ and collect outcome data on the subjects of your study. If you are stuck with observational data where $t$ cannot be randomized, you will need empirical strategies that exploit exogenous variation in the assignment of $t$. Social scientists refer to such approaches as quasi-experimental methods.
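Here is a sketch of the randomized case (hypothetical numbers again, under the same kind of assumed data-generating process as above): because $t$ is assigned independently of $\tilde{\mathbf{Z}}$, the short regression recovers $\alpha_1$ even though $\tilde{\mathbf{Z}}$ is never observed.

```python
# Sketch (hypothetical numbers): when t is randomly assigned, it is independent
# of the unobserved Z~ by construction, and the short regression recovers alpha_1.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

z_tilde = rng.normal(size=n)          # relevant but unobserved covariate
t = rng.integers(0, 2, size=n)        # randomized treatment, independent of Z~
eps = rng.normal(size=n)

alpha1, beta_tilde = 2.0, 1.0         # illustrative true parameters
y = 1.0 + alpha1 * t + beta_tilde * z_tilde + eps

# Short regression of y on [1, t] only: no bias, because E[Z~ | t=1] = E[Z~ | t=0]
X = np.column_stack([np.ones(n), t])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef[1])   # ~2.0 despite never observing Z~
```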
