What are we looking for when we want to determine which model is most efficient? My course slides often discuss which model is the most efficient, but I don't really know what they are looking for or how efficiency is determined.
Solved – How do we know if using an IV model is more efficient than OLS
Tags: efficiency, instrumental-variables, least-squares
Related Solutions
This may be considered... cheating, but the OLS estimator is a method-of-moments (MoM) estimator. Consider a standard linear regression specification (with $K$ stochastic regressors, so magnitudes are conditional on the regressor matrix) and a sample of size $n$. Denote by $s^2$ the OLS estimator of the variance $\sigma^2$ of the error term. It is unbiased, so
$$ MSE(s^2) = \operatorname {Var}(s^2) = \frac {2\sigma^4}{n-K} $$
Consider now the MLE of $\sigma^2$. It is
$$\hat \sigma^2_{ML} = \frac {n-K}{n}s^2$$ It is biased. Its MSE is
$$MSE (\hat \sigma^2_{ML}) = \operatorname {Var}(\hat \sigma^2_{ML}) + \Big[E(\hat \sigma^2_{ML})-\sigma^2\Big]^2$$ Expressing the MLE in terms of the OLS estimator and using the expression for $\operatorname{Var}(s^2)$ above, we obtain
$$MSE (\hat \sigma^2_{ML}) = \left(\frac {n-K}{n}\right)^2\frac {2\sigma^4}{n-K} + \left(\frac {K}{n}\right)^2\sigma^4$$ $$\Rightarrow MSE (\hat \sigma^2_{ML}) = \frac {2(n-K)+K^2}{n^2}\sigma^4$$
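As a sanity check (not part of the original answer), a short simulation under an assumed design, with illustrative values of $n$, $K$ and $\sigma^2$, reproduces both formulas:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K, sigma2, reps = 30, 5, 2.0, 100_000    # illustrative values

X = rng.normal(size=(n, K))                 # fixed regressors; we condition on X
M = np.eye(n) - X @ np.linalg.pinv(X)       # residual-maker matrix I - X(X'X)^{-1}X'

eps = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))  # beta cancels: residuals = M eps
rss = ((eps @ M) ** 2).sum(axis=1)
s2 = rss / (n - K)                          # unbiased OLS estimator of sigma^2
s2_ml = (n - K) / n * s2                    # MLE of sigma^2

print("MSE(s^2):  sim %.4f  theory %.4f" % (((s2 - sigma2) ** 2).mean(),
                                            2 * sigma2**2 / (n - K)))
print("MSE(MLE):  sim %.4f  theory %.4f" % (((s2_ml - sigma2) ** 2).mean(),
                                            (2 * (n - K) + K**2) / n**2 * sigma2**2))
```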
We want the conditions (if they exist) under which
$$MSE (\hat \sigma^2_{ML}) > MSE (s^2) \Rightarrow \frac {2(n-K)+K^2}{n^2} > \frac {2}{n-K}$$
$$\Rightarrow 2(n-K)^2+K^2(n-K)> 2n^2$$ $$ 2n^2 -4nK + 2K^2 +nK^2 - K^3 > 2n^2 $$ Cancelling $2n^2$ and dividing through by $K>0$, we obtain $$ -4n + 2K +nK - K^2 > 0 \Rightarrow K^2 - (n+2)K + 4n < 0 $$

Is it feasible for this quadratic in $K$ to take negative values? We need its discriminant to be positive. We have $$\Delta_K = (n+2)^2 -16n = n^2 + 4n + 4 - 16n = n^2 -12n + 4$$ which is another quadratic, in $n$ this time. Its discriminant is $$\Delta_n = 12^2 - 4\cdot 4 = 128$$ so $$n_1,n_2 = \frac {12\pm \sqrt{128}}{2} = 6 \pm 4\sqrt2 \approx \{0.34,\ 11.66\}$$ For integer $n$ from $1$ to $11$, then, $\Delta_K <0$ and the quadratic in $K$ takes only positive values, so we cannot obtain the required inequality. So: we need a sample size of at least $12$ (at $n=12$, $\Delta_K = 4 > 0$).
Given this, the roots of the $K$-quadratic are
$$K_1, K_2 = \frac {(n+2)\pm \sqrt{n^2 -12n + 4}}{2} = \frac n2 +1 \pm \sqrt{\left(\frac n2\right)^2 +1 -3n}$$
Overall: for sample size $n\ge 12$ and number of regressors $K$ such that $K_1 <K< K_2$, equivalently $\lceil K_1\rceil \le K\le \lfloor K_2\rfloor$ (strictly between the roots when a root is an integer), we have $$MSE (\hat \sigma^2_{ML}) > MSE (s^2)$$ For example, if $n=50$ the roots are approximately $4.18$ and $47.82$, so the inequality holds for $5\le K\le 47$. It is interesting that for small numbers of regressors the MLE is better in the MSE sense.
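A quick numeric sketch (mine, with illustrative inputs) recovers this interval; for $n=50$ it returns roots near $4.18$ and $47.82$ and the integer set $\{5,\dots,47\}$:

```python
import math

def k_range(n):
    """Roots of K^2 - (n+2)K + 4n, and the integer K satisfying the inequality."""
    disc = n * n - 12 * n + 4
    if disc < 0:
        return None                          # no K works for n <= 11
    k1 = ((n + 2) - math.sqrt(disc)) / 2
    k2 = ((n + 2) + math.sqrt(disc)) / 2
    ks = [k for k in range(1, n) if k * k - (n + 2) * k + 4 * n < 0]
    return k1, k2, ks

print(k_range(50))   # roots ~4.18 and ~47.82; integer K = 5, ..., 47
```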
ADDENDUM
The equation for the roots of the $K$-quadratic can be written
$$K_1, K_2 = \left(\frac n2 +1\right) \pm \sqrt{\left(\frac n2 +1\right)^2 -4n}$$ Evaluating the quadratic at the candidate boundary values pins the lower root down exactly: at $K=4$ it equals $16-4(n+2)+4n=8>0$ for every $n$, while at $K=5$ it equals $25-5(n+2)+4n=15-n$, which is negative once $n\ge 16$. So the lower root always exceeds $4$, and the MLE is MSE-efficient whenever the number of regressors is at most $4$, for any (finite) sample size.
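The boundary algebra can be confirmed numerically; this small loop (illustrative, not from the answer) evaluates the quadratic at $K=4$ and $K=5$ for several sample sizes:

```python
# Evaluate q(K) = K^2 - (n+2)K + 4n at K = 4 and K = 5 for a sweep of n.
for n in (12, 15, 16, 50, 1000):
    print(n, 16 - 4 * (n + 2) + 4 * n, 25 - 5 * (n + 2) + 4 * n)
# q(4) = 8 for every n; q(5) = 15 - n turns negative once n >= 16.
```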
Mathematical Statistics – Exploring the Concept of Efficiency in Estimation and Statistical Analysis
Efficiency is a "per se" concept in the sense that it measures how variable (and biased) an estimator is around the "true" parameter. There is an actual numeric value of efficiency associated with a given estimator at a given sample-size under a given loss function; this number depends on the estimator AND the sample-size AND the loss function.
Asymptotic efficiency looks at how efficient the estimator becomes as the sample size increases. More important, though, is how rapidly the estimator approaches efficiency, and this can be more difficult to determine.
Relative efficiency looks at how efficient the estimator is relative to an alternative estimator (typically at a GIVEN sample-size).
Efficiency requires the specification of some loss function. Originally this was variance, when only unbiased estimators were considered; these days it is most often MSE (mean squared error, which accounts for both bias and variability). Other loss functions can be used. The classical Cramér-Rao bound applied to unbiased estimators only, but it has been extended to many of these other loss functions (most especially MSE loss).
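To make relative efficiency under MSE loss concrete, here is a simulation sketch (example mine, not from the answer): the sample median versus the sample mean for normal data, where the classical asymptotic relative efficiency of the median is $2/\pi \approx 0.64$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps, mu = 100, 50_000, 3.0              # illustrative values
x = rng.normal(loc=mu, size=(reps, n))

mse_mean = ((x.mean(axis=1) - mu) ** 2).mean()
mse_med = ((np.median(x, axis=1) - mu) ** 2).mean()
print("efficiency of median relative to mean:", mse_mean / mse_med)  # ~ 2/pi ~ 0.64
```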
Important adjunct concepts are the admissibility and domination of estimators.
The Wikipedia entry on efficiency (statistics) has many further links.
Best Answer
Comparing the asymptotic efficiency of the OLS and IV estimators makes sense only if the OLS estimator is also consistent. In that case OLS is efficient by virtue of the Gauss-Markov theorem, and IV is not. This also makes sense intuitively: the IV estimator uses only the correlation between the instrument and the endogenous variable (which is actually exogenous if OLS is consistent) to estimate its effect, whereas OLS uses the whole correlation, effectively instrumenting the independent variable with itself.
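A minimal sketch of this efficiency loss, assuming a one-regressor, one-instrument design of my own choosing (the answer does not give one): when $x$ is in fact exogenous, both estimators are consistent, but the variance of IV exceeds that of OLS by roughly the factor $1/\rho^2$, where $\rho$ is the instrument-regressor correlation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps, beta, rho = 200, 10_000, 1.0, 0.5   # rho: corr(x, z), illustrative

b_ols = np.empty(reps); b_iv = np.empty(reps)
for r in range(reps):
    z = rng.normal(size=n)
    x = rho * z + np.sqrt(1 - rho**2) * rng.normal(size=n)   # corr(x, z) = rho
    y = beta * x + rng.normal(size=n)                        # x exogenous here
    xc, yc, zc = x - x.mean(), y - y.mean(), z - z.mean()
    b_ols[r] = (xc @ yc) / (xc @ xc)         # OLS slope
    b_iv[r] = (zc @ yc) / (zc @ xc)          # simple IV (Wald) slope

print("Var(IV)/Var(OLS):", b_iv.var() / b_ols.var(), " vs 1/rho^2 =", 1 / rho**2)
```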
In finite samples, however, the mean squared error of the biased OLS estimator of an endogenous variable's coefficient may actually be smaller than that of a correctly specified IV estimator. This is because of the efficiency loss described above and because of the finite-sample bias of the IV estimator; typical causes are many instruments, weak instruments, or both. It is generally hard to assess how well the OLS estimator is doing in comparison. One common approach is to discard an IV estimator when the first-stage F-statistic is less than 10.
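A companion sketch under an assumed weak-instrument design (parameters mine, purely illustrative): with an endogenous regressor and a weak first stage, the biased OLS estimate can beat just-identified IV in MSE terms, and the first-stage F-statistic flags the weakness. Note that just-identified IV has heavy tails (its exact MSE does not even exist), so the simulated IV "MSE" is unstable but typically huge.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps, beta, pi = 100, 10_000, 1.0, 0.15   # pi: weak first-stage coefficient

b_ols = np.empty(reps); b_iv = np.empty(reps); F = np.empty(reps)
for r in range(reps):
    z = rng.normal(size=n)
    u = rng.normal(size=n)                    # common shock -> endogeneity
    x = pi * z + u + rng.normal(size=n)
    y = beta * x + u + rng.normal(size=n)     # u enters both x and y
    xc, yc, zc = x - x.mean(), y - y.mean(), z - z.mean()
    b_ols[r] = (xc @ yc) / (xc @ xc)
    b_iv[r] = (zc @ yc) / (zc @ xc)
    # first-stage F with a single instrument = squared t-statistic of pi-hat
    pi_hat = (zc @ xc) / (zc @ zc)
    res = xc - pi_hat * zc
    F[r] = pi_hat**2 * (zc @ zc) / (res @ res / (n - 2))

print("MSE OLS:", ((b_ols - beta) ** 2).mean())
print("MSE IV :", ((b_iv - beta) ** 2).mean())   # unstable and typically huge
print("median first-stage F:", np.median(F))     # well below 10 here
```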