First, for your question about the relationship between the variance-covariance matrix and standard errors: the variance-covariance matrix is a symmetric matrix whose off-diagonal elements contain the covariances between all the betas in your model. The main diagonal elements contain the variance of each beta. If you take the square root of the main diagonal entries, you get the standard errors of your betas.
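In code, this is just the square root of the diagonal. A minimal sketch in Python/NumPy, where `vcov` stands in for whatever variance-covariance matrix your estimation routine returns (the numbers here are made up):

```python
import numpy as np

# Made-up 2x2 variance-covariance matrix of two betas
vcov = np.array([[0.25, 0.10],   # Var(b0), Cov(b0, b1)
                 [0.10, 0.16]])  # Cov(b1, b0), Var(b1)

variances = np.diag(vcov)        # main diagonal: variances of the betas
std_errors = np.sqrt(variances)  # square roots: standard errors
print(std_errors)                # [0.5 0.4]
```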
Now to Hausman.
Since the random effects estimator is a matrix-weighted average of the within and between variation in your data, it is more efficient (i.e. has lower variance) than the fixed effects estimator, which exploits only the within variation. If you want to test the difference between the two models, you can write the test statistic as
$$H = (\beta_{FE}-\beta_{RE})'[Var(\beta_{FE})-Var(\beta_{RE})]^{-1}(\beta_{FE}-\beta_{RE})$$
Given that RE is more efficient, the difference of the variances is positive definite, or at least it should be. If you use different variance estimators in the two regressions, $H$ may well be negative. Often this is a sign of model misspecification, but this is a tricky discussion, as there are other circumstances in which the test statistic can be negative. For simplicity, let's not consider those for the moment.
If you now increase the sample size, you correctly said that your estimators become more efficient. Consequently the difference $Var(\beta_{FE})-Var(\beta_{RE})$ becomes smaller. This difference plays the role of the denominator in the test statistic (its inverse appears in $H$), so as the denominator becomes smaller, the statistic becomes bigger.
Maybe this is more intuitive if we consider the case where you are interested in a single variable (call it $k$) only. In that case the test statistic (taking the square root of the quadratic form above, so that it reads like a $t$-statistic) can be written as
$$H =\frac{(\beta_{FE,k}-\beta_{RE,k})}{\sqrt{[se(\beta_{FE,k})^{2}-se(\beta_{RE,k})^{2}]}}$$
To give a numerical example, let's start with the small sample. Say the difference in coefficients is 100, and the standard errors of the coefficient under FE and RE are 10 and 5, respectively:
$$H_{small} =\frac{(100)}{\sqrt{[10^{2}-5^{2}]}} = 11.547$$
Then you increase the sample size and suppose the standard errors are halved:
$$H_{large} =\frac{(100)}{\sqrt{[5^{2}-2.5^{2}]}} = 23.094$$
Now you see how the test statistic becomes larger in the larger sample (the denominator shrinks thanks to the smaller standard errors). The intuition for the test statistic in matrix notation is the same.
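To make the matrix version concrete, here is a small Python sketch of the quadratic form; the coefficient vectors and variance matrices are made up purely for illustration:

```python
import numpy as np
from scipy import stats

b_fe = np.array([1.20, -0.50])     # FE coefficients (made up)
b_re = np.array([1.05, -0.45])     # RE coefficients (made up)
V_fe = np.array([[0.040, 0.002],
                 [0.002, 0.030]])  # Var(beta_FE)
V_re = np.array([[0.020, 0.001],
                 [0.001, 0.015]])  # Var(beta_RE)

diff = b_fe - b_re
V_diff = V_fe - V_re                      # should be positive definite
H = diff @ np.linalg.inv(V_diff) @ diff   # the quadratic form above
p_value = stats.chi2.sf(H, df=len(diff))  # chi-squared with k d.o.f.
print(H, p_value)
```

If you halve all the standard errors (divide each variance by four), `diff` stays the same while `V_diff` shrinks, so `H` grows, exactly as in the scalar example.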
The choice between FE and RE models depends on the focus of the statistical inference. The FE model is an appropriate specification if we are focusing on a specific set of $N$ individuals (say, $N$ firms, $N$ OECD countries, or $N$ American states) and our inference is restricted to the behavior of this set of individuals. The RE model is an appropriate specification if we are drawing $N$ individuals randomly from a large population and are trying to make inferences about that population (see Baltagi, Econometric Analysis of Panel Data, 2008, §§2.2-3). The Hausman test can't say anything about your focus.
The Hausman test is asymptotically equivalent to a standard Wald test for the omission of $\tilde{\mathbf{X}}$, a matrix of deviations from individual means (see Baltagi, 2008, §4.3). In other words, given the model
$$y_{it}=\mathbf{x}_{it}'\boldsymbol{\beta}+\mu_i+u_{it}\tag{1}$$
one can split $\mathbf{x}_{it}$:
$$y_{it}=(\bar{\mathbf{x}}_i+\tilde{\mathbf{x}}_{it})'\boldsymbol{\beta}+\mu_i+u_{it}\tag{2}$$
where $\bar{\mathbf{x}}_i$ is the vector of individual time-invariant means for the $i$th individual and $\tilde{\mathbf{x}}_{it}=\mathbf{x}_{it}-\bar{\mathbf{x}}_i$.
Further, one can give separate parameters $\boldsymbol{\beta}_1$ to the individual means and $\boldsymbol{\beta}_2$ to the deviation variables:
$$y_{it}=\bar{\mathbf{x}}_i'\boldsymbol{\beta}_1+\tilde{\mathbf{x}}_{it}'\boldsymbol{\beta}_2+\mu_i+u_{it}\tag{3}$$
$\boldsymbol{\beta}_1$ is the between regression coefficient, while $\boldsymbol{\beta}_2$ is the within (FE) regression coefficient. The Hausman test is based on $\hat{\boldsymbol{\beta}}_{RE}-\hat{\boldsymbol{\beta}}_{FE}$, but can equivalently be based on $\hat{\boldsymbol{\beta}}_1-\hat{\boldsymbol{\beta}}_2$ (see Baltagi, 2008, §4.3).
As to correlation, some variables in $\mathbf{X}$ may be correlated with $\boldsymbol{\mu}$, but $\tilde{\mathbf{x}}_{it}$ is orthogonal to $\mathbf{1}\mu_i$ ($\mathbf{1}$ is a vector of ones) for all $i$. Thus:
- under an FE framework, the time-invariant terms $\bar{\mathbf{x}}_i'\boldsymbol{\beta}_1$ and $\mu_i$ are swept out and one gets unbiased and consistent estimates of $\boldsymbol{\beta}_2$;
- under an RE framework,
- if one estimates models $(1)$ or $(2)$ the implicit assumption $\hat{\boldsymbol{\beta}}_1=\hat{\boldsymbol{\beta}}_2$ doesn't hold and the Hausman test fails;
- if one estimates model $(3)$, then $\hat{\boldsymbol{\beta}}_2$ is unbiased and consistent (it is exactly identical to $\hat{\boldsymbol{\beta}}_{FE}$) and one can ignore or suppress $\bar{\mathbf{x}}_i'\boldsymbol{\beta}_1$, or use $\bar{\mathbf{x}}_i'\boldsymbol{\beta}_1+\mu_i$ to model the random intercept (see, e.g., Snijders and Bosker, Multilevel Analysis, 2012, chap. 4).
In brief, you can estimate a RE model that passes the Hausman test just by splitting your $\mathbf{X}$ matrix into its individual time-invariant means $\bar{\mathbf{X}}$ and the within-individual time-varying deviations $\tilde{\mathbf{X}}$ (see here for a simple example, additional details and references). I'd say that such a coherent approach is better than a questionable mixture of consistent and inconsistent estimates.
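For concreteness, here is a minimal sketch of this split in Python on a small synthetic panel. All variable names are illustrative, and `statsmodels`' `MixedLM` is used as one possible way to fit the random-intercept model $(3)$:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Tiny synthetic panel purely for illustration
rng = np.random.default_rng(0)
n, t = 50, 5
ids = np.repeat(np.arange(n), t)
mu = rng.normal(size=n)[ids]             # individual effects mu_i
x = rng.normal(size=n * t) + 0.5 * mu    # x correlated with mu
y = 2.0 * x + mu + rng.normal(size=n * t)
df = pd.DataFrame({"id": ids, "x": x, "y": y})

# Split x into individual means (bar x_i) and deviations (tilde x_it)
df["x_mean"] = df.groupby("id")["x"].transform("mean")
df["x_dev"] = df["x"] - df["x_mean"]

# RE model (3): random intercept plus separate betas for mean and deviation;
# the coefficient on x_dev reproduces the within (FE) estimate.
result = smf.mixedlm("y ~ x_mean + x_dev", df, groups="id").fit()
print(result.params)
```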
There are two parts to this question:
Should the asymptotic variance-covariance matrix be positive definite? The answer is yes, although in some situations this can probably be weakened to positive semi-definite. If this asymptotic VCE has negative eigenvalues, then the asymptotic distribution of the test statistic is partly supported on the negative half-line -- in other words, the test statistic is allowed to take negative values, so none of the $\chi^2$ results can work. With an asymptotic VCE whose eigenvalues are non-negative but some are zero, this problem does not bite you, but then you face another problem: figuring out the degrees of freedom (= the number of strictly positive eigenvalues). If the spectrum looks like $\{4, 1, 0.01, 10^{-5}\}$, would the last eigenvalue converge to a valid positive limit, is it computer round-off error away from zero, or is it a valid non-zero value in this finite sample that would converge to zero eventually?
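As a quick diagnostic, you can inspect the spectrum of the estimated difference matrix directly. A sketch with a made-up matrix echoing the spectrum above; in practice `V_diff` would be $Var(\beta_{FE})-Var(\beta_{RE})$ from your own estimates:

```python
import numpy as np

# Made-up symmetric difference matrix for illustration
V_diff = np.array([[4.0, 0.1, 0.00, 0.0],
                   [0.1, 1.0, 0.00, 0.0],
                   [0.0, 0.0, 0.01, 0.0],
                   [0.0, 0.0, 0.00, 1e-5]])

eigvals = np.linalg.eigvalsh(V_diff)  # eigvalsh: for symmetric matrices
print(np.sort(eigvals)[::-1])         # e.g. [4.00..., 0.99..., 0.01, 1e-05]
# Negative eigenvalues mean the matrix is not positive semi-definite, so the
# chi-squared theory does not apply; near-zero eigenvalues raise the
# degrees-of-freedom question discussed above.
```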
In finite samples, little is guaranteed. Sometimes you will have a positive definite matrix in the middle part of the Hausman test, so things will be fine. Sometimes you will get a non-positive-definite matrix when you subtract the two variance estimators; this can be a small-sample effect, or it can indicate that your model is not correctly specified, so that what you think is an asymptotically efficient estimator may not actually be one.
In linear regression settings, including some instrumental variable models, you can push the linear algebra of the relevant matrices far enough to establish that the resulting matrix is positive definite. The requirement is still there, but you avoid the guesswork of figuring out what is going on with that matrix.