Solved – Covariance estimate is zero for random compoment

covariance-matrixmixed modelrandom-effects-modelsas

My dataset consists of 4 variable: values, dil, exp and sample. There are 170 values obtained for variable values; there are 5 levels for the variable dil, 10 levels for the variable exp and 5 levels for the variable sample.

I fitted a mixed model in SAS to find the variance component estimates:

proc mixed;
by sample;
model values = dil;
random exp;
run;

So here dil is fixed and experiment is random.

When I fit the model, the covariance estimate for for exp is zero. It says that
"Convergence criteria met but final hessian is not positive definite".
Can anyone give some help on this? Thank you!

Also there are a few na values in the dataset.

Best Answer

By covariance estimate do you mean the error term for experiment is 0? I wouldn't call this a covariance estimate.

This is a common problem with mixed models. The most common explanation of a zero error estimate is that there are too many levels of the random random effect and not enough fixed observations within each random level to be able to calculate the error term.

I think you should look into fitting your data into one model instead of 'by sample'. I don't know enough about your design. Is 'sample' repeated measurements on the same observation? If not you can probably just remove the 'by sample' statement, else you need to modify the code to introduce correlation among the repeated effects.

If you post your data with more explanation we may be able to determine the correct model. However in some situations, when the experimental design was poor, you may not be able to get around this.

Related Solutions

Solved – Linear mixed model, negative information criteria values and Hessian matrix not positive definite

The negative sign matters, so -642 is smaller than -497; by that criteria the unstructured model is better. (Nice job paying attention to the sign, by the way; many people make the mistake of mentally taking the absolute value, which is not correct.)

The second question is more difficult to answer; in rough terms, this probably means that the optimization is at a saddle point instead of a true maximum, but exactly what to do about it and how much it matters is unclear. It's probably a good idea to try to simplify your model to reduce the computational complexity and see if you can get it to go away; if the results give you the same qualitative answer, that's probably a good thing. If not, it could mean that the warning really mattered, or, it could just mean you've simplified it too far. I'll be curious what others recommend.

PROC Mixed vs lme/lmer in R – Differences in Degrees of Freedom

For the first question, the default method in SAS to find the df is not very smart; it looks for terms in the random effect that syntactically include the fixed effect, and uses that. In this case, since trt is not found in ind, it's not doing the right thing. I've never tried BETWITHIN and don't know the details, but either the Satterthwaite option (satterth) or using ind*trt as the random effect give correct results.

PROC MIXED data=Data;
    CLASS ind fac trt;
    MODEL y = trt /s ddfm=satterth;
    RANDOM ind /s;
run;

PROC MIXED data=Data;
    CLASS ind fac trt;
    MODEL y = trt /s;
    RANDOM ind*trt /s;
run;

As for the second question, your SAS code doesn't quite match your R code; it only has a term for fac*ind, while the R code has a term for both ind and fac*ind. (See the Variance Components output to see this.) Adding this gives the same SE for trt in all models in both Q1 and Q2 (0.1892).

As you note, this is an odd model to fit as the fac*ind term has one observation for each level, so is equivalent to the error term. This is reflected in the SAS output, where the fac*ind term has zero variance. This is also what the error message from lme4 is telling you; the reason for the error is that you most likely misspecified something as you're including the error term in the model in two different ways. Interestingly, there is one slight difference in the nlme model; it's somehow finding a variance term for the fac*ind term in addition to the error term, but you will notice that the sum of these two variances equal the error term from both SAS and nlme without the fac*ind term. However, the SE for trt remains the same (0.1892) as trt is nested in ind, so these lower variance terms don't affect it.

Finally, a general note about degrees of freedom in these models: They are computed after the model is fit, and so differences in the degrees of freedom between different programs or options of a program do not necessarily mean that the model is being fit differently. For that, one must look at the estimates of the parameters, both fixed effect parameters and covariance parameters.

Also, using the t and F approximations with a given number of degrees of freedom is fairly controversial. Not only are there several ways to approximate the df, some believe the practice of doing so is not a good idea anyway. A couple words of advice:

If everything is balanced, compare the results with the traditional least squares method, as they should agree. If it's close to balanced, compute them yourself (assuming balance) so that you can make sure the ones you're using are in the right ballpark.
If you have a large sample size, the degrees of freedom don't matter very much as the distributions get close to a normal and chi-squared.
Check out Doug Bates's methods for inference. His older method is based on MCMC simulation; his newer method is based on profiling the likelihood.

Best Answer

Related Solutions

Solved – Linear mixed model, negative information criteria values and Hessian matrix not positive definite

PROC Mixed vs lme/lmer in R – Differences in Degrees of Freedom

Related Question