Solved – Why does SEM ('lavaan') return covariance estimates for all pairwise variable combinations that were not specified in the model?

covariance, lavaan, structural-equation-modeling

I specified the following model for an SEM analysis using the 'lavaan' package in R. I want to specify a covariance between two observed variables (livestock occupancy and human occupancy). This is the only residual correlation that I specify in my model, but the output reports covariances for what seems to be every (or almost every) pairwise combination of my observed variables. How can I get it to stop doing that and give me only the one covariance estimate that I want? All code and output are below. Do the warning messages have something to do with this issue? I'll admit I'm not sure what the second warning means, but regarding the first warning, I do think the model is properly identified.

> modelall <- '
# regressions 
lion_occ ~ mgmt01 + avgprecip + pctsavanna + riverkm_perkm2 + roadkm_perkm2 +  
distboundarykm + distedgekm + logprey + competitor_occ + human_occ + livest_occ

logprey ~ mgmt01 + riverkm_perkm2 + roadkm_perkm2 + distedgekm + human_occ + livest_occ + 
distboundarykm + avg_FRP + avgprecip + pctsavanna

competitor_occ ~ mgmt01 + avgprecip + pctsavanna + riverkm_perkm2 + roadkm_perkm2 +  
distboundarykm + distedgekm + logprey + human_occ + livest_occ
# residual correlations
livest_occ ~~ human_occ
'
> sem.all <- sem(modelall, data=gridcovar, se="bootstrap", bootstrap=1000)

Warning messages:
1: In lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :
  lavaan WARNING:
    The variance-covariance matrix of the estimated parameters (vcov)
    does not appear to be positive definite! The smallest eigenvalue
    (= -1.808622e-06) is smaller than zero. This may be a symptom that
    the model is not identified.
2: In lavaan::lavaan(model = modelall, data = gridcovar, se = "bootstrap",  :
  lavaan WARNING: not all elements of the gradient are (near) zero;
                  the optimizer may not have found a local solution;
                  use lavInspect(fit, "optim.gradient") to investigate
> summary(sem.all, standardized=TRUE, fit.measures=TRUE)
lavaan 0.6-3 ended normally after 229 iterations

  Optimization method                           NLMINB
  Number of free parameters                         73

  Number of observations                           204

  Estimator                                         ML
  Model Fit Test Statistic                      43.387
  Degrees of freedom                                18
  P-value (Chi-square)                           0.001

Model test baseline model:

  Minimum Function Test Statistic              556.063
  Degrees of freedom                                50
  P-value                                        0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.950
  Tucker-Lewis Index (TLI)                       0.861

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -3259.212
  Loglikelihood unrestricted model (H1)      -3237.518

  Number of free parameters                         73
  Akaike (AIC)                                6664.423
  Bayesian (BIC)                              6906.646
  Sample-size adjusted Bayesian (BIC)         6675.360

Root Mean Square Error of Approximation:

  RMSEA                                          0.083
  90 Percent Confidence Interval          0.052  0.115
  P-value RMSEA <= 0.05                          0.042

Standardized Root Mean Square Residual:

  SRMR                                           0.063

Parameter Estimates:

  Standard Errors                            Bootstrap
  Number of requested bootstrap draws             1000
  Number of successful bootstrap draws            1000

Regressions:
                   Estimate     Std.Err    z-value  P(>|z|)   Std.lv     Std.all
  lion_occ ~                                                                    
    mgmt01              -0.008      0.004   -2.143    0.032      -0.008   -0.172
    avgprecip           -0.000      0.000   -1.340    0.180      -0.000   -0.091
    pctsavanna           0.033      0.020    1.652    0.098       0.033    0.124
    riverkm_perkm2       0.000      0.000    1.412    0.158       0.000    0.113
    roadkm_perkm2        0.000      0.001    0.137    0.891       0.000    0.009
    distboundarykm       0.000      0.001    0.825    0.409       0.000    0.069
    distedgekm           0.000      0.000    1.321    0.187       0.000    0.081
    logprey              0.002      0.003    0.703    0.482       0.002    0.040
    competitor_occ       0.044      0.022    2.013    0.044       0.044    0.213
    human_occ            0.139      0.049    2.845    0.004       0.139    0.406
    livest_occ          -0.017      0.069   -0.240    0.811      -0.017   -0.039
  logprey ~                                                                     
    mgmt01               0.063      0.070    0.908    0.364       0.063    0.064
    riverkm_perkm2       0.000      0.000    2.049    0.040       0.000    0.139
    roadkm_perkm2        0.045      0.013    3.442    0.001       0.045    0.222
    distedgekm           0.011      0.004    2.568    0.010       0.011    0.178
    human_occ           -0.445      0.740   -0.601    0.548      -0.445   -0.065
    livest_occ           0.052      0.947    0.055    0.956       0.052    0.006
    distboundarykm       0.001      0.010    0.124    0.901       0.001    0.009
    avg_FRP             -0.011      0.006   -1.683    0.092      -0.011   -0.115
    avgprecip            0.003      0.008    0.355    0.722       0.003    0.028
    pctsavanna          -1.600      0.430   -3.723    0.000      -1.600   -0.295
  competitor_occ ~                                                              
    mgmt01              -0.159      0.013  -11.833    0.000      -0.159   -0.671
    avgprecip            0.004      0.001    3.637    0.000       0.004    0.175
    pctsavanna           0.109      0.074    1.476    0.140       0.109    0.084
    riverkm_perkm2      -0.000      0.000   -0.128    0.898      -0.000   -0.007
    roadkm_perkm2        0.003      0.003    1.018    0.309       0.003    0.054
    distboundarykm       0.000      0.001    0.256    0.798       0.000    0.008
    distedgekm          -0.000      0.000   -0.156    0.876      -0.000   -0.005
    logprey              0.017      0.011    1.636    0.102       0.017    0.072
    human_occ           -0.410      0.178   -2.302    0.021      -0.410   -0.250
    livest_occ           0.743      0.297    2.499    0.012       0.743    0.360

Covariances:
                    Estimate     Std.Err    z-value  P(>|z|)   Std.lv     Std.all
  human_occ ~~                                                                   
    livest_occ            0.003      0.000    6.361    0.000       0.003    0.772
  mgmt01 ~~                                                                      
    avgprecip            -1.200      0.148   -8.097    0.000      -1.200   -0.483
    pctsavanna            0.007      0.003    2.163    0.031       0.007    0.149
    riverkm_perkm2      -83.785     44.061   -1.902    0.057     -83.785   -0.140
    roadkm_perkm2        -0.223      0.083   -2.676    0.007      -0.223   -0.190
    distboundarykm        0.291      0.113    2.566    0.010       0.291    0.166
    distedgekm           -0.034      0.277   -0.123    0.902      -0.034   -0.009
    avg_FRP               0.303      0.178    1.695    0.090       0.303    0.117
  avgprecip ~~                                                                   
    pctsavanna           -0.192      0.032   -6.005    0.000      -0.192   -0.424
    riverkm_perkm2      867.678    471.086    1.842    0.065     867.678    0.142
    roadkm_perkm2         2.594      0.742    3.493    0.000       2.594    0.217
    distboundarykm       -1.745      1.077   -1.620    0.105      -1.745   -0.098
    distedgekm           -0.278      2.782   -0.100    0.920      -0.278   -0.007
    avg_FRP               0.576      1.480    0.389    0.697       0.576    0.022
  pctsavanna ~~                                                                  
    riverkm_perkm2      -31.479      7.781   -4.045    0.000     -31.479   -0.288
    roadkm_perkm2        -0.063      0.016   -3.926    0.000      -0.063   -0.293
    distboundarykm        0.110      0.021    5.367    0.000       0.110    0.345
    distedgekm            0.054      0.045    1.186    0.235       0.054    0.075
    avg_FRP              -0.067      0.030   -2.245    0.025      -0.067   -0.142
  riverkm_perkm2 ~~                                                              
    roadkm_perkm2       784.908    206.882    3.794    0.000     784.908    0.271
    distboundarykm     -638.597    266.009   -2.401    0.016    -638.597   -0.148
    distedgekm         2651.124    622.777    4.257    0.000    2651.124    0.276
    avg_FRP             213.120    397.372    0.536    0.592     213.120    0.034
  roadkm_perkm2 ~~                                                               
    distboundarykm       -2.038      0.493   -4.131    0.000      -2.038   -0.241
    distedgekm            0.330      1.317    0.250    0.802       0.330    0.017
    avg_FRP               0.361      0.755    0.479    0.632       0.361    0.029
  distboundarykm ~~                                                              
    distedgekm            3.655      1.744    2.096    0.036       3.655    0.130
    avg_FRP              -1.400      1.427   -0.981    0.327      -1.400   -0.075
  distedgekm ~~                                                                  
    avg_FRP              -0.118      3.140   -0.038    0.970      -0.118   -0.003

Variances:
                   Estimate     Std.Err    z-value  P(>|z|)   Std.lv     Std.all
   .lion_occ             0.000      0.000    5.528    0.000       0.000    0.707
   .logprey              0.173      0.020    8.741    0.000       0.173    0.726
   .competitor_occ       0.005      0.001    5.608    0.000       0.005    0.346
    human_occ            0.005      0.001    6.165    0.000       0.005    1.000
    livest_occ           0.003      0.001    5.619    0.000       0.003    1.000
    mgmt01               0.244      0.006   42.523    0.000       0.244    1.000
    avgprecip           25.290      2.607    9.701    0.000      25.290    1.000
    pctsavanna           0.008      0.001   10.647    0.000       0.008    1.000
    riverkm_perkm2 1474505.124 111777.802   13.191    0.000 1474505.124    1.000
    roadkm_perkm2        5.672      0.580    9.783    0.000       5.672    1.000
    distboundarykm      12.603      1.374    9.173    0.000      12.603    1.000
    distedgekm          62.778      5.988   10.485    0.000      62.778    1.000
    avg_FRP             27.339      4.955    5.517    0.000      27.339    1.000

Thank you in advance for any help you can give!

Best Answer

Welcome Kirby!

By default, lavaan's sem() estimates the covariances among exogenous (predictor) variables, both observed ones and latent ones (which you don't appear to have specified), unless you tell it otherwise. lavaan provides a shorthand for overriding this default for latent variables (orthogonal = TRUE in cfa() or sem()), but that won't help you here because all of your covariances are among observed variables. You'll need to manually fix each of them to zero, which tells lavaan that you are not interested in estimating them and are comfortable assuming they are 0.

The tutorial materials on the lavaan website give a good overview of how to fix parameters in this fashion. As an example, fixing to 0 all the covariances that involve the mgmt01 variable would look like this:

mgmt01 ~~ 0*avgprecip + 0*pctsavanna + 0*riverkm_perkm2 + 0*roadkm_perkm2 + 0*distboundarykm + 0*distedgekm + 0*avg_FRP
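
Writing one such line by hand for every variable gets tedious with eight covariates. As a minimal sketch using only base R, the constraint lines can be generated from a vector of variable names. The names below are copied from the model in the question; human_occ and livest_occ are left out on the assumption that their covariance should stay free. The object names exo, zero_cov_lines, and zero_cov are made up for this illustration.

```r
# Exogenous observed variables from the model in the question,
# excluding human_occ and livest_occ (their covariance stays free).
exo <- c("mgmt01", "avgprecip", "pctsavanna", "riverkm_perkm2",
         "roadkm_perkm2", "distboundarykm", "distedgekm", "avg_FRP")

# For each variable except the last, fix its covariance with every
# later variable to zero: "x ~~ 0*y1 + 0*y2 + ...".
zero_cov_lines <- vapply(
  seq_len(length(exo) - 1),
  function(i) {
    rhs <- exo[(i + 1):length(exo)]
    paste0(exo[i], " ~~ ", paste0("0*", rhs, collapse = " + "))
  },
  character(1)
)

# One string with a constraint line per variable, ready to append
# to a lavaan model string.
zero_cov <- paste(zero_cov_lines, collapse = "\n")
cat(zero_cov, "\n")
```

The first generated line is the same mgmt01 line shown above. The result could then be appended to the model string, for example with paste(modelall, zero_cov, sep = "\n"), before refitting.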

The tl;dr here is that with lavaan, it's often valuable (though potentially annoying) to specify everything that you do and do not want estimated, so you can be sure you're getting exactly the model you want.
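
One way to confirm which covariances lavaan actually treats as free or fixed is to look at the parameter table of the fitted object. The sketch below is only an illustration: the question's gridcovar data are not available, so it uses lavaan's built-in PoliticalDemocracy data and a small stand-in model, and it sets fixed.x = FALSE explicitly (a choice made for this sketch) so that the covariates' variances and covariances are model parameters.

```r
library(lavaan)

# Stand-in path model (not the model from the question).
model <- '
  # regression
  y1 ~ x1 + x2 + x3
  # covariance left free
  x1 ~~ x2
  # covariances fixed to zero
  x1 ~~ 0*x3
  x2 ~~ 0*x3
'
fit <- sem(model, data = PoliticalDemocracy, fixed.x = FALSE)

# Parameter table: 'user' flags rows written in the model syntax,
# 'free' is 0 for fixed parameters and > 0 for free ones.
pt <- parTable(fit)
covs <- subset(pt, op == "~~" & lhs != rhs,
               select = c(lhs, op, rhs, user, free, est))
print(covs)
```

Rows with free equal to 0 are the covariances that were fixed; any covariance row with a nonzero free value is still being estimated.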

Regarding the warning messages, I find it's sometimes helpful to sketch out the path diagram to make sure identification isn't a problem and that you haven't coded a linear dependency somewhere. In this case, I share your intuition that identification isn't the problem. A more plausible candidate, in my opinion, is that you're asking for an awful lot of estimates and inferences (73 free parameters) from a relatively modest sample (204 observations), and estimation problems under these conditions aren't uncommon. The warnings might clear up after you constrain the unwanted correlations to zero; if they don't, you may simply need more data.
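
The second warning itself points to lavInspect(fit, "optim.gradient"). A minimal sketch of that check follows, again on lavaan's built-in PoliticalDemocracy data with a stand-in model, since gridcovar is not available; with the real fit, sem.all would take the place of fit.

```r
library(lavaan)

# Stand-in model (not the model from the question).
fit <- sem('y1 ~ x1 + x2', data = PoliticalDemocracy)

# Did the optimizer report convergence?
converged <- lavInspect(fit, "converged")
print(converged)

# Gradient of the objective at the solution, one element per free
# parameter; elements far from zero point to the parameters the
# optimizer had trouble with.
grad <- lavInspect(fit, "optim.gradient")
print(grad)
print(max(abs(grad)))
```

If a few elements stand out as much larger than the rest, the parameters they belong to are a reasonable place to start looking.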
