Solved – SEM model in lavaan: Can’t compute standard errors

I was playing around in lavaan to bind together two simple models I previously tested on their own via simple regression analyses. I followed the tutorial provided on the following site: http://lavaan.ugent.be/tutorial/sem.html.

Basically, I have two latent variables ($lv1$ and $lv2$): one has three manifest indicators ($x1$, $x2$, $x3$), the other has four ($x1$, $x2$, $x3$, $x4$). Both latent variables predict another variable, $y$. Visually speaking:

The data contain only positive values. I modeled this as follows in R and lavaan:

semModel <- '
  # measurement models
    lv1 =~ x1 + x2 + x3
    lv2 =~ x1 + x2 + x3 + x4
  # regressions
    y ~ lv1
    y ~ lv2
  # residual correlations
    x1 ~~ x2
    x1 ~~ x3
    x1 ~~ x4
    x2 ~~ x3
    x2 ~~ x4
    x3 ~~ x4
'

Then I ran the following:

fit <- sem(semModel, data = experimentalData)
summary(fit)

This returned the following warnings:

1: In lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats,  :
  lavaan WARNING:
    Could not compute standard errors! The information matrix could
    not be inverted. This may be a symptom that the model is not
    identified.
2: In lav_object_post_check(object) :
  lavaan WARNING: some estimated lv variances are negative

I then added the option std.ov = TRUE to standardise the observed variables, but this still yields the warning about the standard errors:

fit <- sem(semModel, data = experimentalData, std.ov = TRUE)

In lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats,  :
  lavaan WARNING:
    Could not compute standard errors! The information matrix could
    not be inverted. This may be a symptom that the model is not
    identified.

In the second case, the output is as follows:

lavaan 0.6-5 ended normally after 32 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of free parameters                         21

  Number of observations                           583

Model Test User Model:

  Test statistic                                    NA
  Degrees of freedom                                -6
  P-value (Unknown)                                 NA

Parameter Estimates:

  Information                                 Expected
  Information saturated (h1) model          Structured
  Standard errors                             Standard

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  lv1 =~                                        
    x1                1.000                           
    x2                1.155       NA                  
    x3                1.691       NA                  
  lv2 =~                                       
    x1                1.000                           
    x2                2.006       NA                  
    x3                2.224       NA                  
    x4                1.422       NA                  

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  y ~                                        
    lv1               0.020       NA                  
    lv2               0.889       NA                  

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
 .x1   ~~                                            
   .x2                0.033       NA                  
   .x3                0.134       NA                  
   .x4               -0.003       NA                  
 .x2   ~~                                            
   .x3               -0.104       NA                  
   .x4               -0.202       NA                  
 .x3   ~~                                            
   .x4                0.013       NA                  
  lv1  ~~                                        
    lv2              -0.159       NA                  

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .x1                0.918       NA                  
   .x2                0.415       NA                  
   .x3                0.444       NA                  
   .x4                0.405       NA                  
   .y                 0.772       NA                  
    lv1               0.105       NA                  
    lv2               0.294       NA

Where did I miss something? Are there (logical) errors in the definition of my model?

Best Answer

The model is not identified, which means there is no unique solution to the estimation problem. Identification is a challenging topic, and one that is often overlooked. First, your graphical model is incorrect: it has the manifest variables pointing to the latent variables, whereas in your lavaan model the manifest variables measure the latent variables, so the arrows should run from each latent variable to its indicators. Second, the lack of identification is caused by the residual covariances among all the indicators, combined with the fact that almost all the indicators load on both latent variables. With so few indicators, and most of them shared between the latent variables, you cannot include any residual covariances. In general, each latent variable needs at least two indicators of its own (i.e., indicators that load on no other latent variable), and residual covariances generally cannot be included unless a latent variable has more than four indicators.
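A quick way to see the problem is to count the information the data supply and compare it with what the model asks for. The sketch below uses only numbers from the output in the question; the breakdown of the 21 free parameters is a tally made from the printed estimates, not something lavaan reports in this form.

```r
# Observed variables in the model: x1, x2, x3, x4 and y.
n_observed <- 5

# Unique variances and covariances among them: p * (p + 1) / 2.
n_moments <- n_observed * (n_observed + 1) / 2

# Free parameters, tallied from the printed estimates.
n_free <- c(
  loadings             = 5,  # 2 for lv1 + 3 for lv2 (first loading of each fixed to 1)
  regressions          = 2,  # y ~ lv1 and y ~ lv2
  residual_covariances = 6,  # every pair among x1..x4
  latent_covariance    = 1,  # lv1 ~~ lv2, added by default
  variances            = 7   # residuals of x1..x4 and y, plus lv1 and lv2
)

# Degrees of freedom = available moments - free parameters.
df_model <- n_moments - sum(n_free)
df_model  # -6, matching "Degrees of freedom" in the output
```

Negative degrees of freedom mean the model requests more parameters than the covariance matrix can inform. Note that non-negative degrees of freedom are a necessary condition for identification, not a sufficient one.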

I highly recommend you look at the identification chapter in an SEM textbook, like Bollen (1989). There are specific rules to identification and ways to assess whether your model is identified.
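As a rough illustration of a specification that passes the counting rule, the sketch below gives each latent variable its own indicators and drops the residual covariances. The indicator names (a1 to a3, b1 to b4), the population values, and the simulated data are all hypothetical stand-ins; they are not the variables or data from the question, and whether such a respecification makes substantive sense depends on the actual measures.

```r
library(lavaan)

# Hypothetical population model, used only to simulate example data.
pop_model <- '
  lv1 =~ 1.0*a1 + 0.9*a2 + 0.8*a3
  lv2 =~ 1.0*b1 + 0.9*b2 + 0.8*b3 + 0.9*b4
  lv1 ~~ 0.3*lv2
  y ~ 0.4*lv1 + 0.5*lv2
'
set.seed(1)
sim_data <- simulateData(pop_model, sample.nobs = 583)

# Each latent variable has its own indicators; no residual covariances.
sem_model <- '
  # measurement models
  lv1 =~ a1 + a2 + a3
  lv2 =~ b1 + b2 + b3 + b4
  # regressions
  y ~ lv1 + lv2
'

fit <- sem(sem_model, data = sim_data)

# Check convergence and the degrees of freedom before reading estimates.
lavInspect(fit, "converged")
fitMeasures(fit, "df")
```

With eight observed variables there are 36 unique variances and covariances, and this specification estimates far fewer parameters than that, so the degrees of freedom are positive and standard errors can be computed.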

Related Solutions

Solved – lavaan WARNING: could not compute standard errors!

If I understand your model correctly, you are trying to fit a bifactor model: There is a "general" factor G that is assumed to underlie all the manifest variables, and there are "specific" factors, F1 and F2, each one explaining variance in a subdomain of items. One important aspect of a bifactor model is that all factors (general and specific) must be orthogonal to each other.

"A bifactor structural model specifies that the covariance among a set of item responses can be accounted for by a single general factor that reflects the common variance running among all scale items and group factors that reflect additional common variance among clusters of items, typically, with highly similar content. It is assumed that the general and group factors all are orthogonal." (Reise, 2012, p. 668)

I am not familiar with your data and your model, so I cannot evaluate what is going on. However, there is one thing that can be said for sure based on the information you provided:

Your code allows the factors to be correlated with each other; this leads your model to be non-identified. The cfa function in lavaan permits latent factors to be correlated by default: If you enter ?cfa in your R console, you will find that cfa sets orthogonal = FALSE.

Assuming the bifactor model is appropriate for your data, the easiest way to solve your issue is to add orthogonal = TRUE to your code:

fit <- cfa(adhd.model, data=dataset, std.lv=TRUE, orthogonal=TRUE)

[There are other ways to force factors to be orthogonal via lavaan syntax.]
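One such alternative is to fix the factor covariances to zero directly in the model syntax. In this sketch the factor names G, F1 and F2 follow the description above, while the item names i1 to i6 are hypothetical placeholders, since the original adhd.model is not shown here.

```r
library(lavaan)

bifactor_model <- '
  # general factor loads on every item
  G  =~ i1 + i2 + i3 + i4 + i5 + i6
  # specific factors load on their own clusters of items
  F1 =~ i1 + i2 + i3
  F2 =~ i4 + i5 + i6
  # fix all factor covariances to zero (orthogonal factors)
  G  ~~ 0*F1
  G  ~~ 0*F2
  F1 ~~ 0*F2
'

# Inspect the parameter table without fitting anything: the factor
# covariances should be listed as fixed (free == 0) at the value 0.
pt <- lavaanify(bifactor_model, auto = TRUE)
subset(pt, op == "~~" & lhs != rhs,
       select = c(lhs, op, rhs, free, ustart))
```

A model string written this way would be passed to cfa() without orthogonal = TRUE, which is handy when only some of the factor covariances should be constrained.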

Reise, Steven P. (2012). "The rediscovery of bifactor measurement models." Multivariate Behavioral Research 47 (5): 667-696.

Solved – Why does the SEM (‘lavaan’) return covariance estimates for all pairwise variable combinations that were not specified in the model

Welcome Kirby!

lavaan usually defaults to estimating correlations between observed variables (and when you specify them--it doesn't appear you have--latent variables) unless you tell it to otherwise. lavaan provides a shorthand option for overriding this default when dealing with latent variables (using orthogonal = TRUE in cfa or sem), but this won't help you here because all of your correlations are among observed variables--you'll need to manually fix each of these to a value of zero (i.e., thereby indicating you are not interested in estimating them/are comfortable assuming they take on a value of 0).

The tutorial materials on the lavaan website give a good overview of how to fix parameters in this fashion, but as an example, fixing all the correlations to 0 involving the mgmt01 variable would look like this:

mgmt01 ~~ 0*avgprecip + 0*pctsavanna + 0*riverkm_perkm2 + 0*roadkm_perkm2 + 0*distboundarykm + 0*distedgekm + 0*avg_FRP

The tl;dr here is that with lavaan, it's often valuable (though potentially annoying) to specify everything that you do/do not want estimated, in order to be sure you're getting exactly the model you want.

Regarding the warning messages you're getting, I find it's sometimes helpful to sketch out your path diagram to make sure identification isn't a problem and that you haven't coded a linear dependency somewhere--in this case, I share your intuition that identification isn't the problem. A more plausible candidate, in my opinion, is that you're asking for an awful lot of estimates/inferences from a relatively modest sample of data, and estimation errors under these kinds of conditions aren't uncommon. This might clear up after you constrain all those correlations you don't want to zero, but otherwise it might be a case where you need more data.
