Solved – lavaan works with `cfa`, provides error with lavaan(method = ‘cfa’)

lavaanrstructural-equation-modeling

Estimating a simple Confirmatory Factor Analysis (CFA) model with lavaan gives inconsistent results when I use cfa and lavaan(method = "cfa").

library(lavaan)

10 observations, four indicators

foo <-  structure(list(ind1 = c(0.2, -1, -0.2, 0.6, 0.6, -0.2, -0.2, 
                            -1, 0.6, 0.2), 
                   ind2 = c(0.6, -0.2, -0.2, 0.6, 0.6, -0.2, -1,
                            -1, 0.6, 0.6),
                   ind3 = c(0.6, -0.2, -0.2, 0.6, 0.2, 0.2, -0.6, 
                            0.6, 0.6, 0.6), 
                   ind4 = c(1, -0.5, -1, 0.5, 0.5, -0.5, 0, 0, -0.5, 
                            0.5)), 
              codepage = 65001L, .Names = c("ind1", "ind2", "ind3", "ind4"),
              row.names = 1:10,
              class = "data.frame")

This model

cfa(model = 'latent =~ ind1 + ind2 + ind3 + ind4', data = foo)

Works fine. But this, which I thought was equivalent

lavaan(model = 'latent =~ ind1 + ind2 + ind3 + ind4',
   data = foo, 
   model.type = "cfa")

causes an error:

Error in lav_model_estimate(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  : 
  lavaan ERROR: initial model-implied matrix (Sigma) is not positive definite;
  check your model and/or starting parameters.

What am I missing?

Best Answer

The cfa() function is a wrapper for lavaan, which (among other things) adds the arguments

auto.fix.first = TRUE
auto.var = TRUE

Changing your model type to CFA changes a few cosmetic things, but not the basic model. Because the lavaan() command has not put these in, you need to add the arguments, or add the appropriate parameters to the model.

lavaan(model = 'latent =~ 1*ind1 + ind2 + ind3 + ind4
                latent ~~ latent
                ind1 ~~ ind1
                ind2 ~~ ind2
                ind3 ~~ ind3
                ind4 ~~ ind4',
       data = foo, 
       model.type = "cfa")

That seems like more work, but I prefer to (almost) always use the lavaan() function - the trouble with using cfa() is that it does some of the work for you, but you can forget what it has done, and what you want to do. You can therefore screw up by, say, fixing the variance of the latent to 1 AND fixing the first loading to 1 (because the second thing is done automatically, you might forget). But cfa() is much easier for simple models that you're not going to futz around with much.

Related Solutions

R – How to Use lavaan (SEM/CFA) for Factor Analysis Like factanal (EFA)

It is possible to do EFA in a CFA framework. This is sometimes called "E/CFA". A nice discussion of this can be found in:

Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: Guilford Press.

For this to work, you need to have an "anchor item" for each factor, for which there are no cross-loadings. Looking at the results from factanal, it would make sense to make item x5 the anchor item for the first factor and x3 the anchor item for the second factor. You also need to constrain the variances of the latent factors to 1. And since factanal standardizes the variables, you'll want to do the same here. At the same time, to make things more comparable, I would use an oblique rotation method for EFA (since the E/CFA model will allow the factors to be correlated). So, putting this all together, you can compare:

model <- 'f1 =~ x1 + x2 + 0*x3 + x4 + x5 + x6
          f2 =~ x1 + x2 + x3 + x4 + 0*x5 + x6'

fit  <- cfa(model, data=HolzingerSwineford1939, std.lv=T, std.ov=T)
summary(fit)

factanal(~x1+x2+x3+x4+x5+x6, factors=2, data=HolzingerSwineford1939, rotation="promax")

Here are the results:

lavaan (0.5-18) converged normally after  26 iterations

  Number of observations                           301

  Estimator                                         ML
  Minimum Function Test Statistic                2.109
  Degrees of freedom                                 4
  P-value (Chi-square)                           0.716

Parameter estimates:

  Information                                 Expected
  Standard Errors                             Standard

                   Estimate  Std.err  Z-value  P(>|z|)
Latent variables:
  f1 =~
    x1                0.275    0.061    4.499    0.000
    x2                0.092    0.063    1.469    0.142
    x3                0.000
    x4                0.822    0.050   16.550    0.000
    x5                0.875    0.049   17.964    0.000
    x6                0.798    0.050   16.013    0.000
  f2 =~
    x1                0.559    0.074    7.520    0.000
    x2                0.441    0.071    6.183    0.000
    x3                0.746    0.086    8.644    0.000
    x4                0.120    0.052    2.321    0.020
    x5                0.000
    x6                0.161    0.052    3.085    0.002

Covariances:
  f1 ~~
    f2                0.119    0.088    1.348    0.178

Variances:
    x1                0.572    0.072
    x2                0.784    0.074
    x3                0.440    0.112
    x4                0.283    0.035
    x5                0.231    0.037
    x6                0.303    0.035
    f1                1.000
    f2                1.000

> 
> factanal(~x1+x2+x3+x4+x5 + x6, factors = 2, data = HolzingerSwineford1939, rotation="promax")

Call:
factanal(x = ~x1 + x2 + x3 + x4 + x5 + x6, factors = 2, data = HolzingerSwineford1939,     rotation = "promax")

Uniquenesses:
   x1    x2    x3    x4    x5    x6 
0.574 0.787 0.441 0.284 0.232 0.304 

Loadings:
   Factor1 Factor2
x1  0.180   0.557 
x2          0.457 
x3 -0.147   0.797 
x4  0.842         
x5  0.922  -0.126 
x6  0.809         

               Factor1 Factor2
SS loadings      2.268   1.173
Proportion Var   0.378   0.196
Cumulative Var   0.378   0.574

Factor Correlations:
        Factor1 Factor2
Factor1   1.000   0.417
Factor2   0.417   1.000

Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 2.07 on 4 degrees of freedom.
The p-value is 0.722

According to Brown (2006), the chi-square statistic should be the same for both approaches. For the example shown in the book, this is indeed the case (but the author uses Mplus for the EFA and E/CFA analyses). In this particular example, the values are close (2.109 and 2.07), but not identical. There seems to be some minor difference in how lavaan and factanal are working here, but ultimately the point is that one can indeed do an exploratory factor analysis using CFA software.

Solved – bi-factor cfa, multiple method factors, DWLS vs MLS in lavaan

When you have ordinal data, and a relatively complex model the traditional mls often will not converge. If that is what is happening use dwls.

Best Answer

Related Solutions

R – How to Use lavaan (SEM/CFA) for Factor Analysis Like factanal (EFA)

Solved – bi-factor cfa, multiple method factors, DWLS vs MLS in lavaan

Related Question