categorical-data – How to Build Latent Variable Using Categorical Variables in Lavaan SEM

categorical-data, lavaan, structural-equation-modeling

In my study, I want to investigate the influence of job skills on psychology. Job skills, an exogenous latent variable, is measured by three categorical variables (V1, V2 and V3 here). Psychology, an endogenous latent variable, is measured by three categorical variables (V4, V5 and V6 here).

V1-V3 are nominal variables (coded 1-4). According to the tutorial, I should split each of them into three dummy variables (V1.1, V1.2, V1.3, V2.1, V2.2, V2.3, V3.1, V3.2, V3.3). V4-V6 are binary variables (0 or 1).
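As an illustration of the dummy coding described above, here is a minimal base R sketch. The data frame `dat` and its simulated values are hypothetical; only the naming scheme V1.1 to V1.3 comes from the question. Treating category 1 as the reference category is also an assumption.

```r
# Hypothetical data: one nominal variable with categories 1-4
set.seed(1)
dat <- data.frame(V1 = sample(1:4, 10, replace = TRUE))

# Declare all four levels so the coding does not depend on which values occur
dat$V1 <- factor(dat$V1, levels = 1:4)

# Build the design matrix and drop the intercept column:
# three 0/1 dummies remain, with category 1 as the reference (assumption)
dummies <- model.matrix(~ V1, data = dat)[, -1, drop = FALSE]

# Rename to the scheme used in the question; here V1.1 flags category 2,
# V1.2 flags category 3, and V1.3 flags category 4
colnames(dummies) <- c("V1.1", "V1.2", "V1.3")

dat <- cbind(dat, dummies)
print(dat)
```

A four-category variable needs only three dummies because the fourth category is implied when all three are zero.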

I don't know how to specify the model, especially how to associate the dummy variables with the latent variables. This is what I have tried. Is it right?

model <- '
  skills =~ V1 + V2 + V3
  Psy    =~ V4 + V5 + V6

  V1 ~ V1.1 + V1.2 + V1.3
  V2 ~ V2.1 + V2.2 + V2.3
  V3 ~ V3.1 + V3.2 + V3.3

  Psy ~ skills
'
fit <- sem(model, data = data, ordered = TRUE)

Best Answer

As @Terrence says, "factor analysis in general and lavaan specifically do not have measurement models for nominal indicators." Nominal factor analysis models can be estimated in Mplus (e.g., see Revuelta, Maydeu-Olivares, & Ximénez 2020)$^1$.

@Terrence also correctly points out that measurement models with nominal indicators are typically estimated using item response theory (IRT) models. Below is some code to help you get started using the mirt package$^2$.

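library(mirt)
# Science is an example data set bundled with mirt; 1 requests a single latent dimension
fit <- mirt(Science, 1, itemtype = 'nominal')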
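Once the model is fitted, the item parameters and person scores can be inspected. The following is a minimal sketch that reuses mirt's bundled `Science` data from the snippet above, not the asker's V1-V3. The object names `item_pars` and `theta` are illustrative, and the factor scores use the default scoring method of `fscores()`.

```r
library(mirt)

# Fit a unidimensional nominal response model to the bundled example data
fit <- mirt(Science, 1, itemtype = "nominal", verbose = FALSE)

# Item parameters: one slope (a1), category scoring coefficients (ak*),
# and category intercepts (d*) per item
item_pars <- coef(fit, simplify = TRUE)$items
print(item_pars)

# Latent trait estimates, one row per respondent
theta <- fscores(fit)
head(theta)
```

With your own data, the first argument would be a data frame or matrix that holds only the nominal items.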

$^1$Note, however, that there are important differences between nominal factor analysis and other factor analysis models for categorical data. For example, nominal factor analysis models in Mplus can only be estimated using maximum likelihood (ML).

$^2$Note that all competent IRT software packages (e.g., flexMIRT, IRTPRO, and mirt) can estimate the (IRT) nominal response model (e.g., see Thissen, Cai, & Bock 2011).

References

Revuelta, J., Maydeu-Olivares, A., & Ximénez, C. (2020). Factor analysis for nominal (first choice) data. Structural Equation Modeling: A Multidisciplinary Journal, 27(5), 781-797.

Thissen, D., Cai, L., & Bock, R. D. (2011). The nominal categories item response model. In Handbook of polytomous item response theory models (pp. 53-86). Routledge.
