What are the primary differences between Taxometric analyses (e.g., MAXCOV, MAXEIG) and Latent Class analyses?

Tags: clustering, latent-class, latent-variable, model-based-clustering, psychology

Recent research has attempted to determine if certain psychological constructs are latently dimensional or taxonic (i.e., including taxons or classes). For example, researchers may be interested in finding out if there is a certain "class" of people who are more likely to develop chronic pain after an injury, or if the risk of developing chronic pain is better conceptualized as dimensional ranging from limited risk to extremely high risk. I've noticed that researchers attempt to answer these types of questions using two types of analyses: Taxometric analyses (MAMBAC, MAXEIG, MAXCOV) typically conducted in R, and Latent Class analyses.

Here are some examples of Taxometric studies:

Here are some examples using Latent Class analyses:

Here are my questions:

  1. In English, what are the primary differences between these two types of analyses? If possible, elaborate if they answer different questions and how they are analytically (mathematically) different.

  2. Which one is better for answering the type of question I highlighted in my "introduction", and why? Perhaps this is really unanswerable at this point.

Also please share any information you feel may be relevant to this topic. I have a feeling I will have follow-up questions!

Best Answer

See Tueller (2010), Tueller and Lubke (2010), and [Ruscio et al.'s book][3] for complete detail on what is summarized below. Taxometric procedures generally work by computing simple statistics on subsets of sorted data: MAMBAC uses the mean, MAXCOV uses the covariance, and MAXEIG uses the eigenvalue. Latent class analysis is a special case of the general latent variable mixture model (LVMM). The LVMM specifies a model for the data that may include latent classes, latent factors, or both. The parameters of the model are estimated by maximum likelihood or Bayesian methods.
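To make "simple statistics on subsets of sorted data" concrete, here is a minimal Python sketch of the core idea behind MAXCOV. It is not the implementation from Ruscio's R code: the function names, the non-overlapping windows, and the simulated two-class data are all assumptions made for illustration. Cases are sorted on one indicator, cut into windows, and the covariance of the other two indicators is computed within each window. With taxonic data the covariance tends to peak in the windows where the two classes are most mixed.

```python
import random

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def maxcov_curve(sort_ind, out_a, out_b, n_windows=10):
    """Sort cases on one indicator, cut them into consecutive
    non-overlapping windows, and return the covariance of the
    other two indicators within each window."""
    order = sorted(range(len(sort_ind)), key=lambda i: sort_ind[i])
    size = len(order) // n_windows
    curve = []
    for w in range(n_windows):
        idx = order[w * size:(w + 1) * size]
        curve.append(covariance([out_a[i] for i in idx],
                                [out_b[i] for i in idx]))
    return curve

def simulate_taxonic(n=3000, base_rate=0.5, separation=2.0, seed=1):
    """Hypothetical data: two classes that differ only in their means;
    the three indicators are independent within each class."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        shift = separation if rng.random() < base_rate else 0.0
        rows.append([rng.gauss(shift, 1.0) for _ in range(3)])
    x, y, z = zip(*rows)
    return list(x), list(y), list(z)

x, y, z = simulate_taxonic()
curve = maxcov_curve(x, y, z)
peak = curve.index(max(curve))
print([round(c, 2) for c in curve], "peak window:", peak)
```

A full taxometric analysis rotates through every indicator as the sorting variable and compares the resulting curves with curves from simulated taxonic and dimensional comparison data; the sketch shows only the sort-and-slice step.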

What is more important than the mathematical underpinnings (which are beyond the scope of this forum) are the hypotheses that can be tested under each approach. Taxometric procedures test the following hypotheses:

H1: Two classes explain all (or most) of the observed correlation among a set of indicators

H0: One (or more) continuous underlying dimension(s) explain all of the observed correlation among a set of indicators

Usually the comparison curve fit index (CCFI) is used to decide which hypothesis to reject and which to retain. See [John Ruscio's book on the topic][4]. Taxometric procedures can test only these two hypotheses and no others.

Used alone, latent class analysis cannot test the dimensional hypothesis, H0 above. However, latent class analysis can test the following hypotheses:

H1a: Two classes explain all of the observed correlation among a set of indicators

H1b: Three classes explain all of the observed correlation among a set of indicators

...

H1k: k classes explain all of the observed correlation among a set of indicators

To test H0 from above in a latent variable framework, fit a single-factor confirmatory factor analysis (CFA) model to the data. Call this H0cfa, which is different from H0: H0 only tests a hypothesis of fit under the taxometric framework and does not produce parameter estimates, as fitting a CFA model does. To compare H0cfa to H1a, H1b, ..., H1k, use the Bayesian Information Criterion (BIC), as in [Nylund et al. (2007)][5].
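As a rough sketch of that comparison, the BIC is computed from each fitted model's log-likelihood, its number of free parameters, and the sample size, and the model with the lowest value is preferred. The Python below assumes continuous indicators, a one-factor CFA with the factor variance fixed to 1, and a latent class (profile) model with class-specific means and class-invariant residual variances. Those parameter counts, the function names, and especially the log-likelihood values are hypothetical placeholders, not results from any study.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def n_params_one_factor_cfa(p):
    # Assumed parameterization: p loadings + p intercepts + p residual
    # variances, with the factor mean fixed to 0 and variance to 1.
    return 3 * p

def n_params_latent_profile(p, k):
    # Assumed parameterization: k*p class-specific means, p residual
    # variances shared across classes, and k-1 free class proportions.
    return k * p + p + (k - 1)

def select_by_bic(candidates, n_obs):
    """candidates maps a model label to (log_likelihood, n_params)."""
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Placeholder log-likelihoods for p = 4 indicators and n = 500 cases.
# In practice these come from your fitted CFA and latent class models.
p, n_obs = 4, 500
candidates = {
    "H0cfa": (-5230.0, n_params_one_factor_cfa(p)),
    "H1a":   (-5215.0, n_params_latent_profile(p, 2)),
    "H1b":   (-5212.0, n_params_latent_profile(p, 3)),
}
best, scores = select_by_bic(candidates, n_obs)
print(best, {name: round(s, 1) for name, s in scores.items()})
```

The counts change with the parameterization (for example, binary indicators or class-specific variances), so take them from your software's output rather than from these helper functions.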

To summarize thus far, taxometric procedures can look at two vs. one class solutions, while latent class + CFA can test one vs. two or more class solutions. We see that taxometric procedures test a subset of the hypotheses tested by latent class + CFA model comparisons.

All of the hypotheses presented thus far are extremes at the two ends of a spectrum. The more general hypothesis is that some number of latent classes and some number of latent dimensions (or latent factors) together best explain the data. The approaches described above reject this outright, which is a very strong assumption. Put differently, a latent class model, and a taxometric procedure that leads to a conclusion of taxonic (rather than dimensional) structure, assume no within-class individual differences other than random error. In your context, this is equivalent to saying that within the chronic pain class there is no systematic variation in the tendency to develop chronic pain, only random chance.
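One common way to write the more general hypothesis is sketched below. The notation is an assumption introduced here for illustration and is not taken from the references: for person $i$ in latent class $k$, a vector of indicators $y_i$ depends on within-class factor scores $\eta_i$ plus residual error $\varepsilon_i$.

$$
y_i \mid (c_i = k) = \nu_k + \Lambda_k \eta_i + \varepsilon_i, \qquad
\eta_i \mid (c_i = k) \sim N(\alpha_k, \Psi_k), \qquad
\varepsilon_i \mid (c_i = k) \sim N(0, \Theta_k), \qquad k = 1, \dots, K
$$

In this notation, the two extremes are special cases. Setting $K = 1$ gives the common factor (CFA) model, with no classes. Setting $\Psi_k = 0$ for every class gives a latent class (profile) model, in which all within-class variation is residual error.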

The weakness of this assumption is better illustrated with an example from psychopathology. Say you have a set of indicators for depression, and your taxometric and/or latent class models lead you to conclude that there is a depressed class and a non-depressed class. These models implicitly assume no variance in severity of depression within a class (beyond random error or noise). In other words, you are depressed or you are not, and among the depressed everyone is equally depressed (beyond variation in error-prone observed variables). So we only need one treatment for depression at one dose level! It is easy to see that this assumption is absurd for depression, and it is often just as limiting in other research contexts.
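The practical signature of this assumption can be shown with a small simulation. This is a sketch using made-up data, with hypothetical names and parameter values. Within a single class, two indicators are generated either with no systematic severity differences (the latent class assumption) or with a shared within-class severity factor (the factor mixture alternative). Under the first, the indicators are roughly uncorrelated within the class; under the second, they stay correlated even among members of the same class.

```python
import math
import random

def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_class(n, class_mean, severity_sd, seed):
    """Two indicators for members of ONE class.
    severity_sd = 0: everyone in the class is equally severe, so only
    random error differs between people (latent class assumption).
    severity_sd > 0: people differ systematically within the class."""
    rng = random.Random(seed)
    a, b = [], []
    for _ in range(n):
        severity = severity_sd * rng.gauss(0.0, 1.0)
        a.append(class_mean + severity + rng.gauss(0.0, 1.0))
        b.append(class_mean + severity + rng.gauss(0.0, 1.0))
    return a, b

# Hypothetical "depressed" class with mean 2 on both indicators.
a0, b0 = simulate_class(2000, class_mean=2.0, severity_sd=0.0, seed=11)
a1, b1 = simulate_class(2000, class_mean=2.0, severity_sd=1.0, seed=11)
r_latent_class = correlation(a0, b0)
r_factor_mixture = correlation(a1, b1)
print(round(r_latent_class, 2), round(r_factor_mixture, 2))
```

If indicators remain substantially correlated within the classes a latent class model assigns, that is one hint that the no-within-class-variation assumption may not hold for your data.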

To avoid making this assumption, use a factor mixture modeling approach, following the papers of [Lubke and Muthén, and of Lubke and Neale][6].