Solved – In CFA, does it matter which factor loading is set to 1?

confirmatory-factor, structural-equation-modeling

I'd been previously taught that, aside from the fact that fixing a loading to 1 means you won't get a significance test on that loading, it was totally arbitrary which loading got fixed to 1.

However, a noted authority on SEM (Jeremy Miles) makes an interesting comment here that

It doesn't make any difference empirically which is fixed – it
rescales the loadings. Sometimes it makes theoretical sense to choose
one of the variables to have its loading fixed to one – this is the
variable with the closest conceptual relationship to the latent
variable of interest.

Would anyone care to explain why it can make theoretical sense to fix the variable with the closest conceptual relationship to the latent variable to 1? Why does this make sense "sometimes" and not always?

Best Answer

To add (and then to digress a bit...): selecting a particular marker variable over another can be a reasonable thing to do if one is known to be a high-consensus "gold-standard" indicator of your latent variable of interest (Little, 2013). Imagine you have three tests, $x_1$, $x_2$, and $x_3$, that attempt to assess latent variable $X$. Perhaps $x_1$ and $x_2$ are cheap/quick/easy-to-administer assessments--they are convenient to the researcher, and capture some amount of the signal in $X$, but are known not to be the most reliable/valid assessment tools. $x_3$, meanwhile, is perhaps a longer assessment that's been put through a more rigorous development process, and though perhaps not quite as convenient to use, is a more reliable/valid indicator of $X$.

In this contrived hypothetical, it would make good sense to use $x_3$ as your marker variable for scale-setting, as opposed to $x_1$ (which would be the default marker variable selected by many SEM software options) or $x_2$. As most texts will aptly point out, however, the choice of marker variable won't make a lick of difference for your indices of model fit, so why does the selection matter? In the context of estimating a measurement model for $X$, the answer is that by anchoring to $x_3$, you will get a more accurate estimate of the variance of $X$ (this is mentioned in the Steiger (2002) paper that Jeremy Miles references), because $x_3$ ostensibly has a much higher factor loading and contains less unique/error variance than either $x_1$ or $x_2$.
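(A quick way to see the rescaling at work: if each indicator follows $x_i = \lambda_i X + e_i$, then fixing $\lambda_3 = 1$ simply re-expresses the factor in the units of $x_3$, so the reported latent variance is $\lambda_3^2\,\text{Var}(X)$ under the original scaling, and every other loading becomes $\lambda_i/\lambda_3$. The marker's metric literally becomes the factor's metric, which is part of why anchoring to your cleanest indicator is appealing.)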

Cue digression

In many applications of CFA/SEM, the selection of a particular marker variable impacts more than just the estimate of a latent variance, in ways that others (and I) find deeply problematic. Stated simply: your choice of marker variable will impact the estimate and standard error (and therefore p-values) of structural associations with other latent variables. This may be acceptable (or even desirable) in a case like my above hypothetical--when the gold-standard indicator is clear--but in many cases no such consensus about which indicator is best exists, and you can get different patterns of results depending on which you select.

Here is a reproducible example showing how the selection of marker variable impacts estimation/testing, using the HolzingerSwineford1939 data set and the lavaan package:

We fit the same model (predicting latent textual from latent visual) three times, varying which indicators serve as the marker variables for the two factors ($x_1$/$x_4$, $x_2$/$x_5$, or $x_3$/$x_6$):

model.x1 <- '
visual  =~ 1*x1 + x2 + x3
textual =~ 1*x4 + x5 + x6
textual ~ visual
'

model.x2 <- '
visual  =~ NA*x1 + 1*x2 + x3
textual =~ NA*x4 + 1*x5 + x6
textual ~ visual
'

model.x3 <- '
visual  =~ NA*x1 + x2 + 1*x3
textual =~ NA*x4 + x5 + 1*x6
textual ~ visual
'
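If you want to run these yourself, a minimal fitting script might look like the following (the fit.x1/fit.x2/fit.x3 object names are just illustrative):

library(lavaan)

# the Holzinger & Swineford (1939) data ship with lavaan
data("HolzingerSwineford1939")

# fit the three specifications, which differ only in which loading is fixed to 1
fit.x1 <- sem(model.x1, data = HolzingerSwineford1939)
fit.x2 <- sem(model.x2, data = HolzingerSwineford1939)
fit.x3 <- sem(model.x3, data = HolzingerSwineford1939)

# global fit is identical across the three scalings...
sapply(list(fit.x1, fit.x2, fit.x3),
       fitMeasures, fit.measures = c("chisq", "df", "pvalue"))

# ...but the unstandardized slope (and its z/p) for textual ~ visual is not
lapply(list(fit.x1, fit.x2, fit.x3), function(fit) {
  pe <- parameterEstimates(fit)
  pe[pe$op == "~" & pe$lhs == "textual" & pe$rhs == "visual", ]
})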

The fit of each model is identical (marker variable selection doesn't impact it), $\chi^2$(8) = 24.361, p = .002. But the estimated slope and statistical test from regressing textual on visual do change:

  1. $x_1$/$x_4$ as markers: $b$ = 0.503, z = 5.235
  2. $x_2$/$x_5$ as markers: $b$ = 1.000, z = 4.745
  3. $x_3$/$x_6$ as markers: $b$ = 0.658, z = 5.386

In absence of a clear gold-standard indicator of either textual or visual, this presents (or rather, and in my opinion, it should present) a fairly large problem to those wishing to use a marker variable approach to scale setting. For my part, though it has yet to be studied formally from this perspective, I see the arbitrary and (potentially) flexible selection of marker variables as a "researcher degree of freedom" (John et al., 2012; Simmons et al., 2011) ripe for exploitation in the measurement/structural modelling context (e.g., Flake & Fried, 2019).

What's the alternative, then? One approach is to fix the latent variances to 1 (often accompanied by fixing the latent means, when modelling mean structures, to 0), effectively standardizing the latent variable, and allowing a factor loading for each indicator to be estimated:

model.ff <- '
visual  =~ NA*x1 + x2 + x3
textual =~ NA*x4 + x5 + x6
visual ~~ 1*visual
textual ~~ 1*textual
textual ~ visual
'
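Side note: lavaan also has a shortcut for this parameterization. Setting std.lv = TRUE frees all the loadings and fixes the (residual) latent variances to 1.0, so--if I'm not mistaken--the following should reproduce model.ff without any explicit NA*/1* modifiers (model.plain/fit.ff are just illustrative names):

model.plain <- '
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
textual ~ visual
'

fit.ff <- sem(model.plain, data = HolzingerSwineford1939, std.lv = TRUE)
summary(fit.ff)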

Model fit is still the same, but as before, we get a different estimated slope/statistical test based on our choice of scale-setting: $b$ = 0.519, z = 5.668. Why is this approach potentially preferable? First, researchers are often more interested in the factor loadings of their indicators than in the estimates of their latent variances. Second--and perhaps what carries the argument in most cases--though standardized scalings for latent variables are still arbitrary in a sense, it's an approach that is at least scientifically normative (i.e., we standardize variables all the time and no one seems to lose their heads).

There's a third option for scale-setting though, for those wanting a solution that is non-arbitrary while still providing estimates of all factor loadings: effects coding (Little et al., 2006). Here, we constrain the loadings on each factor to average 1 (and if we were modelling a mean structure, we would constrain the item intercepts on each factor to sum to 0):

model.ec <- '
visual  =~ NA*x1 + a*x1 + b*x2 + c*x3
textual =~ NA*x4 + d*x4 + e*x5 + f*x6
a + b + c == 3
d + e + f == 3
textual ~ visual
'
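To pull out the pieces reported in the next paragraph (the loadings, the latent variances, and the structural slope), a sketch like this should do it; object names are again just illustrative, and I believe more recent lavaan versions also offer an effect.coding option that can apply these constraints for you:

fit.ec <- sem(model.ec, data = HolzingerSwineford1939)
pe <- parameterEstimates(fit.ec)

# all six loadings are now estimated, and average 1 within each factor
pe[pe$op == "=~", c("lhs", "rhs", "label", "est", "se", "z")]

# latent (residual) variances for both factors, plus the textual ~ visual slope
pe[pe$op %in% c("~~", "~") & pe$lhs %in% c("visual", "textual"), ]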

Model fit is the same, we get estimated loadings for all indicators and estimated latent variances for both factors, and a slightly different estimate/test of the latent slope: $b$ = 0.674, z = 6.159. The main perk of effects coding is that it puts the latent variable back on the original metric/scale of all of its specified indicators, and so if you estimated its latent mean, it would be the same mean (on the same scale) that you would get by calculating a crude average of the indicators--just with a smaller latent variance, because you've removed error/unique variance--lending itself to a more intuitive interpretation.
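(That metric-preserving property follows directly from the constraints: with loadings summing to 3 and intercepts summing to 0, the model-implied mean of the crude composite $(x_1 + x_2 + x_3)/3$ is $\tfrac{1}{3}\sum_i (\tau_i + \lambda_i \kappa) = \kappa$, i.e., exactly the latent mean.)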

tl;dr: Selecting a particular marker variable is worthwhile when your indicators are on different scales and you want your LV on the scale of one of them in particular (as per Noah's good answer), and/or when one of your indicators is a gold-standard indicator. The latter case, however, is often unclear, and given its impact on estimates/tests of structural parameters, I think it's a somewhat questionable thing to do without strong evidence.

References

Flake, J. K., & Fried, E. I. (2019, January 17). Measurement Schmeasurement: Questionable Measurement Practices and How to Avoid Them. https://doi.org/10.31234/osf.io/hs7wm

John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23, 524-532.

Little, T. D. (2013). Longitudinal Structural Equation Modeling. New York, NY: Guilford Press.

Little, T. D., Slegers, D. W., & Card, N. A. (2006). A non-arbitrary method of identifying and scaling latent variables in SEM and MACS models. Structural Equation Modeling, 13, 59-72.

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359-1366.

Steiger, J. H. (2002). When constraints interact: A caution about reference variables, identification constraints, and scale dependencies in structural equation modeling. Psychological Methods, 7, 210-227.