Solved – How do Cronbach’s Alpha and Average Variance Extracted differ in their interpretation?

confirmatory-factor, cronbachs-alpha, reliability, structural-equation-modeling, validity

Background: I do survey research on Information Systems. I have a latent construct that I measure with six survey items and a reflective measurement model; each item uses a seven-point Likert scale. The construct and its items are embedded in a larger structural equation model (SEM). I am assessing the output of a PLS-SEM estimation, for which the literature suggests various criteria. Specifically, I am wondering about internal consistency reliability (ICR) as measured by Cronbach’s alpha and convergent validity as measured by average variance extracted, and I am confused about how their interpretations differ.

As I understand it:

  • ICR is a measure based on the correlations between different items. It is commonly assessed by Cronbach’s alpha, which can be written as a function of the number of survey items and the average inter-item correlation among them. In my example, alpha is well above the conventional threshold of 0.7.
  • Convergent validity refers to the degree to which a measure is correlated with other measures that it is theoretically predicted to correlate with. Average variance extracted (AVE) is commonly used to assess it. To calculate the AVE of my latent construct, I take the six items’ loadings on the construct and average their squares. In my example, AVE is well below the conventional threshold of 0.5. (A small numeric sketch of both statistics follows this list.)
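
To make the two statistics concrete, here is a minimal Python sketch with hypothetical numbers (the loadings and correlations are assumed, not taken from my actual model; in practice the loadings would come from the PLS-SEM output):

```python
import numpy as np

rng = np.random.default_rng(0)

# One-factor model: six standardized items with equal loadings sqrt(0.3),
# so every pairwise inter-item correlation is about 0.3.
n, k = 5000, 6
lam = np.sqrt(0.3)
factor = rng.standard_normal(n)
items = lam * factor[:, None] + np.sqrt(1 - lam**2) * rng.standard_normal((n, k))

# Cronbach's alpha as a function of the number of items and the
# average inter-item correlation: alpha = k*rbar / (1 + (k-1)*rbar)
corr = np.corrcoef(items, rowvar=False)
rbar = corr[np.triu_indices(k, 1)].mean()
alpha = k * rbar / (1 + (k - 1) * rbar)

# AVE = mean of squared standardized loadings (known here by construction;
# in practice these come from the estimated measurement model).
ave = np.mean(np.full(k, lam) ** 2)

print(f"alpha = {alpha:.2f}")  # ~0.72, above the 0.7 threshold
print(f"AVE   = {ave:.2f}")    # 0.30, below the 0.5 threshold
```

With these made-up numbers, alpha clears the 0.7 threshold while AVE sits at 0.3, which mirrors the situation I describe below.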

My questions are:

  1. To me, AVE and alpha appear rather closely related. If each item by itself is a good measure of the latent construct, all items will load highly on the construct (i.e., high AVE) and all items will correlate with each other (i.e., high alpha). How can two such closely related statistics be used to assess both reliability and validity?
  2. In my example, I have alpha > 0.7 (i.e., internal consistency is fine) and AVE < 0.5 (i.e., convergent validity is lacking), but AVE is greater than the squared inter-construct correlations (i.e., discriminant validity is fine). How should I interpret this situation?
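
For concreteness, here is a hypothetical worked example of how this pattern can arise (the numbers are assumed, not from my model): with $k = 6$ items and an average inter-item correlation of $\bar{r} = 0.3$, Cronbach's alpha is $\alpha = \frac{k\bar{r}}{1 + (k-1)\bar{r}} = \frac{6 \cdot 0.3}{1 + 5 \cdot 0.3} = 0.72 > 0.7$, while under a one-factor model with equal standardized loadings $\lambda = \sqrt{0.3}$, the same correlations imply $\text{AVE} = \lambda^2 = 0.3 < 0.5$. Alpha grows with the number of items, but AVE, being an average, does not.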

Side remarks

  • I know that for assessing discriminant validity, AVE is commonly compared to squared inter-construct correlations.
  • I have read a few times (but did not understand why) that high values of alpha do not imply unidimensionality. However, given that unidimensionality seems to me distinct from both reliability and validity, I don’t see how this would resolve my question.

Best Answer

I think your conceptual understandings of reliability (via Cronbach's $\alpha$) and convergent validity are correct. However, I believe the way you have defined evidence for convergent validity is mistaken. Your reflective model implies that these six items are manifestations of your latent construct (i.e., caused by it); to then use these same indicators as "...other measures that it [your latent variable, which presumably causes these indicators] is theoretically predicted to correlate with" seems circular. How can the variables be considered manifestations of your latent variable and "other measures" at the same time? Instead, I think you should establish convergent validity via inter-construct correlations, much as you would discriminant validity.

Two other quick thoughts:

1) I've not often seen Cronbach's $\alpha$ calculated for latent variables. Rather, Cronbach's $\alpha$ is usually calculated for observed scale scores (averages or sums). You might instead be interested in construct (sometimes called "composite") reliability (Hatcher, 1994), which can be computed as $\frac{(\sum_i \lambda_i)^2}{(\sum_i \lambda_i)^2 + \sum_i \sigma_i^2}$, where $\lambda_i$ is a standardized loading and $\sigma_i^2$ is a uniqueness.
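
A minimal sketch of that formula in Python (the loadings below are made up for illustration; uniquenesses are assumed to be $\sigma_i^2 = 1 - \lambda_i^2$, which holds for standardized items):

```python
import numpy as np

def composite_reliability(loadings):
    """Composite (construct) reliability from standardized loadings,
    assuming uniquenesses sigma_i^2 = 1 - lambda_i^2."""
    lam = np.asarray(loadings, dtype=float)
    uniq = 1.0 - lam**2
    return lam.sum() ** 2 / (lam.sum() ** 2 + uniq.sum())

print(composite_reliability([0.55] * 6))  # ~0.72 for six equal loadings
```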

2) Your AVE is similar, in concept, to a calculation of how much variance a given latent variable explains in its indicators (compare the previous formula). This could be taken as some preliminary evidence of construct validity: if your latent variable does not explain a substantial amount of variance in its indicators (e.g., > .5), then perhaps it is a poorly conceived latent variable. The calculation is $\frac{\sum_i \lambda_i^2}{\sum_i \lambda_i^2 + \sum_i \sigma_i^2}$.
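
A companion sketch for this variance-extracted formula, under the same assumptions (made-up loadings; note that with $\sigma_i^2 = 1 - \lambda_i^2$ it reduces to the mean squared loading, i.e. the AVE):

```python
import numpy as np

def variance_extracted(loadings):
    """Share of indicator variance explained by the latent variable,
    assuming uniquenesses sigma_i^2 = 1 - lambda_i^2."""
    lam = np.asarray(loadings, dtype=float)
    uniq = 1.0 - lam**2
    return (lam**2).sum() / ((lam**2).sum() + uniq.sum())

print(variance_extracted([0.55] * 6))  # ~0.30, below the 0.5 rule of thumb
```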