What factors do I have to consider when deciding if creating a composite score using four t-scores is defensible psychometrically?

composite, correlation, psychometrics, reliability

Edit:
I've conducted an item analysis on four variables: two quizzes and two psychological attributes, namely emotional intelligence and happiness. It's a within-subject design with students as participants. All four scales have a Cronbach's alpha above .70 or have been revised to meet that criterion. I have a scenario where a selection panel chooses to select students based on their performance on all four composite variables. It was then decided that we should calculate an average standardized t-score across the four variables. Would creating a composite score out of four t-scores be psychometrically defensible? Why or why not?

I've also correlated all four variables, and all the correlations were statistically significant, with the lowest being .30. Is that considered decent enough?

The quiz scales and attribute scales are different, but if I use t-scores, would they then be considered standardized?

Thanks for any help! =D

Best Answer

General points on composites using t-scores

By t-scores, I assume you mean that you have four variables, each of which has been standardised so that the mean is 50 and the standard deviation is 10.
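
To make that assumption concrete, here is a minimal sketch of the conversion I have in mind. The function name `to_t_scores` and the quiz scores are made up for illustration, and using the sample standard deviation (n - 1 denominator) is my assumption; your software may use a different convention.

```python
from statistics import mean, stdev

def to_t_scores(raw):
    """Rescale raw scores to mean 50 and SD 10 using the sample's own mean and SD."""
    m = mean(raw)
    s = stdev(raw)  # sample SD (n - 1 denominator); an assumed convention
    return [50 + 10 * (x - m) / s for x in raw]

# Hypothetical raw quiz scores, for illustration only
quiz = [12, 15, 9, 18, 14, 11]
quiz_t = to_t_scores(quiz)
```

After this transformation each variable has the same mean and spread, whatever its original scale was.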

By using standardised scores (i.e., t-scores) as your component items, you are in some sense ensuring that the weighting of each variable in your composite is equal. This is often a desirable property where you believe conceptually that each variable deserves equal weight in the composite.

One issue with using standardised scores when forming composites is that the means and standard deviations used to form the component t-scores can change (e.g., from one cohort to the next). This can then undermine the comparability of total scores. Thus, I would recommend that if you do use t-scores as the component variables, make sure that the formulas that go from raw scores to t-scores don't change over time.
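
One way to keep the formulas fixed is to estimate the mean and standard deviation once on a reference group and reuse them for everyone scored later. The sketch below assumes exactly that; `fit_norms`, `t_score`, and both cohorts are hypothetical names and data, not anything from your study.

```python
from statistics import mean, stdev

def fit_norms(reference_scores):
    """Estimate the norms (mean, SD) once, from a reference group."""
    return mean(reference_scores), stdev(reference_scores)

def t_score(x, norms):
    """Convert one raw score to a t-score using previously fixed norms."""
    m, s = norms
    return 50 + 10 * (x - m) / s

# Hypothetical data: norms are frozen on the first cohort...
reference_cohort = [12, 15, 9, 18, 14, 11]
NORMS = fit_norms(reference_cohort)

# ...and reused unchanged for a later cohort, so a raw score of 16
# maps to the same t-score no matter who else sat the quiz.
later_cohort = [16, 17, 13]
later_t = [t_score(x, NORMS) for x in later_cohort]
```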

Another point is that the mean or sum of several t-scores will not itself be a t-score (its standard deviation will generally not be 10), but you could restandardise the total score.
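
As a rough illustration of why, the sketch below computes the standard deviation of the unweighted mean of four t-scores from their correlation matrix. It assumes every pairwise correlation equals .30 (the lowest value you report), which is a simplification; the function names are mine.

```python
from itertools import combinations
from math import sqrt

def sd_of_mean_t(corr):
    """SD of the unweighted mean of k t-scores (each with SD 10),
    given their k x k correlation matrix."""
    k = len(corr)
    off_diag = sum(corr[i][j] for i, j in combinations(range(k), 2))
    variance_of_sum = 100 * (k + 2 * off_diag)  # sum of all variances and covariances
    return sqrt(variance_of_sum) / k

def restandardise(mean_t, sd):
    """Put the averaged score back on the mean-50, SD-10 metric."""
    return 50 + 10 * (mean_t - 50) / sd

# Assumed correlation matrix: all pairwise correlations set to .30
r = 0.30
corr = [[1.0 if i == j else r for j in range(4)] for i in range(4)]
composite_sd = sd_of_mean_t(corr)  # about 6.89 rather than 10
```

The mean of the averaged score stays at 50, but its spread shrinks unless the components correlate perfectly, so a second standardisation step is needed if you want the composite reported as a t-score.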

Justifications for indexes

More generally, there are several justifications for creating a composite including:

  1. Common construct: the four component variables correlate and are theorised to reflect a particular construct.
  2. Reflects a conceptual category: The composite reflects a conceptual entity made up of multiple elements that may not necessarily be correlated. For example, measures of city liveability often include quite diverse indicators (e.g., air quality, crime statistics, etc.). Regardless of whether they correlate, they all tap into an important construct which is city liveability.
  3. Predictive index: In some cases an index is formed for its predictive value. For example, a set of tests might be combined to predict job performance. In such a case, the weighting of component variables would often be influenced by a predictive model (e.g., regression coefficients in multiple regression).

These justifications also map broadly onto discussion of reflective and formative indicators (Diamantopoulos & Winklhofer, 2001).

Reliability and validity

All of the above justifications for indexes relate in various ways to reliability and validity. In general, I think most composites in psychology are created because they are believed to reflect a common construct. Thus, we combine items together to measure depression or extraversion or well-being.

I think you need pretty strong justification for combining items that don't correlate and calling that composite something psychologically meaningful.

Both reliability and validity have many meanings and nuances.

There are internal measures of reliability which use the intercorrelation of component variables to estimate internal consistency reliability (e.g., alpha). That said, if you combine several items that have good test-retest reliability but are uncorrelated with each other, you will still get an overall scale with good test-retest reliability.
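
To illustrate the contrast, the sketch below compares standardised alpha, which depends only on the average intercorrelation, with the classical-test-theory reliability of an equally weighted sum of standardised components. The second formula assumes uncorrelated measurement errors; the component reliabilities of .70 and the average correlation of .30 are taken from your description as round figures, and the function names are mine.

```python
def standardised_alpha(k, mean_r):
    """Standardised Cronbach's alpha for k components with average intercorrelation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)

def composite_reliability(reliabilities, mean_r):
    """Reliability of an equally weighted sum of standardised components,
    assuming uncorrelated measurement errors."""
    k = len(reliabilities)
    total_var = k + k * (k - 1) * mean_r             # variance of the sum
    error_var = sum(1 - rel for rel in reliabilities)  # summed error variances
    return 1 - error_var / total_var

component_rel = [0.70, 0.70, 0.70, 0.70]  # assumed: each scale has reliability .70

alpha_corr = standardised_alpha(4, 0.30)                # about 0.63
rel_corr = composite_reliability(component_rel, 0.30)   # about 0.84
alpha_uncorr = standardised_alpha(4, 0.0)               # 0.0
rel_uncorr = composite_reliability(component_rel, 0.0)  # 0.70
```

With uncorrelated components alpha collapses to zero, yet the composite is still as reliable as its parts, which is the distinction drawn above.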

Validity means various things. In a broad sense, validity pertains to whether the inferences you want to make from the test are valid. Thus, if you want to use the measure to predict something useful, then its predictive capacity will be a relevant issue. If you are trying to accurately reflect a meaningful integrated construct with some external existence, then you would probably be looking for something where component variables intercorrelate. In other cases, you may want to align with existing theoretical conceptions of a construct.

Additional material

I also have a large number of notes on composite score formation here around z-scores.

References

  • Diamantopoulos, A., & Winklhofer, H. M. (2001). Index construction with formative indicators: An alternative to scale development. Journal of Marketing Research, 38(2), 269-277.