The composite variable
- Your algorithm for calculating scale scores seems fairly standard. Do you have a question about it?
- In most cases, taking the mean of all items in the scale and taking the mean of subscale means will give similar answers, particularly when the number of items per subscale is similar.
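To see why the two scoring options agree, here is a small sketch with hypothetical data (the item responses and subscale split are invented for illustration). When every subscale has the same number of items, the overall item mean and the mean of subscale means are algebraically identical:

```python
import numpy as np

# Hypothetical data: 4 respondents, 6 items on a 0-5 response scale,
# split into two subscales of 3 items each.
rng = np.random.default_rng(0)
items = rng.integers(0, 6, size=(4, 6)).astype(float)

overall_mean = items.mean(axis=1)          # mean of all items
sub_a = items[:, :3].mean(axis=1)          # subscale means
sub_b = items[:, 3:].mean(axis=1)
mean_of_subscale_means = (sub_a + sub_b) / 2

# Equal items per subscale: the two scores coincide exactly.
# With unequal subscale sizes they differ but usually correlate highly.
assert np.allclose(overall_mean, mean_of_subscale_means)
```

With unequal subscale sizes, the mean of subscale means implicitly up-weights items in the smaller subscales, which is where the two approaches can diverge.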
Estimating correlations between latent variables
Structural equation modelling adjusts estimates of correlations between latent variables for unreliability of measurement, which it estimates from the intercorrelations between the items and the specified model.
Manually calculating scale scores (as you appear to be doing) and correlating these scale scores does not involve such an adjustment. As such, you are correlating observed variables; you are not estimating the correlation between latent variables. Many researchers, probably the majority, report correlations between observed variables. So, if you adopt this statistically simpler approach, you would be in good company. However, if you are particularly interested in estimating correlations between latent variables, then I would encourage you to explore structural equation modelling approaches.
An alternative approach is just to adjust the correlation based on some estimate of reliability of the two variables (see this discussion).
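The standard version of this adjustment is Spearman's correction for attenuation: divide the observed correlation by the square root of the product of the two reliabilities. A minimal sketch, with hypothetical numbers for the observed correlation and reliabilities:

```python
import math

def disattenuate(r_xy, rel_x, rel_y):
    """Spearman's correction for attenuation:
    estimated latent correlation = observed r / sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical values: observed r = .40, reliabilities .80 and .70.
r_latent = disattenuate(0.40, 0.80, 0.70)
print(round(r_latent, 3))  # 0.535
```

Note that the corrected value is only as good as the reliability estimates fed into it, and it can exceed 1 when those estimates are poor.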
Correlations on scales based on Likert items
See this answer by @chl on whether to treat Likert scales as interval or ordinal.
My opinion is that once you are summing over a reasonable number of items, treating the data as interval is typically useful.
Describing your system for forming a composite
You have weighted a set of items using a standard system (i.e., equal distance between numbers for categories [0,1,2,3,4,5], the same scale for each item; items scores then summed to form a scale score).
I would not call the above system "subjective".
It is probably the most common system for scoring psychology and social science multi-item scales when items use the same response scale.
I imagine that you are contrasting such a scoring system with one based on a factor analysis or a related procedure.
Validity of your composite
You are adopting a standard scoring system, and there are good reasons why this system is so common.
- It is very easy to communicate to others how the scoring system works, so it can readily be applied in multiple contexts.
- It is not specific to a given sample (in contrast to weightings derived from a factor analysis; although such weightings could be fixed in one sample and applied in others).
- Many scales are constructed so that each item measures the underlying construct, and summing over the items therefore yields a measure of that construct.
- Because each item is on the same scale (and the number of scale points is relatively small), the standard deviations of the items tend to be fairly similar, and thus each item's contribution to the scale total tends to be fairly similar.
Nonetheless, such a scoring system is predicated on the idea that each item is a good measure of the underlying construct.
The broader issue of validity relates to whether the scale you have created is a valid measure of whatever it is meant to be a measure of.
There are a wide range of procedures that people use to establish the validity of a given scale both in general, and for a specific sample.
- Factor analysis and reliability analysis are two obvious ways to assess the internal structure of a scale or set of scales.
- Correlating the scale with other measures is another strategy.
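As an illustration of the reliability-analysis strategy mentioned above, Cronbach's alpha can be computed directly from an items matrix. This is a minimal sketch (the function name and data layout are my own choices, not from the original answer):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) array:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 respondents, 3 items that track each other closely.
data = np.array([[1, 2, 1],
                 [2, 2, 3],
                 [3, 4, 3],
                 [4, 4, 5],
                 [5, 5, 5]])
print(round(cronbach_alpha(data), 2))
```

Higher alpha indicates more internally consistent items; values above roughly .7-.8 are conventionally taken as acceptable, though conventions vary by field.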
Much more could be said about validity and scale construction (see here for some references).
Whether to convert items to Z-scores
Standardising items first before creating a scale mean or sum is one simple way of ensuring that each item has equivalent "importance" in forming a composite.
The problem of unequal importance is more of a problem when combining component variables that are on very different scales (e.g., height in mm, weight in kg).
There are also issues in comparability across studies when adopting a sample specific z-score approach.
In your case, whether you convert each item to a z-score first before summing items or whether you just sum items, my guess is that the two variables will correlate very highly (perhaps greater than .95, but you can check this).
In general the simple sum (or mean) of raw scores is preferable from a comparability perspective. It also communicates how the mean relates to the underlying scale (e.g., a mean of 4.2 on a 1 to 5 scale where 5 means very satisfied indicates that the sample is generally satisfied, whereas the sum of z-scores does not).
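The "check this" suggestion above can be done in a few lines. This sketch simulates hypothetical item responses driven by a common latent factor, then correlates the raw-score sum with the z-score sum (the simulation parameters are invented for illustration):

```python
import numpy as np

# Hypothetical simulation: 200 respondents, 8 items on a 1-5 scale,
# all driven by one latent factor plus item-level noise.
rng = np.random.default_rng(1)
n, k = 200, 8
latent = rng.normal(size=(n, 1))
items = np.clip(np.rint(3 + latent + rng.normal(size=(n, k))), 1, 5)

raw_sum = items.sum(axis=1)
z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
z_sum = z.sum(axis=1)

r = np.corrcoef(raw_sum, z_sum)[0, 1]
print(round(r, 3))  # typically well above .95 when item SDs are similar
```

Because all items share the same response scale here, standardising changes the item weights only slightly, which is why the two composites are nearly interchangeable.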
Best Answer
Composite score, scale score, and index score are terms that are often used interchangeably. As to your question 2, I'm not sure what makes you hesitate to use correlation; the usual assumptions involving correlation should apply here. Otherwise, you'll find more information at this page on this site.