This is a great question!
I think that in scale construction there's a delicate balance between interpretability and psychometric considerations. Specifically, a sum or average of the raw items is much easier to grasp than a sum or average of standardized or otherwise re-scaled items.
However, there can be a somewhat subtle psychometric reason for re-scaling items prior to creating your scale composite (i.e., taking a sum or average). If your items have radically different standard deviations, the reliability of your composite scale will be decreased simply because of these differing standard deviations.
One way to understand this intuitively is to realize that, as you point out, items with widely varying standard deviations are assigned different weights in the composite. So, measurement error in the item with the greater standard deviation will tend to dominate the scale composite. In effect, having widely varying standard deviations reduces the very benefit that you're trying to accrue by averaging together multiple items (i.e., normally, averaging together multiple items reduces the impact of measurement error from any one of the component items).
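One way to see this algebraically is through the formula for Cronbach's alpha itself. For a composite $X = \sum_{i=1}^{k} x_i$ built from $k$ items,

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^2_{x_i}}{\sigma^2_X}\right)$$

If you multiply one item by a large constant $c$, both the sum of the item variances and the composite variance $\sigma^2_X$ become dominated by that item's $c^2\sigma^2$ term, so the ratio approaches $1$ and $\alpha$ approaches $0$: in the limit, the "composite" is effectively a single-item measure.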
I have created a demonstration of the effects of a single dominant item in some simulated data below. Here I create five correlated items and find the reliability (measured with Cronbach's alpha) of the resultant scale.
library(psych)  # for alpha()
# Create five correlated items: item1 plus independent noise for items 2-5
set.seed(13105)
item1 <- round(rnorm(100, sd = 3), digits = 0)
item2 <- round(item1 + rnorm(100, sd = 1), digits = 0)
item3 <- round(item1 + rnorm(100, sd = 1), digits = 0)
item4 <- round(item1 + rnorm(100, sd = 1), digits = 0)
item5 <- round(item1 + rnorm(100, sd = 1), digits = 0)
d <- data.frame(item1, item2, item3, item4, item5)
# Cronbach's alpha
alpha(d)
Reliability analysis
Call: alpha(x = d)
raw_alpha std.alpha G6(smc) average_r mean sd
0.97 0.97 0.97 0.87 -0.14 2.5
Reliability if an item is dropped:
raw_alpha std.alpha G6(smc) average_r
item1 0.96 0.96 0.94 0.84
item2 0.97 0.97 0.96 0.88
item3 0.97 0.97 0.96 0.89
item4 0.97 0.97 0.96 0.88
item5 0.96 0.97 0.96 0.87
Item statistics
n r r.cor r.drop mean sd
item1 100 0.98 0.99 0.97 -0.10 2.5
item2 100 0.94 0.92 0.90 -0.27 2.8
item3 100 0.93 0.91 0.89 -0.09 2.7
item4 100 0.94 0.92 0.91 -0.19 2.6
item5 100 0.94 0.93 0.91 -0.06 2.7
And here I change the standard deviation of item2 by multiplying the item by $5$. Note the dramatic drop in Cronbach's alpha due to this procedure: raw_alpha falls from 0.97 to 0.74, while std.alpha, which is computed from the correlation matrix, is unchanged. Indeed, multiplying an item by a positive constant does not affect the correlation matrix constructed with these five items in the slightest. The only thing that I have done by multiplying item2 by $5$ is change the scale on which item2 is measured, and yet changing this scale greatly impacts the reliability of the composite.
# Re-scale item 2 to have a much larger standard deviation than the other items
d$item2 <- d$item2 * 5
# Cronbach's alpha
alpha(d)
Reliability analysis
Call: alpha(x = d)
raw_alpha std.alpha G6(smc) average_r mean sd
0.74 0.97 0.97 0.87 -0.36 4.7
Reliability if an item is dropped:
raw_alpha std.alpha G6(smc) average_r
item1 0.68 0.96 0.94 0.84
item2 0.97 0.97 0.96 0.88
item3 0.69 0.97 0.96 0.89
item4 0.68 0.97 0.96 0.88
item5 0.68 0.97 0.96 0.87
Item statistics
n r r.cor r.drop mean sd
item1 100 0.98 0.99 0.96 -0.10 2.5
item2 100 0.94 0.92 0.90 -1.35 13.9
item3 100 0.93 0.91 0.86 -0.09 2.7
item4 100 0.94 0.92 0.89 -0.19 2.6
item5 100 0.94 0.93 0.90 -0.06 2.7
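The output above also points to the fix: std.alpha is still 0.97 because it is based on the correlation matrix, so z-scoring the items before forming the composite should bring raw_alpha back in line with it. A minimal sketch, reusing the data frame d from above:

```r
# Sketch: z-score each item before forming the composite
# (scale() standardizes the columns; as.data.frame() keeps alpha() happy)
d_std <- as.data.frame(scale(d))
alpha(d_std)
```

With standardized items the variances are all $1$, so no single item dominates the composite and raw_alpha should agree with std.alpha (about 0.97 in this simulation).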
Best Answer
I'm not sure about the first part of your question. But regarding the second bit: a reliability analysis by itself does not tell you whether you have one underlying construct or several. You can have a high Cronbach's alpha (for reliability) in the presence of two or more factors. Definitely run the factor analysis as well as the reliability analysis. You might also want to check out the latent-variable and item-response literature. Some of these models are set up to handle dichotomous and polytomous outcomes, which might deal with the z-score problem as well.
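For the factor-analysis side, a minimal sketch with the psych package (assuming your items sit in a data frame named d; the nfactors value here is purely illustrative, so set it to whatever the parallel analysis suggests):

```r
library(psych)
# Parallel analysis to suggest how many factors to retain
fa.parallel(d)
# Fit an exploratory factor model with the suggested number of factors
fa(d, nfactors = 2)
```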