Solved – How to combine Likert items into a single variable

I'm doing my first research project and have collected quite a large number of responses using a questionnaire.

In this questionnaire, the dependent variable is assessed using 4 questions on a 5-point Likert scale (going from "totally disagree", given value 1, up to "totally agree", given value 5). There are multiple independent variables, also assessed using multiple questions on a 5-point Likert scale. I also have some moderators, again assessed using multiple questions on a 5-point Likert scale. Finally, there are also some control variables such as age.

Now my question is: How do I combine the different questions so that I have one value describing the dependent, independent, … variables? If I combine these scales, do the resulting values become continuous? Also, if I create the interaction term with the values obtained by combining the different questions, should I center this interaction term? And finally, which analysis should I use?

Currently, I've done the following: I've combined all the different questions (5-point Likert scale) describing a dependent variable or an independent variable by calculating the mean of the responses. For the analysis, I used a hierarchical multiple regression. However, I'm not at all sure whether this is the correct way to do it.

Best Answer

You should clarify what you mean by "combined all the likert scale questions". If you asked 5 Likert scale questions that together are a conventional way to measure e.g. self-confidence, then those should be combined into one variable (provided some conditions described below are met). If you asked 5 Likert scale questions for self-confidence, 7 others for narcissism and 12 others for empathy, then obviously don't combine all of those into one variable.

Phrasing is very important with Likert scale questions. Don't invent new questions unless you absolutely have to. You will almost always find well-established sets of questions for your purpose. Copy them into your questionnaire and cite the researcher who built the scale.
Likert scales are in principle ordinal rather than continuous, which means that you shouldn't do t-tests or ANOVAs on them. The problem is that respondents may not find it obvious that the distance between "moderately" and "much" is the same as between "much" and "very much", etc. So you need to make it obvious to them by visually spacing the answer options equidistantly and by using numbers while saying e.g. 1=most and 10=least, not by putting subjective descriptions on all the answer options in between.
Whether you should have an even or an odd number of options is controversial. With an odd number, people might be lazy and pick the middle option too often. With an even number, you may force them to reveal an "opinion" where they truly don't have one.

If that is taken care of, you still need to assess the internal consistency and dimensionality of the scale using your actual response data.

Include attention checks to filter out respondents who answered randomly.
Compute Cronbach's $\alpha$ or a similar measure to check that all questions are sufficiently aligned.
Run a PCA to check that the scale really measures a unidimensional quantity. Most of the variability should be in just one principal component.
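As a rough illustration of the last two checks, here is a minimal sketch in Python with NumPy. The `responses` array is made-up data (6 respondents, 4 items), and the helper names `cronbach_alpha` and `explained_variance_ratio` are hypothetical, introduced only for this example. The PCA is done on the item correlation matrix, which is one common choice and an assumption here.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the sum score
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

def explained_variance_ratio(items: np.ndarray) -> np.ndarray:
    """Share of variance per principal component of the item correlation matrix."""
    corr = np.corrcoef(np.asarray(items, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]     # sort in descending order
    return eigenvalues / eigenvalues.sum()

# Made-up example data: rows are respondents, columns are the 4 Likert items.
responses = np.array([
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
    [2, 1, 2, 1],
])

alpha = cronbach_alpha(responses)
ratios = explained_variance_ratio(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
print("Variance share per component:", np.round(ratios, 3))
```

A high $\alpha$ together with a first component that carries most of the variance would support treating the items as one scale. Which thresholds count as "sufficient" is a judgment call that depends on the field.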

If that is done, most people simply take the arithmetic mean of the responses to obtain an approximately continuous variable. This assumes that every Likert question on the scale is equally important. Having just done the PCA, you might instead use the respondents' scores on the first principal component. This gives different weights to different questions according to their ability to differentiate respondents in your dataset. That last option is seldom used.
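The two ways of building the composite can be sketched as follows, again in Python with NumPy. The `responses` array is made-up data, and standardizing the items before projecting them onto the first component is an assumption of this sketch, not something the answer prescribes.

```python
import numpy as np

# Made-up example data: rows are respondents, columns are the 4 Likert items.
responses = np.array([
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
    [2, 1, 2, 1],
], dtype=float)

# Option 1: equal-weight composite, the arithmetic mean of each respondent's items.
mean_score = responses.mean(axis=1)

# Option 2: score on the first principal component of the standardized items.
standardized = (responses - responses.mean(axis=0)) / responses.std(axis=0, ddof=1)
corr = np.corrcoef(responses, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)   # eigenvalues in ascending order
first_loading = eigenvectors[:, -1]                # eigenvector of the largest eigenvalue
if first_loading.sum() < 0:                        # the sign is arbitrary; make weights positive
    first_loading = -first_loading
pc1_score = standardized @ first_loading

print("Mean composite:", np.round(mean_score, 2))
print("Item weights on PC1:", np.round(first_loading, 3))
print("PC1 scores:", np.round(pc1_score, 2))
```

When the items are strongly correlated, the weights on the first component tend to be nearly equal, so the two composites usually rank respondents almost identically.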
