I'm doing my first research project and have collected quite a large number of responses using a questionnaire.
In this questionnaire, the dependent variable is assessed using 4 questions on a 5-point Likert scale (going from "totally disagree", coded 1, up to "totally agree", coded 5). There are multiple independent variables, each also assessed using multiple questions on a 5-point Likert scale. I also have some moderators, again assessed using multiple questions on a 5-point Likert scale. Finally, there are some control variables such as age.
Now my question is: How do I combine the different questions so that I have one value describing the dependent, independent, … variables? If I combine these scales, do the resulting values become continuous? Also, if I create the interaction term with the values obtained by combining the different questions, should I center this interaction term? And finally, which analysis should I use?
So far I've done the following: for each dependent or independent variable, I combined its questions (5-point Likert scale) by calculating the mean of the responses. For the analysis, I used a hierarchical multiple regression. However, I'm not at all sure whether this is the correct way to do it.
Best Answer
You should clarify what you mean by "combined all the Likert scale questions". If you asked 5 Likert scale questions that together are a conventional way to measure, e.g., self-confidence, then those should be combined into one variable (provided the conditions described below are met). If you asked 5 Likert scale questions for self-confidence, 7 others for narcissism, and 12 others for empathy, then obviously don't combine all of those into one variable.
Once that is taken care of, you still need to assess the internal consistency of the scale on your actual response data, for example with Cronbach's alpha or a PCA of the items.
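One common way to check how well the items of a single scale hang together is Cronbach's alpha. The following is a minimal sketch in Python, assuming the responses for one scale are stored as a respondents-by-items array; the function name `cronbach_alpha` and the example data are made up for illustration and are not taken from the question.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an array of shape (n_respondents, n_items)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items in the scale
    item_variances = items.var(axis=0, ddof=1)       # sample variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses: 6 respondents, 4 items, coded 1 to 5.
responses = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [4, 4, 4, 5],
])

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

A frequently quoted rule of thumb treats values of about 0.7 or higher as acceptable, but that threshold is a convention rather than a hard rule.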
Once that is done, most people simply take the arithmetic mean of the responses to obtain a variable that can be treated as continuous. This assumes that every Likert question on the scale is equally important. If you ran a PCA for the previous step, you could instead use the respondents' scores on the first principal component. This gives different weights to different questions according to how well they differentiate respondents in your dataset. This last option is seldom used.
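The two ways of building a composite score can be compared side by side. This is only a sketch under stated assumptions: the data are made up, the PCA is run on standardized items (i.e. on the correlation matrix) via a plain SVD, and the sign of the first component is flipped so that higher scores mean more agreement. None of these choices come from the question itself.

```python
import numpy as np

# Hypothetical responses: 6 respondents, 4 items of one scale, coded 1 to 5.
responses = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [4, 4, 4, 5],
], dtype=float)

# Option 1: unweighted mean of the items (every item counts equally).
mean_score = responses.mean(axis=1)

# Option 2: score on the first principal component of the standardized items.
z = (responses - responses.mean(axis=0)) / responses.std(axis=0, ddof=1)
_, singular_values, vt = np.linalg.svd(z, full_matrices=False)
weights = vt[0]                      # item weights of the first component
if weights.sum() < 0:                # the sign of a component is arbitrary
    weights = -weights
pc1_score = z @ weights

# Share of total variance captured by the first component.
explained = singular_values[0] ** 2 / (singular_values ** 2).sum()

# How similar are the two composites?
corr = np.corrcoef(mean_score, pc1_score)[0, 1]
print("item weights:", np.round(weights, 3))
print(f"variance explained by PC1: {explained:.2f}")
print(f"correlation between mean score and PC1 score: {corr:.3f}")
```

When the items are strongly intercorrelated, the two composites tend to be nearly interchangeable, which is one reason the simple mean is the usual choice.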