Which type of test is most appropriate for this comparative-descriptive study? I have two groups of subjects (250 in each). My goal is to determine whether the two groups differ and, if so, how. I'm using an online survey of 50 statements (5 statements about each of 10 possible factors). Subjects rate these statements on a Likert scale from "strongly agree" to "strongly disagree". Should I use a t-test or a logistic regression to determine categorical differences and predictors?
Solved – Logistic regression versus t-test, which is better for this comparative-descriptive study
likert, ordinal-data, regression, t-test
Related Solutions
Adilah,
Attitudes toward online reading can be assessed with components 1, 3, 4, and 5. Attitude cannot be assessed with component 2 because this component represents self-behavior and the response options for component two are about frequency of behaviors, not attitudes.
Before assessing attitudes, I recommend running Cronbach's alpha, a test of internal consistency, on each set of questions representing a component. The outcome will tell you whether responses to each question are adequately related to responses to the other questions within the same component. If one question does not fit well, consider dropping it from the component. Make sure that you reverse score negatively phrased items, if there are any, before running Cronbach's alpha. In SPSS, Cronbach's alpha is found under Scale in the Analyze drop-down list: choose Reliability Analysis.
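For readers working outside SPSS, the reverse-scoring and alpha steps can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure itself: the 1 to 5 scale, the ratings, and the function names `reverse_score` and `cronbach_alpha` are all assumptions made up for the example.

```python
from statistics import variance

def reverse_score(value, low=1, high=5):
    """Flip a rating on an assumed low..high scale (1 <-> 5, 2 <-> 4)."""
    return low + high - value

def cronbach_alpha(rows):
    """rows: one list of item ratings per participant, same item order."""
    k = len(rows[0])                          # number of items in the component
    items = list(zip(*rows))                  # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical ratings: 6 participants x 4 items; item 4 is negatively phrased.
raw = [
    [5, 4, 5, 1],
    [4, 4, 4, 2],
    [2, 3, 2, 4],
    [1, 2, 1, 5],
    [3, 3, 4, 3],
    [4, 5, 4, 1],
]

# Reverse score the negatively phrased item before computing alpha.
scored = [r[:3] + [reverse_score(r[3])] for r in raw]
alpha = cronbach_alpha(scored)
alpha_unreversed = cronbach_alpha(raw)
print(round(alpha, 3), round(alpha_unreversed, 3))
```

Comparing the two printed values shows why the reverse-scoring step matters: leaving the negatively phrased item unreversed drags alpha down sharply.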
Next, aggregate the outcomes within each component. In other words, for each participant, sum the responses to the questions in a component and then divide by the number of questions. For example, if component 3 (anxiety) is composed of items/questions 1, 2, 3, 4, 5, 6, 7, 8, then for each participant add up all 8 scores and divide by 8. This will give you an overall component score for anxiety. Your anxiety score can then be compared across factors like gender and race. Note that although the individual items are ordinal, as you pointed out, once you combine items into an overall score it is generally considered appropriate to use parametric statistics on the overall component scores.
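The aggregation step can be sketched as follows. The data layout, the `COMPONENTS` mapping, and the participant records are hypothetical; only the "add up the 8 scores and divide by 8" logic comes from the description above.

```python
from statistics import mean

# Hypothetical layout: component name -> zero-based positions of its items.
COMPONENTS = {"anxiety": [0, 1, 2, 3, 4, 5, 6, 7]}

def component_score(responses, item_positions):
    """Average one participant's ratings over the items of a component."""
    return mean(responses[i] for i in item_positions)

# Made-up participants, each with 8 already reverse-scored ratings.
participants = [
    {"gender": "F", "responses": [4, 5, 4, 3, 4, 5, 4, 3]},
    {"gender": "M", "responses": [2, 1, 2, 3, 2, 2, 1, 3]},
    {"gender": "F", "responses": [3, 3, 4, 4, 3, 3, 4, 4]},
]

for p in participants:
    p["anxiety"] = component_score(p["responses"], COMPONENTS["anxiety"])

# Compare the overall component score across a factor such as gender.
by_gender = {}
for p in participants:
    by_gender.setdefault(p["gender"], []).append(p["anxiety"])
group_means = {g: mean(scores) for g, scores in by_gender.items()}
print(group_means)
```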
If you want to determine overall attitude you would need to repeat the above process combining all items except #2. Determining weak vs. strong influence of components on overall attitude could be tricky because an attitude in either the left or right direction can be equally strong. I suppose that components with overall scores closest to the mean (between disagree and agree) would qualify as those being the weakest because it suggests neutral attitude.
The trouble with weighting is that your results will be arbitrary. For example, if you organize your responses on a 1-6 scale (1 being strongly disagree and 6 being strongly agree), then you're saying that the "distance" between a 1 and a 2 is the same as the "distance" between a 2 and a 3. (Here I use "distance" to indicate difference or the gap between what one number represents and another number represents.)
What I would suggest, depending on your analysis, is looking into an ordered model of some sort. The ordering indicates that StD < D < SlD < SlA < A < StA, but doesn't specify how large the distance is between any two options. I prefer an "ordered logit model", and that should suffice for your analysis if you have a large enough sample (which it appears that you do). This will also let you see how other factors affect responses on the survey, if you have that sort of information available (e.g. gender, time with the company, department).
In broad strokes, ordered logit is going to be a fancy regression method that works with categorical data that has an order but isn't necessarily equally spaced out. Regression (as you may remember) is a way to measure the association between two variables by saying if one variable changes by $X$ amount, we expect the other variable to change by $Y$ amount. (I know you haven't taken a stats class recently, so hopefully this elucidates some of the ideas. There should be information online about how to conduct this sort of analysis.)
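To make the "ordered but not equally spaced" idea concrete, here is a small sketch of how an ordered logit turns cut points into response probabilities. It uses the common cumulative form $P(Y \le j) = \operatorname{logistic}(\theta_j - \eta)$; the threshold values and the group effect below are invented for illustration, not estimated from any data, and the code does not fit a model.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(thresholds, eta):
    """P(Y = j) under an ordered logit with P(Y <= j) = logistic(theta_j - eta)."""
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)   # difference of adjacent cumulative probabilities
        previous = c
    return probs

LABELS = ["StD", "D", "SlD", "SlA", "A", "StA"]
# Hypothetical cut points; note the gaps between them are not equal.
thresholds = [-2.5, -1.0, -0.2, 0.9, 2.4]
beta_group = 0.8   # hypothetical log-odds shift toward agreement for group B

probs_a = category_probabilities(thresholds, 0.0)
probs_b = category_probabilities(thresholds, beta_group)
for label, pa, pb in zip(LABELS, probs_a, probs_b):
    print(f"{label:>3}  group A: {pa:.3f}  group B: {pb:.3f}")
```

A positive effect moves probability mass from the "disagree" end toward the "agree" end without the model ever assuming that adjacent options are the same distance apart.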
Best Answer
If you have a rating scale (what you're incorrectly calling a Likert scale) with a lot of levels, say about 10, then it's sort of OK to treat the values of the scale as interval. Small scales should not be treated this way. Nevertheless, it sounds like you're actually generating a Likert scale, which is the aggregate, usually sum, of the collections of ratings that you've gathered, and not any individual rating (your question confuses that issue). In that case it's usually fine to just treat it as interval data.
You could use a non-parametric test or a parametric test. It comes down to whether you think some kind of model can be fit to the data so that you can estimate its parameters. Does the central limit theorem suggest that the group means are approximately normally distributed, and are the observed distributions in line with that? If so, a t-test is probably fine. If not, then perhaps a simple non-parametric test of some kind. In your case I'd probably prefer to bootstrap a confidence interval of the effect.
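One way to read the bootstrap suggestion is a percentile interval for the difference in group means. The sketch below assumes that reading; the function name, the number of resamples, and the simulated scale scores are all made up for illustration, and other bootstrap variants would be equally reasonable.

```python
import random
from statistics import fmean

def bootstrap_mean_diff_ci(a, b, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group independently, with replacement.
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(fmean(resample_a) - fmean(resample_b))
    diffs.sort()
    tail = (1 - level) / 2
    lo = diffs[int(tail * n_boot)]
    hi = diffs[int((1 - tail) * n_boot) - 1]
    return lo, hi

# Simulated scale scores for two groups of 250 (hypothetical, not real data).
data_rng = random.Random(42)
group_a = [data_rng.gauss(3.4, 0.6) for _ in range(250)]
group_b = [data_rng.gauss(3.1, 0.6) for _ in range(250)]

observed = fmean(group_a) - fmean(group_b)
ci_low, ci_high = bootstrap_mean_diff_ci(group_a, group_b)
print(f"difference = {observed:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

If the interval excludes zero, that is evidence the groups differ on the scale, and its width gives the "how much" that a bare p-value does not.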
Regarding logistic regression, perhaps you were considering it because, while the full scale can be treated as interval, ordinal logistic regression could be used on an individual rating regardless of whether there are a large or small number of levels, and it's designed specifically for that kind of thing. There's a fairly nice ordinal package in R. You'd want the clm command, or maybe the clmm command. A more specific question could help here.