Describing your system for forming a composite
You have weighted a set of items using a standard system (i.e., equal distances between the numbers for categories [0,1,2,3,4,5] and the same scale for each item; item scores are then summed to form a scale score).
I would not call the above system "subjective".
It is probably the most common system for scoring psychology and social science multi-item scales when items use the same response scale.
I imagine that you are contrasting such a scoring system with one based on a factor analysis or a related procedure.
Validity of your composite
You are adopting a standard scoring system, and there are good reasons why this system is so common.
- It's very easy to communicate how the scoring system works, so the system can readily be applied in multiple contexts.
- It is not specific to a given sample (in contrast to weightings derived from a factor analysis; although such weightings could be fixed in one sample and applied in others).
- Many scales are designed so that each item measures the same underlying construct, and therefore the sum over the items is intended to measure that construct.
- Because each item is on the same scale (and the number of scale points is relatively small), the standard deviations of the items tend to be fairly similar, and thus each item's contribution to the scale total tends to be fairly similar.
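To make the last point concrete, here is a minimal Python sketch (the responses are hypothetical; nothing here is from the original answer) showing that same-scale items have similar standard deviations, and hence similar weight in the summed total:

```python
import numpy as np

# Hypothetical responses: 5 respondents x 4 items, all on a 0-5 scale
items = np.array([
    [4, 5, 3, 4],
    [2, 1, 2, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 0],
    [3, 3, 4, 2],
])

# Item standard deviations are fairly similar when items share one scale,
# so each item contributes roughly equally to the summed scale score
print(items.std(axis=0, ddof=1).round(2))
print(items.sum(axis=1))  # the composite (scale) score per respondent
```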
Nonetheless, such a scoring system is predicated on the idea that each item is a good measure of the underlying construct.
The broader issue of validity relates to whether the scale you have created is a valid measure of whatever it is meant to be a measure of.
There are a wide range of procedures that people use to establish the validity of a given scale both in general, and for a specific sample.
- Factor analysis and reliability analysis are two obvious ways to assess the internal structure of a scale or set of scales.
- Correlating the scale with other measures is another strategy.
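As a sketch of the reliability-analysis route, Cronbach's alpha can be computed directly from the item variances and the variance of the item total. This is a generic Python illustration with made-up data, not the original poster's dataset:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the sum score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: 6 respondents x 3 items on a 0-5 scale
responses = np.array([
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [1, 0, 1],
    [3, 3, 2],
    [4, 4, 5],
])
print(round(cronbach_alpha(responses), 2))
```

High alpha indicates the items covary strongly, which supports (but does not prove) treating the sum as a measure of one construct.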
Much more could be said about validity and scale construction (see here for some references).
Whether to convert items to Z-scores
Standardising items first before creating a scale mean or sum is one simple way of ensuring that each item has equivalent "importance" in forming a composite.
The problem of unequal importance arises mainly when combining component variables that are on very different scales (e.g., height in mm, weight in kg).
There are also issues of comparability across studies when adopting a sample-specific z-score approach.
In your case, whether you convert each item to a z-score first before summing items or whether you just sum items, my guess is that the two variables will correlate very highly (perhaps greater than .95, but you can check this).
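You can check that guess with a small simulation. The data-generating assumptions below (one common trait, items on a shared 0-5 scale) are illustrative only, but under them the raw-score sum and the z-score sum correlate almost perfectly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated responses: 200 respondents, 10 items on a 0-5 scale,
# driven by a common trait plus item-level noise (illustrative assumptions)
trait = rng.normal(size=(200, 1))
items = np.clip(np.rint(2.5 + 1.2 * trait
                        + rng.normal(scale=0.8, size=(200, 10))), 0, 5)

raw_sum = items.sum(axis=1)
z_sum = ((items - items.mean(axis=0)) / items.std(axis=0, ddof=1)).sum(axis=1)

# When items share a scale and have similar SDs, the two composites
# are nearly interchangeable
r = np.corrcoef(raw_sum, z_sum)[0, 1]
print(round(r, 3))
```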
In general the simple sum (or mean) of raw scores is preferable from a comparability perspective. It also communicates how the mean relates to the underlying scale (e.g., a mean of 4.2 on a 1 to 5 scale where 5 means very satisfied indicates that the sample is generally satisfied, whereas the sum of z-scores does not).
I started doing analytical work years ago with data from surveys/questionnaires (analyzing in SPSS), so I can completely identify.
Your question is very broad, so here are some general recommendations:
Starting Point:
Quickly learn how to use your software's scripting / coding editor, and do all of your variable setup and transformation work there. This enables:
- Editing and rerunning as you inevitably decide to make changes to your setup
- Easy re-running of your setup if you add more data, and
- Documentation for you and others down the road!
Entering the Data:
- Invest heavily in setting up your dataset. This stage is often very manual and tedious, but critical. Shortcuts here will cause pain down the road.
- Definitely transform your Likert data into ordinal integer scales (e.g., recode the response choices as 1, 2, 3, and so on, up to the N possible scale choices).
- For Likert data scales, pick one conceptual orientation and stick to it! E.g., if one variable is scaled 'strongly dislike' <--> 'strongly like', then a diet adherence scale should run 'did not adhere to diet' <--> 'adhered to diet', not the other way around.
- Be sure to use your software's functionality for adding variable value labels - they'll help your analysis and output tremendously.
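One way to do the integer recoding described above, sketched in Python/pandas rather than SPSS (the column name and response labels are hypothetical):

```python
import pandas as pd

# Hypothetical raw Likert responses recorded as text labels
df = pd.DataFrame({"q1": ["strongly dislike", "like", "neutral", "strongly like"]})

# A fixed label-to-integer mapping yields an ordinal integer scale;
# keep the original column and store the recode as a NEW variable
likert_map = {
    "strongly dislike": 1, "dislike": 2, "neutral": 3,
    "like": 4, "strongly like": 5,
}
df["q1_num"] = df["q1"].map(likert_map)
print(list(df["q1_num"]))
```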
Aggregate values for variables like age into broader bands to simplify your analysis output. You have to use subject-matter expertise and judgment given the study to pick the bands. But also consider resulting sample counts - a band with 1 person in it isn't very useful.
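A banding sketch in Python/pandas (the cut points and labels below are arbitrary examples; pick yours with subject-matter judgment, then check the counts per band):

```python
import pandas as pd

df = pd.DataFrame({"age": [19, 24, 31, 45, 52, 67, 70]})  # hypothetical ages

# Band ages into broader groups; bin edges are a judgment call per study
df["age_band"] = pd.cut(
    df["age"],
    bins=[0, 25, 40, 55, 120],
    labels=["18-25", "26-40", "41-55", "56+"],
)

# Check resulting sample counts - a band with 1 person in it isn't very useful
print(df["age_band"].value_counts())
```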
Be careful of reverse wording. It's a good technique from a survey/response-validity point of view, but it trips up analysis. Recode the values of such questions to match the conceptual orientation of your other Likert scales, per the point above.
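Reverse-coding is a one-line transform; here is a Python/pandas sketch (the variable names are invented for illustration):

```python
import pandas as pd

# Hypothetical reverse-worded item on a 1-5 scale
df = pd.DataFrame({"q7_dislike_diet": [1, 5, 3, 2]})

# Reverse-code into a NEW variable (preserving the original):
# on a 1-5 scale, recoded = (max + min) - original = 6 - original
df["q7_like_diet"] = 6 - df["q7_dislike_diet"]
print(list(df["q7_like_diet"]))
```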
Always preserve your original dataset, and store variable transforms as new variables rather than overwriting the originals.
Analysis:
- Build your analysis plan from a list of hypotheses you're trying to validate/disprove; if you don't have a list of hypotheses a list of questions is almost as good. Another technique is to storyboard a report / presentation you'd like to make with generic points, then start trying to build those pages/points by seeing what the data suggests.
- Non-parametric can mean a few different things, but good starting points are basic statistics like cross-tabs, t-tests, and chi-square tests. Descriptively explore your data first, then explore the relations you find via crosstabs. You can run many tests of association, but read your software's documentation to verify whether a given test is for nominal/categorical data or ordinal scale data.
- Most software will let you apply things like linear regression to ordinal data once it's in that form - you just have to be careful. This could make sense for your Likert scale data - or, better yet, use ordinal regression.
- If doing any form of regression, make sure that any multi-value variable you're using is truly on a scale. If it's a categorical variable without an inherent order, transform its values into dummy (dichotomous) variables and use those instead.
- Always keep sample size top of mind.
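The crosstab/chi-square and dummy-coding steps above can be sketched in Python with pandas and scipy (all data below are hypothetical; note that scipy's `chi2_contingency` applies Yates' continuity correction to 2x2 tables by default):

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical data: group membership vs. a dichotomised satisfaction item
df = pd.DataFrame({
    "group":     ["A", "A", "A", "B", "B", "B", "A", "B"],
    "satisfied": ["yes", "no", "yes", "no", "no", "yes", "yes", "no"],
})

# Crosstab, then a chi-square test of association on the table
table = pd.crosstab(df["group"], df["satisfied"])
chi2, p, dof, expected = chi2_contingency(table)
print(table)
print(f"chi2={chi2:.2f}, p={p:.3f}, dof={dof}")

# A nominal variable with no inherent order, dummy-coded for regression;
# drop_first=True avoids perfect collinearity (the dummy-variable trap)
df["region"] = ["north", "south", "east", "south", "north", "east", "south", "north"]
dummies = pd.get_dummies(df["region"], prefix="region", drop_first=True)
print(dummies.columns.tolist())
```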
Sounds like you're in SPSS - a quick google search turned up this link you might find useful for SPSS and likert scale data: http://www.uni.edu/its/support/article/604#chi
If you add 20 Likert items (not technically correct to do, but often done anyway), then the resulting sum can take so many values that you can probably treat it as continuous and do t-tests, using the Satterthwaite correction for unequal variances if needed. Or, if you want to compare more than two groups at a time, you can do regression or ANOVA.
However, if each group has 4 or 5 people, it is going to be very hard to find statistically significant differences. The effect size would have to be very large.
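For the two-group case, the Satterthwaite-corrected (Welch) t-test is one function call in scipy; the summed scores below are invented for illustration:

```python
from scipy.stats import ttest_ind

# Hypothetical summed scores for two groups (20 items -> a wide score range)
group1 = [41, 48, 36, 52, 44, 39, 47, 43]
group2 = [55, 62, 49, 58, 66, 51, 60, 57]

# equal_var=False requests Welch's t-test, which uses the
# Satterthwaite approximation for the degrees of freedom
t, p = ttest_ind(group1, group2, equal_var=False)
print(f"t={t:.2f}, p={p:.4f}")
```

With tiny groups (4 or 5 people each, as the answer notes), even a clean test like this will rarely reach significance unless the effect is very large.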