Solved – Need help with normalizing data in SPSS

spss

I apologize in advance: there is a fair amount to read before you get to my question. I could have trimmed a word here and there, but I think the background is needed to illustrate the problem.

Ok, I recently purchased SPSS in support of an upcoming research analysis. I've not used SPSS before and would like to get some guidance/pointers for normalization.

BACKGROUND:

As part of this academic research, I plan to release a survey in the next few weeks.

For my dependent variable (and dependent variable only), I will first provide the survey respondents a 2-column matrix where the first column lists the "name of a process" (across n number of rows) and the second column includes a checkbox.

Let's say there are 10 default rows (i.e., processes), and participants check (in column 2) every process to which they believe their team contributes.

Example:
Survey participant #1 may check rows/processes: 1, 3, 7
Survey participant #2 may check rows/processes: 1, 4
Survey participant #3 may check rows/processes: 5, 7, 8, 9, 10
and so forth…

Using branching logic in the electronic survey, the selected processes (e.g., 1, 3, 7) will be passed into subsequent questions. The questions are as follows:

Question #1 will ask participants to indicate how often a process has changed. A participant will have to select a value, on a scale of 1 through 7, for each selected process.

Questions #2, #3, and #4 will ask something else (note: for the purpose of this question, their actual wording is not important). For the 2nd through 4th questions, survey participants will have to answer either "Yes" or "No" (where "Yes" will be coded as 1 and "No" as 0 in SPSS).

So far so good?

** Break **

Upon completion of the survey (I'm striving for several hundred participants), the data will be loaded into SPSS. Again, all of the questions above pertain to a single dependent variable.

Based on the survey participants' answers, I'd like to create the following:

a) Individual score
b) Overall organizational score

Obviously, multiplication across the columns will not work: as soon as a participant selects "No" in the 2nd, 3rd, and/or 4th question, that process would be scored as zero. I don't want that! So, addition is probably the better choice.
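To make the point about multiplication concrete, here is a minimal sketch in Python (used here only to check the arithmetic, not as a replacement for SPSS). The row of answers is a hypothetical example: a Question #1 frequency of 5 followed by Yes/No answers coded 1, 0, 1.

```python
from math import prod

# Hypothetical answers for one process: Q1 frequency (1-7), then Q2-Q4 coded 1/0.
row = [5, 1, 0, 1]

product_score = prod(row)  # a single "No" (0) wipes out the whole product
sum_score = sum(row)       # addition keeps the contribution of the other answers

print("product:", product_score)
print("sum:", sum_score)
```

A single 0 anywhere in the row forces the product to 0, while the sum still reflects the remaining answers.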

Example of how survey participant #1 may have scored:

1: | 5 | 1 | 0 | 1 | = 7
3: | 2 | 1 | 1 | 0 | = 4
7: | 1 | 0 | 0 | 1 | = 2

Thus, participant #1's score = 13 (7 + 4 + 2)

On the other hand, survey participant #3 may have scored:

5: | 3 | 1 | 0 | 0 | = 4
7: | 4 | 0 | 1 | 0 | = 5
8: | 2 | 0 | 0 | 0 | = 2
9: | 7 | 1 | 1 | 1 | = 10
10: | 3 | 1 | 0 | 1 | = 5

Thus, participant #3's score = 26 (4 + 5 + 2 + 10 + 5)

Now, participant #3's example score of 26 is twice as high as participant #1's example score of 13. This is primarily due to the larger number of processes in which the third participant is involved.

As I'm trying to measure some form of "health" of the organization, a larger total score (derived through a larger number of processes) could be viewed as a contributing factor to the "organization's health". In this example, this would be wrong!

In other words, only larger scores at the individual process level [(1 vs. 3 vs. 7) or (5 vs. 7 vs. 8 vs. 9 vs. 10)] may be indicative of an "unhealthy" process/organization. So, process #9 by itself (with a score of 10) might be a contributor.

So, my question is as follows: In SPSS, how can I normalize the data so that the number of processes a survey participant has selected makes no difference (and does not skew the data)? Should I merely divide by the number of processes? So, 13/3 = 4.333 vs. 26/5 = 5.2
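The division proposed here can be checked with a small sketch. It is written in Python purely to verify the arithmetic outside SPSS; the dictionary layout and the names `scores` and `per_process_mean` are assumptions made for illustration, and the numbers are the per-process totals from the two worked examples.

```python
# Per-process totals from the worked examples (process number -> row total).
scores = {
    "participant_1": {1: 7, 3: 4, 7: 2},
    "participant_3": {5: 4, 7: 5, 8: 2, 9: 10, 10: 5},
}

def per_process_mean(process_totals):
    """Total score divided by the number of processes the participant selected."""
    return sum(process_totals.values()) / len(process_totals)

normalized = {name: per_process_mean(totals) for name, totals in scores.items()}

# The single highest-scoring process per participant, since an individual
# process (not the total) is what may signal a problem.
worst_process = {name: max(totals, key=totals.get) for name, totals in scores.items()}

for name in scores:
    print(f"{name}: mean per process = {normalized[name]:.3f}, "
          f"highest process = #{worst_process[name]}")
```

Dividing by the count puts both participants on the same 0 to 10 per-process scale (one 1 to 7 item plus three 0/1 items), so the number of selected processes no longer inflates the score.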

If so, how can I accomplish that in SPSS?

I hope the above makes sense.

Thank you,
EEH

Best Answer

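Create a new syntax file.

DESCRIPTIVES VARIABLES=question1 question2 question3 /SAVE.

(List all of your question variables in place of question1 question2 question3.) This saves a standardized (z-score) copy of each question as a new variable with a Z prefix, e.g., Zquestion1.

The following computes a single scale that you can use for analysis:

COMPUTE meanscore = MEAN(Zquestion1, Zquestion2, Zquestion3).
EXECUTE.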
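For readers who want to see what these two steps do numerically, here is a minimal sketch in Python (not SPSS). It assumes the saved Z variables are computed as (value - mean) / sample standard deviation, and the response values are invented for illustration only.

```python
from statistics import mean, stdev

# Hypothetical responses: one list per question, one entry per respondent.
responses = {
    "question1": [5, 2, 1, 7, 3],
    "question2": [1, 1, 0, 1, 1],
    "question3": [0, 1, 0, 1, 0],
}

def zscores(values):
    """Standardize a column: subtract its mean, divide by its sample SD (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Step 1: one standardized copy per question, named with a Z prefix.
z = {f"Z{name}": zscores(values) for name, values in responses.items()}

# Step 2: average the standardized values across questions for each respondent.
meanscore = [mean(row) for row in zip(*z.values())]

for i, score in enumerate(meanscore, start=1):
    print(f"respondent {i}: meanscore = {score:.3f}")
```

Because every question is rescaled to mean 0 and standard deviation 1 first, the 1 to 7 item does not dominate the 0/1 items when they are averaged.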
