Solved – How to compute Cronbach’s alpha with only one measurement per subject

cronbachs-alpha, reliability

Consider the following scenario. I ran an experiment with 200 trials, each with a different stimulus. Each subject did exactly 200 trials. On each trial the subject responds with a single number anywhere from 5 to 50. The correct answer also ranges from 5 to 50. For each subject that did the experiment, I compute a single value, $V$, from the expected answer and the observed answer on each trial. That is all the experiment does: it allows me to find $V$ for a subject.

I was asked to find Cronbach's alpha for this particular experiment. How do I do it? I see the formula on Wikipedia, but the denominator is $K-1$ and in my case $K=1$, so how would I find Cronbach's alpha? Should I find a single Cronbach's alpha value for the experiment as a whole, or does each subject get an alpha value? I couldn't find many good online resources about Cronbach's alpha, so if anyone has suggestions, I would love to see links to places where I can learn this. Thanks!

I will be using R, and I have a CSV file where column 1 is the subject, column 2 is the expected answer, and column 3 is the subject's answer. There are 200 rows.

Best Answer

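The formula for Cronbach's alpha is: $$ \alpha = \frac{K}{K-1} \left( 1 - \frac{\sum_{i=1}^{K} \sigma^2_{Y_i}}{\sigma^2_X} \right) $$ Here, $K$ is the number of different items you administered to each subject, $\sigma^2_{Y_i}$ is the variance of item $i$ across subjects, and $\sigma^2_X$ is the variance of the subjects' total scores. Sometimes the items are different questionnaire items designed to measure the same underlying construct. In your case, it sounds like each item is a separate trial of the experiment, so $K = 200$ rather than 1, and you get a single alpha for the experiment as a whole, not one per subject.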
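To make the formula concrete, here is a minimal base-R sketch that computes alpha by hand. The function name `cronbach_alpha_manual` and the toy scores are made up for illustration; rows are subjects and columns are items.

```r
# Hypothetical helper: Cronbach's alpha from a subjects-by-items matrix.
cronbach_alpha_manual <- function(items) {
  items <- as.matrix(items)
  K <- ncol(items)                    # number of items
  item_vars <- apply(items, 2, var)   # variance of each item across subjects
  total_var <- var(rowSums(items))    # variance of the subjects' total scores
  (K / (K - 1)) * (1 - sum(item_vars) / total_var)
}

# Made-up data: 4 subjects (rows), 3 items (columns).
toy <- matrix(c(1, 2, 3,
                2, 3, 4,
                3, 4, 5,
                4, 5, 7),
              nrow = 4, byrow = TRUE)
alpha_toy <- cronbach_alpha_manual(toy)
print(alpha_toy)
```

Note that with a single column the factor K / (K - 1) divides by zero, so alpha is undefined. That is why a single summary value per subject cannot play the role of the items.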

In order to calculate Cronbach's alpha, you need to put your data in "wide format" (as Michelle mentioned). This means one row per subject, with each of the 200 measurements in its own column/variable. So your columns would be SubjectID, Answer1, Answer2, ..., Answer200.

I'm not sure where your expected answer column comes in. Cronbach's alpha is used to test the consistency of answers with each other, not with some true value, because the true value being measured is latent (i.e., unknown). I suppose you could calculate the correlation between the mean of the answers and the expected answer to see how well people did. Or you could count the number of times they answered exactly the way you expected and call that their score.
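As a rough illustration of those two alternatives, the following base-R sketch uses made-up data. The column names (SubjectID, Trial, Expected, Answer) are assumptions and are not taken from the original file.

```r
# Made-up long-format data: 3 subjects, 4 trials each.
x.long <- data.frame(
  SubjectID = rep(c("s1", "s2", "s3"), each = 4),
  Trial     = rep(1:4, times = 3),
  Expected  = rep(c(10, 20, 30, 40), times = 3),
  Answer    = c(10, 22, 30, 41,
                12, 20, 27, 40,
                 9, 18, 30, 44)
)

# Mean answer per trial across subjects, compared with the expected answer.
mean_answer <- as.numeric(tapply(x.long$Answer, x.long$Trial, mean))
expected    <- as.numeric(tapply(x.long$Expected, x.long$Trial, mean))
r_mean <- cor(mean_answer, expected)

# Number of trials each subject answered exactly as expected.
exact_hits <- tapply(x.long$Answer == x.long$Expected, x.long$SubjectID, sum)
print(r_mean)
print(exact_hits)
```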

As has been suggested in the comments, it doesn't seem to make sense to calculate alpha on the raw scores. If they are only meaningful in relation to the expected answer (i.e., as indicative of how well a subject does on this particular test), you need to use adjusted scores that actually quantify how well a subject did. If an answer of 50 with an expected answer of 25 doesn't mean the same thing as an answer of 50 with an expected answer of 45, then calculating alpha on the raw scores is meaningless.
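One possible adjusted score is the absolute error on each trial. This is only an assumed choice for illustration, not something the question specifies, and the data and column names below are made up. The sketch reshapes the errors to wide format and applies the alpha formula directly.

```r
# Made-up long-format data: 4 subjects, 3 trials each.
x.long <- data.frame(
  SubjectID = rep(c("s1", "s2", "s3", "s4"), each = 3),
  Trial     = rep(1:3, times = 4),
  Expected  = rep(c(25, 45, 10), times = 4),
  Answer    = c(26, 44, 10,
                28, 41, 13,
                31, 39, 16,
                35, 50,  5)
)

# Assumed adjustment: absolute error, so scores are comparable across trials.
x.long$AbsError <- abs(x.long$Answer - x.long$Expected)

# Wide format: one row per subject, one column per trial.
err.wide <- tapply(x.long$AbsError, list(x.long$SubjectID, x.long$Trial), sum)

# Cronbach's alpha computed directly from the formula.
K <- ncol(err.wide)
alpha_err <- (K / (K - 1)) *
  (1 - sum(apply(err.wide, 2, var)) / var(rowSums(err.wide)))
print(alpha_err)
```

Whatever adjustment you choose, alpha is then computed on the adjusted per-trial scores across subjects, not on the single summary value per subject.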

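To calculate Cronbach's alpha using R, read the CSV file into a data frame, number the trials within each subject, reshape to wide format, then run cronbach.alpha on only the answer columns. This assumes the file contains the rows for every subject (alpha cannot be computed from one subject alone), that the subject and value columns are called SubjectID and Score, and that each subject's rows are in the same trial order:

x.long <- read.csv(file = "myfile.csv")
# Number the trials within each subject (assumes rows are in trial order)
x.long$Trial <- ave(seq_along(x.long$SubjectID), x.long$SubjectID, FUN = seq_along)
library(reshape2)
# One row per subject, one column per trial
x.wide <- dcast(x.long, SubjectID ~ Trial, value.var = "Score")
library(ltm)
cronbach.alpha(x.wide[, -1]) # drop SubjectID in the first column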