First, alpha is a quantity defined for a scale as a whole (a set of items).
Ad 1: Strictly speaking, alpha only makes sense for metric items (which is, I believe, what you mean by numerical variables). However, it is often used on (sum) scales of ordinal items too (the general rule of thumb there is a sum of more than 7 items, each with more than 4 levels). I believe this is bad practice, though.
Ad 2: You can use alpha in this case. In general, though, it is rarely a good idea to make a numeric variable ordinal.
Ad 3: I would not use it in this case (mainly because the variables are ordinal).
If data are MCAR, one would like to find an unbiased estimate of alpha. This could be done via multiple imputation or listwise deletion; the latter, however, might lead to a severe loss of data. A third way is something like pairwise deletion, which is implemented via an na.rm option in cronbach.alpha() of the ltm package and in alpha() of the psych package.
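As a minimal sketch (assuming a data matrix dat1 with missing values; the name mirrors the simulation below), the listwise and pairwise approaches look like this:

require("ltm"); require("psych")

cronbach.alpha(na.omit(dat1))               # listwise: drop every row with an NA
cronbach.alpha(dat1, na.rm = TRUE)          # pairwise-style handling in ltm
alpha(as.data.frame(dat1), na.rm = TRUE)    # pairwise-style handling in psych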
At least IMHO, the former estimate of unstandardized alpha (ltm's) is biased under missing data (see below). This is due to the calculation of the total variance $\sigma^2_x$ via var(rowSums(dat, na.rm = TRUE)). If the data are centered around 0, positive and negative values cancel each other out in rowSums; missing data therefore shrink the row sums towards 0, which reduces their spread and leads to an underestimation of $\sigma^2_x$ (and of alpha, in turn). Conversely, if the data are mostly positive (or negative), missing values again pull the affected row sums towards zero, but now these rows fall short of the sums of the complete rows; this inflates the spread of the row sums, resulting in an overestimation of $\sigma^2_x$ (and of alpha, in turn).
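Here is a minimal, self-contained sketch of this mechanism in base R (the numbers are purely illustrative):

set.seed(123)
x <- matrix(rnorm(1000 * 10), nrow = 1000)            # 10 uncorrelated items, mean 0
x_na <- x
x_na[sample(length(x_na), 0.2 * length(x_na))] <- NA  # 20% MCAR

var(rowSums(x))                  # complete data: close to 10 (sum of item variances)
var(rowSums(x_na, na.rm = TRUE)) # row sums shrunk towards 0: close to 8

y_na <- x_na + 10                # same missingness, but items centered at 10
var(rowSums(y_na, na.rm = TRUE)) # complete rows sum to ~100, incomplete rows fall
                                 # short, so the spread is inflated (far above 10)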
require("MASS"); require("ltm"); require("psych")
n <- 10000
it <- 20
V <- matrix(.4, ncol = it, nrow = it)
diag(V) <- 1
dat <- mvrnorm(n, rep(0, it), V) # mean of 0!!!
p <- c(0, .1, .2, .3)
names(p) <- paste("% miss=", p, sep="")
cols <- c("alpha.ltm", "var.tot.ltm", "alpha.psych", "var.tot.psych")
names(cols) <- cols
res <- matrix(nrow = length(p), ncol = length(cols), dimnames = list(names(p), names(cols)))
for(i in 1:length(p)){
m1 <- matrix(rbinom(n * it, 1, p[i]), nrow = n, ncol = it)
dat1 <- dat
dat1[m1 == 1] <- NA
res[i, 1] <- cronbach.alpha(dat1, standardized = FALSE, na.rm = TRUE)$alpha
res[i, 2] <- var(rowSums(dat1, na.rm = TRUE))
res[i, 3] <- alpha(as.data.frame(dat1), na.rm = TRUE)$total[[1]]
res[i, 4] <- sum(cov(dat1, use = "pairwise"))
}
round(res, 2)
## alpha.ltm var.tot.ltm alpha.psych var.tot.psych
## % miss=0 0.93 168.35 0.93 168.35
## % miss=0.1 0.90 138.21 0.93 168.32
## % miss=0.2 0.86 110.34 0.93 167.88
## % miss=0.3 0.81 86.26 0.93 167.41
Now the same simulation, but with data centered far away from 0:

dat <- mvrnorm(n, rep(10, it), V) # this time, mean of 10!!!
for(i in 1:length(p)){
  m1 <- matrix(rbinom(n * it, 1, p[i]), nrow = n, ncol = it)
  dat1 <- dat
  dat1[m1 == 1] <- NA
  res[i, 1] <- cronbach.alpha(dat1, standardized = FALSE, na.rm = TRUE)$alpha
  res[i, 2] <- var(rowSums(dat1, na.rm = TRUE))
  res[i, 3] <- alpha(as.data.frame(dat1), na.rm = TRUE)$total[[1]]
  res[i, 4] <- sum(cov(dat1, use = "pairwise"))
}
round(res, 2)
## alpha.ltm var.tot.ltm alpha.psych var.tot.psych
## % miss=0 0.93 168.31 0.93 168.31
## % miss=0.1 0.99 316.27 0.93 168.60
## % miss=0.2 1.00 430.78 0.93 167.61
## % miss=0.3 1.01 511.30 0.93 167.43
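For reference, here is a minimal sketch of why the psych estimate stays stable; it is the standard formula for unstandardized alpha, computed from the pairwise-complete covariance matrix (using dat1 as left over from the last loop iteration):

C <- cov(dat1, use = "pairwise")            # item covariances from available pairs
k <- ncol(dat1)
k / (k - 1) * (1 - sum(diag(C)) / sum(C))   # reproduces alpha.psych (~0.93)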
Best Answer
You have only weak to very weak (and sometimes negative) correlations between your variables. Your alpha value is negative most likely because the mean of all the inter-item correlations is negative. Maybe you can use a factor analysis to check the factorial structure and the correlations between the extracted factors? But given the data you provide, I think it will not be very helpful, except maybe if you have a theory to guide your interpretation of the results. Do you have a theory or prior results predicting that your variables should correlate positively (i.e., allowing the use of Cronbach's alpha)? If so, then your results are pretty strange...
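To see the link between a negative mean inter-item correlation and a negative alpha, here is a minimal sketch (assuming your items are in a hypothetical data frame dat; the formula is the standardized alpha, which makes the dependence on the mean correlation explicit):

R <- cor(dat, use = "pairwise")        # inter-item correlation matrix
k <- ncol(dat)
rbar <- mean(R[lower.tri(R)])          # mean inter-item correlation
k * rbar / (1 + (k - 1) * rbar)        # standardized alpha: negative whenever rbar < 0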