From your previous post, I see that you have two groups of 15 subjects, with three observations per subject in each group. Every subject appears in both groups except subject 15, who has no observations in Group 1.
So, basically, you have a paired design. One way to test whether Group 1 and Group 2 differ is a Wilcoxon test. Note that the call below omits paired = TRUE, so it runs the unpaired Wilcoxon rank sum test (as the output header confirms) rather than the paired signed rank test. In R, this can be done using the following code:
df <- structure(list(Group = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Subject = c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L,
6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L, 11L,
11L, 12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L, 14L, 1L, 1L, 1L,
2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L,
7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L,
12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L, 14L, 15L, 15L, 15L),
Value = c(29.89577946, 29.51885854, 29.77429604, 33.20695108,
32.09027292, 31.90909894, 30.88358173, 30.67547731, 30.82494595,
31.70128247, 31.57217504, 31.61359752, 30.51371055, 30.42241945,
30.44913954, 26.90850496, 0, 0, 0, 0, 0, 28.94047335, 29.27188604,
29.78511206, 28.18475423, 27.54266717, 26.99873401, 29.26941344,
28.50457189, 28.78050443, 31.39038527, 31.19237052, 30.74053275,
28.68618888, 28.42109545, 28.58222544, 28.99337177, 29.31797,
28.4541501, 28.18475423, 27.54266717, 26.99873401, 28.07576794,
28.96344894, 28.48358437, 27.02527663, 27.1308483, 26.96091103,
27.04019758, 27.51900858, 28.14559621, 26.83569136, 26.90724462,
26.82675, 0, 0, 0, 27.62449786, 26.82335228, 26.66925534,
0, 25.81254792, 26.61666776, 26.12545858, 0, 0, 0, 0, 0,
28.84580419, 29.11003424, 29.24723895, 28.72919768, 29.70673437,
29.31274377, 30.73133587, 30.44805655, 30.61561583, 27.06896964,
27.04249553, 27.15990629, 31.54738209, 31.51643714, 31.8055509,
31.291867, 31.89146186, 31.65812735)), .Names = c("Group",
"Subject", "Value"), class = "data.frame", row.names = c(NA,
-87L))
df$Value[df$Value == 0] <- NA
df[is.na(df$Value),] ## missing data
table(df$Group, df$Subject) ## check to see if all groups have equal obs
## perform Wilcoxon rank sum test (unpaired, since paired = TRUE is not set)
wilcox.test(formula = Value ~ Group, data = df[df$Subject != 15, ]) ## omit the 15th patient
Wilcoxon rank sum test with continuity correction
data: Value by Group
W = 900, p-value = 0.0006732
alternative hypothesis: true location shift is not equal to 0
Warning message:
In wilcox.test.default(x = c(29.89577946, 29.51885854, 29.77429604, :
cannot compute exact p-value with ties
## we can reject the null hypothesis that both groups are equal
From the R documentation for wilcox.test:
If exact p-values are available, an exact confidence interval is obtained by the algorithm described in Bauer (1972), and the Hodges-Lehmann estimator is employed. Otherwise, the returned confidence interval and point estimate are based on normal approximations. These are continuity-corrected for the interval but not the estimate (as the correction depends on the alternative).
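Because wilcox.test without paired = TRUE runs the unpaired rank sum test and treats the three replicates per subject as independent observations, a paired analysis needs a single value per subject in each group. Here is a minimal sketch of one way to do that, assuming the replicates are averaged within each subject and group (that averaging is my assumption, not something taken from the post). The helper name paired_subject_test is hypothetical, and the data frame is assumed to have the columns Group (coded 1 and 2), Subject and Value, as in df above.

```r
# Sketch: collapse replicates to one mean per subject and group, then run a
# paired Wilcoxon signed rank test. `paired_subject_test` is a hypothetical helper.
paired_subject_test <- function(df) {
  # Mean of the non-missing replicates for each Subject x Group cell
  # (the formula interface of aggregate() drops rows with NA in Value)
  means <- aggregate(Value ~ Subject + Group, data = df, FUN = mean)

  # One row per subject, with columns Value.1 and Value.2 for the two groups
  wide <- reshape(means, idvar = "Subject", timevar = "Group",
                  direction = "wide")

  # Keep only subjects that have a value in both groups
  wide <- wide[complete.cases(wide), ]

  wilcox.test(wide$Value.1, wide$Value.2, paired = TRUE)
}

# Usage with the data frame defined above:
# paired_subject_test(df)
```

Subjects with no usable value in one of the groups (such as subject 15) drop out automatically at the complete.cases step, so there is no need to exclude them by hand.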
Best Answer
Your data are suitable for neither ANOVA/Kruskal-Wallis nor correlation analysis. What you need is a chi-square test of independence to determine whether operating system is related to browser preference at all.
To study specific relations, you could then build something like a hierarchical graph (or tree) in which every operating system (level 1 nodes) is linked to every browser (level 2 nodes). The weight of an edge indicates the strength of the connection between an OS and a browser. If you represent this graph as a matrix of non-negative weights, you can easily normalise the connection strengths to the unit interval [0,1] by dividing every element of the matrix by its largest element.
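As a rough illustration of both steps in R, the sketch below runs chisq.test on an OS-by-browser contingency table and then rescales the same table into edge weights. The counts are made up purely for illustration (they are not your data), and using raw counts as the connection strength is an assumption on my part.

```r
# Hypothetical counts of users by operating system and browser (illustrative only)
counts <- matrix(c(120, 30, 50,
                    40, 90, 20,
                    10, 60, 80),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(OS = c("Windows", "macOS", "Linux"),
                                 Browser = c("Chrome", "Safari", "Firefox")))

# Chi-square test of independence between OS and browser
test <- chisq.test(counts)
print(test)

# Edge weights for the OS-browser graph: divide by the largest element so
# that every weight lies in [0, 1]
weights <- counts / max(counts)
print(round(weights, 2))
```

A small p-value only says that OS and browser are related somewhere in the table; the weights matrix (or test$residuals) is what points to the specific OS-browser pairs driving that relationship.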