Solved – How can I compare two groups of data?

Tags: hypothesis testing, method-comparison, multiple-comparisons, r, statistical significance

I asked a question here, "How can I see if my data are coming from two different populations?", but it seems I cannot communicate it correctly, and the person replying to me is confusing me.

So I will try to explain what I want. If you know a suitable method, please just tell me its name and I will try to apply it myself.

I have two groups
Group A with n number of samples
Group B with m number of samples

The measurements have replicates.

I want to see whether Group A is different from Group B,
for example in the mean, or in any other respect that would make them statistically different or not.

Is there someone who can really guide me on what to do?
Many thanks
Nik

Best Answer

From reading your previous post, I see that you have 15 subjects measured in two groups, with 3 replicate observations per subject per group. Every subject appears in both groups, except subject 15, who has no observations in Group 1.

So, basically, you have a paired design. The usual nonparametric test for a paired design is the Wilcoxon signed-rank test. Note that the R code below actually runs the unpaired Wilcoxon rank sum test (see the test name in its output), treating every replicate as a separate observation:

df<- structure(list(Group = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                              1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                              1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                              1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                              2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                              2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
                    Subject = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L,
                                6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L,
                                11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L, 14L,
                                1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L,
                                6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L,
                                11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L, 14L,
                                15L, 15L, 15L),
                    Value = c(29.89577946, 29.51885854, 29.77429604, 33.20695108, 
                              32.09027292, 31.90909894, 30.88358173, 30.67547731, 30.82494595, 
                              31.70128247, 31.57217504, 31.61359752, 30.51371055, 30.42241945, 
                              30.44913954, 26.90850496, 0, 0, 0, 0, 0, 28.94047335, 29.27188604, 
                              29.78511206, 28.18475423, 27.54266717, 26.99873401, 29.26941344, 
                              28.50457189, 28.78050443, 31.39038527, 31.19237052, 30.74053275, 
                              28.68618888, 28.42109545, 28.58222544, 28.99337177, 29.31797, 
                              28.4541501, 28.18475423, 27.54266717, 26.99873401, 28.07576794, 
                              28.96344894, 28.48358437, 27.02527663, 27.1308483, 26.96091103, 
                              27.04019758, 27.51900858, 28.14559621, 26.83569136, 26.90724462, 
                              26.82675, 0, 0, 0, 27.62449786, 26.82335228, 26.66925534, 
                              0, 25.81254792, 26.61666776, 26.12545858, 0, 0, 0, 0, 0, 
                              28.84580419, 29.11003424, 29.24723895, 28.72919768, 29.70673437, 
                              29.31274377, 30.73133587, 30.44805655, 30.61561583, 27.06896964, 
                              27.04249553, 27.15990629, 31.54738209, 31.51643714, 31.8055509, 
                              31.291867, 31.89146186, 31.65812735)),
                .Names = c("Group", "Subject", "Value"),
                class = "data.frame", row.names = c(NA, -87L))



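## zeros are treated as missing measurements, so recode them as NA
df$Value[df$Value == 0] <- NA

df[is.na(df$Value), ] ## list the missing data

table(df$Group, df$Subject) ## check whether every subject has the same number of obs in each group


## perform the Wilcoxon rank sum test (unpaired; each replicate counts as one observation)
wilcox.test(Value ~ Group, data = df[df$Subject != 15, ]) ## omit the 15th subject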

Wilcoxon rank sum test with continuity correction

data:  Value by Group
W = 900, p-value = 0.0006732
alternative hypothesis: true location shift is not equal to 0

Warning message:
In wilcox.test.default(x = c(29.89577946, 29.51885854, 29.77429604,  :
                               cannot compute exact p-value with ties

## p < 0.05, so we can reject the null hypothesis that the two groups have the same distribution
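
The output above comes from the unpaired rank sum test. For the paired design described earlier, one possible approach (a sketch, not part of the original answer) is to average the replicates within each subject and group, keep only subjects measured in both groups, and run the Wilcoxon signed-rank test on those pairs. The helper name `paired_subject_test` and the small `toy` data frame are hypothetical and only there for illustration; the column names `Group`, `Subject` and `Value` follow the data frame used above.

```r
# Hypothetical helper: paired Wilcoxon signed-rank test on per-subject means.
# Assumes columns Group (exactly two levels), Subject and Value.
paired_subject_test <- function(d) {
  # One mean per Subject x Group cell. The formula method of aggregate()
  # drops rows with NA in Value (default na.action = na.omit), so cells
  # whose replicates are all missing disappear.
  means <- aggregate(Value ~ Subject + Group, data = d, FUN = mean)

  # Split by group and keep only subjects that appear in both groups.
  g <- split(means, means$Group)
  stopifnot(length(g) == 2)
  pairs <- merge(g[[1]], g[[2]], by = "Subject", suffixes = c("_a", "_b"))

  # paired = TRUE requests the signed-rank test on the differences.
  test <- wilcox.test(pairs$Value_a, pairs$Value_b, paired = TRUE)
  list(pairs = pairs, test = test)
}

# Made-up illustration data (not the data from the question):
# 4 subjects, 2 groups, 3 replicates; subject 4 is missing in group 1.
toy <- data.frame(
  Group   = rep(1:2, each = 12),
  Subject = rep(rep(1:4, each = 3), times = 2),
  Value   = c(30.1, 29.8, 30.0,  32.2, 31.9, 32.4,  30.9, NA, 30.7,  NA, NA, NA,
              28.1, 28.4, 28.0,  27.0, 27.3, 26.9,  27.5, 27.9, 28.2,  26.8, 26.9, 26.7)
)

res <- paired_subject_test(toy)
res$pairs   # one row per subject with a mean in both groups
res$test    # signed-rank test on the paired means
```

With the data frame from the answer, the call would be `paired_subject_test(df)` after the zeros have been recoded to `NA`. Averaging the replicates first means each subject contributes one pair, so replicates are not counted as independent observations.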

From the R documentation for wilcox.test:

If exact p-values are available, an exact confidence interval is obtained by the algorithm described in Bauer (1972), and the Hodges-Lehmann estimator is employed. Otherwise, the returned confidence interval and point estimate are based on normal approximations. These are continuity-corrected for the interval but not the estimate (as the correction depends on the alternative).
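
The confidence interval and point estimate mentioned in this passage are only computed when `conf.int = TRUE` is passed to `wilcox.test`. A minimal sketch, using two small made-up vectors `x` and `y` (hypothetical values without ties, so the exact method applies):

```r
# Made-up values for illustration only; no ties, small samples.
x <- c(30.1, 32.2, 30.9, 31.6, 30.4)
y <- c(28.1, 27.0, 27.5, 26.8, 29.2)

# conf.int = TRUE asks for the estimate of the location shift
# and a confidence interval in addition to the p-value.
res_ci <- wilcox.test(x, y, conf.int = TRUE, conf.level = 0.95)

res_ci$estimate   # estimated difference in location (x minus y)
res_ci$conf.int   # confidence interval for that difference
```

When the data contain ties, as in the output shown earlier, R falls back to the normal approximation described in the quoted passage.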