Solved – Test for comparing “mean of means”

hypothesis testingmeanpoolingstandard deviation

I have a treatment and a control group. Both groups consist of several teams, each with several individuals. When I want to compare the performance between the groups, I could run a t-test or Wilcoxon ranksum test on the individual values within both groups, and that's fine.

1) However, because those teams are very different in size, the "mean of means" looks more like what I get in parametric regressions later. With "mean of mean" I refer to first taking the mean of individual values within the team, then take the mean of team means within the group. Denote by $t$ team, $i$ individual within the group, then
$$\text{mean of means}=1/T\sum_{t=1}^{T} (1/n_t \sum_{i=1}^{n_{t}} x_{it}).$$ The question is: are there tests that allow me to test the hypothesis that the two "mean of means" are equal?

2) If not, how would you compute the standard deviation of the mean of means? Clearly, just using the SD of the team means wastes information. In wikipedia I came across pooled standard deviations, which seem to be suitable in this context. Would it be valid to construct standard errors based on these pooled SDs and use them for the hypothesis test?

Thank you very much!

Best Answer

That's a nested design - teams are nested under groups - and you probably want to consider 'team' as a random rather than a fixed effect. So look at hierarchical mixed models.

Related Solutions

Solved – Using bootstrap under H0 to perform a test for the difference of two means: replacement within the groups or within the pooled sample

Here is my take on it, based on chapter 16 of Efron's and Tibshirani's An Introduction to the bootstrap (page 220-224). The short of it is that your second bootstrap algorithm was implemented wrongly, but the general idea is correct.

When conducting bootstrap tests, one has to make sure that the re-sampling method generates data that corresponds to the null hypothesis. I'll use the sleep data in R to illustrate this post. Note that I am using the studentized test statistic rather than just the difference of means, which is recommended by the textbook.

The classical t-test, which uses an analytical result to obtain information about the sampling distribution of the t-statistic, yields the following result:

x <- sleep$extra[sleep$group==1]
y <- sleep$extra[sleep$group==2]
t.test(x,y)
t = -1.8608, df = 17.776, p-value = 0.07939

One approach is similar in spirit to the more well-known permutation test: samples are taken across the entire set of observations whilst ignoring the grouping labels. Then the first $n1$ are assigned to the first group and the remaining $n2$ to the second group.

# pooled sample, assumes equal variance
pooled <- c(x,y)
for (i in 1:10000){
  sample.index <- sample(c(1:length(pooled)),replace=TRUE)
  sample.x <- pooled[sample.index][1:length(x)]
  sample.y <- pooled[sample.index][-c(1:length(y))]
  boot.t[i] <- t.test(sample.x,sample.y)$statistic
}
p.pooled <-  (1 + sum(abs(boot.t) >= abs(t.test(x,y)$statistic))) / (10000+1) 
p.pooled
[1] 0.07929207

However, this algorithm is actually testing whether the distribution of x and y are identical. If we are simply interested in whether or not their population means are equal, without making any assumptions about their variance, we should generate data under $H_0$ in a slightly different manner. You were on the right track with your approach, but your translation to $H_0$ is a bit different from the one proposed in the textbook. To generate $H_0$ we need to subtract the first group's mean from the observations in the first group and then add the common or pooled mean $\bar{z}$. For the second group we do the same thing.

$$ \tilde{x}_i = x_i - \bar{x} + \bar{z} $$ $$ \tilde{y}_i = y_i - \bar{y} + \bar{z}$$

This becomes more intuitive when you calculate the means of the new variables $\tilde{x}/\tilde{y}$. By first subtracting their respective group means, the variables become centred around zero. By adding the overall mean $\bar{z}$ we end up with a sample of observations centred around the overall mean. In other words, we transformed the observations so that they have the same mean, which is also the overall mean of both groups together, which is exactly $H_0$.

# sample from H0 separately, no assumption about equal variance
xt <- x - mean(x) + mean(sleep$extra)
yt <- y - mean(y) + mean(sleep$extra)

boot.t <- c(1:10000)
for (i in 1:10000){
  sample.x <- sample(xt,replace=TRUE)
  sample.y <- sample(yt,replace=TRUE)
  boot.t[i] <- t.test(sample.x,sample.y)$statistic
}
p.h0 <-  (1 + sum(abs(boot.t) >= abs(t.test(x,y)$statistic))) / (10000+1) 
p.h0
[1] 0.08049195

This time around we ended up with similar p-values for the three approaches.

Solved – Is this a valid way to calculate an SD of a pooled sample (NOT a pooled SD)

I would describe the quantity you are looking for as the marginal or unconditional variance (not conditioning on the mean of each of the subgroups). As opposed to the conditional or average residual variance (conditioning on the mean of each subgroup). You can derive this theoretically using the law of total variance: $$ \begin{align*} Var(X) & = E(Var(X|group)) + Var(E(X|group)) \\ & = \sum_{g \in group} \left( P(G) Var(X_g) \right) + 1/N_{groups} \sum_{g \in group} \left(E(X_g) - E(X)\right)^2, \end{align*} $$ where $E(X) = \sum_{g \in group} E(X_g) P(G)$ is the marginal mean and $P(G)$ is the probability of belonging to group $G$ (which you might estimate using the observed counts of each group).

Best Answer

Related Solutions

Solved – Using bootstrap under H0 to perform a test for the difference of two means: replacement within the groups or within the pooled sample

Solved – Is this a valid way to calculate an SD of a pooled sample (NOT a pooled SD)

Related Question