Assuming equal $n$s [but see note 2 below] for each treatment in a one-way layout, and that the pooled SD from all the groups is used in the $t$ tests (as is done in usual post hoc comparisons), the maximum possible $p$ value for a $t$ test is $2\Phi(-\sqrt{2}) \approx .1573$ (here, $\Phi$ denotes the $N(0,1)$ cdf). Thus, no $p_t$ can be as high as $0.5$. Interestingly (and rather bizarrely), the $.1573$ bound holds not just for $p_F=.05$, but for any significance level we require for $F$.
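As a quick numerical check of the bound, $2\Phi(-\sqrt{2})$ can be evaluated with Python's standard library. The choice of Python and the name `p_bound` are mine, purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Two-sided standard normal tail probability at |z| = sqrt(2).
p_bound = 2 * NormalDist().cdf(-sqrt(2))
print(round(p_bound, 4))  # prints 0.1573
```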
The justification is as follows: For a given range of sample means, $\max_{i,j}|\bar y_i - \bar y_j| = 2a$, the largest possible $F$ statistic is achieved when half the $\bar y_i$ are at one extreme and the other half are at the other. This represents the case where $F$ looks the most significant given that two means differ by at most $2a$.
So, without loss of generality, suppose that $\bar y_{\cdot}=0$, so that $\bar y_i=\pm a$ in this boundary case. Again without loss of generality, suppose that $MS_E=1$, since we can always rescale the data to this value. Now, for $k$ means (where $k$ is even for simplicity [but see note 1 below]), we have $F=\frac{\sum_i n\bar y_i^2/(k-1)}{MS_E}= \frac{kna^2}{k-1}$. Setting $p_F=\alpha$, so that $F=F_\alpha=F_{\alpha,k-1,k(n-1)}$, we obtain $a =\sqrt{\frac{(k-1)F_\alpha}{kn}}$. When all the $\bar y_i$ are $\pm a$ (and still $MS_E=1$), each nonzero $t$ statistic is thus $t=\frac{2a}{\sqrt{MS_E}\sqrt{2/n}} = \frac{2a}{\sqrt{2/n}} = \sqrt{\frac{2(k-1)F_\alpha}{k}}$. This is the smallest maximum $t$ value possible when $F=F_\alpha$.
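The identity $t^2 = 2(k-1)F/k$ in the boundary configuration can be checked directly from the group means. The following is a minimal sketch, assuming even $k$, equal group sizes, and a known pooled $MS_E$; the function name `boundary_stats` and the example values of $k$, $n$, and $a$ are hypothetical.

```python
from math import sqrt, isclose

def boundary_stats(k, n, a, mse=1.0):
    """F and the largest pairwise t when half of k group means sit at +a and half at -a.

    Assumes k is even, every group has size n, and the pooled error
    mean square is `mse`.
    """
    means = [a] * (k // 2) + [-a] * (k // 2)
    grand = sum(means) / k
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    f_stat = ms_between / mse
    # Pooled-SD t statistic for the two most separated means.
    t_max = (max(means) - min(means)) / sqrt(mse * 2 / n)
    return f_stat, t_max

f_stat, t_max = boundary_stats(k=6, n=5, a=0.7)
# The relation t^2 = 2 (k - 1) F / k from the derivation.
assert isclose(t_max ** 2, 2 * (6 - 1) * f_stat / 6)
```

Because the identity is purely algebraic, it holds for any $a$ and $n$; fixing $F=F_\alpha$ only selects a particular $a$.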
So you can just try different values of $k$ and $n$, compute $t$, and find its associated $p_t$. But notice that for given $k$, $F_\alpha$ is decreasing in $n$ [but see note 3 below]; moreover, as $n\rightarrow\infty$, $(k-1)F_{\alpha,k-1,k(n-1)} \rightarrow \chi^2_{\alpha,k-1}$; so $t \ge t_{min} =\sqrt{2\chi^2_{\alpha,k-1}/k}$. Note that if $X\sim\chi^2_{k-1}$, then $X/k=\frac{k-1}k \cdot X/(k-1)$ has mean $\frac{k-1}k$ and SD $\frac{k-1}k\cdot\sqrt{\frac2{k-1}}$. As $k\rightarrow\infty$, the mean tends to 1 and the SD to 0, so every fixed quantile of $X/k$ tends to 1. Hence $\lim_{k\rightarrow\infty}t_{min} = \sqrt{2}$, regardless of $\alpha$, and the result I stated in the first paragraph above is obtained from asymptotic normality.
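The limiting argument can be illustrated with a rough sketch that replaces the chi-square quantile by its normal approximation, $\chi^2_{\alpha,k-1}\approx (k-1) + z_\alpha\sqrt{2(k-1)}$, built from the mean and SD just given. This approximation is my assumption for illustration and is only reasonable for large $k$; the function name `t_min_normal_approx` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def t_min_normal_approx(k, alpha=0.05):
    """Approximate t_min = sqrt(2 * chi2_{alpha, k-1} / k) for large k.

    Uses the normal approximation chi2_{alpha, k-1} ~ (k-1) + z_alpha * sqrt(2 (k-1)),
    so it is not exact and degrades for small k.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    chi2_q = (k - 1) + z * sqrt(2 * (k - 1))
    return sqrt(2 * chi2_q / k)

for k in (100, 10_000, 1_000_000):
    t = t_min_normal_approx(k)
    p = 2 * NormalDist().cdf(-t)  # two-sided p value in the n -> infinity limit
    print(k, round(t, 3), round(p, 4))
```

The $z_\alpha$ term is of order $1/\sqrt{k}$ relative to the leading term, which is why the approach to $\sqrt{2}$ is slow and why $\alpha$ drops out in the limit.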
It takes a long time to reach that limit, though. Here are the results (computed using R) for various values of $k$, using $\alpha=.05$. (By $t_{min}$ I really mean $\min(\max|t|)$, and by max $p_t$ I mean $\max(\min p_t)$.)

| $k$ | $t_{min}$ | max $p_t$ | Note |
|---|---|---|---|
| 2 | 1.960 | .0500 | |
| 4 | 1.977 | .0481 | note $< .05$! |
| 10 | 1.840 | .0658 | |
| 100 | 1.570 | .1164 | |
| 1000 | 1.465 | .1428 | |
| 10000 | 1.431 | .1526 | |

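The tabulated values were computed in R; as a cross-check, here is a sketch in Python using only the standard library. Since the standard library has no chi-square quantile function, the helpers `chi2_cdf` and `chi2_upper_quantile` below are hand-rolled assumptions (a series for the regularized incomplete gamma function plus bisection), not the method used for the original table.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

def chi2_cdf(x, df):
    """Chi-square CDF via the power series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2, x / 2
    term = total = 1.0
    n = 0
    # Sum z^n / ((a+1)(a+2)...(a+n)) until terms are negligible.
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return exp(a * log(z) - z - lgamma(a + 1)) * total

def chi2_upper_quantile(alpha, df):
    """Value exceeded with probability alpha, found by bisection on the CDF."""
    lo, hi = 0.0, df + 10 * sqrt(2 * df) + 10
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_min(k, alpha=0.05):
    """Limiting (n -> infinity) value sqrt(2 * chi2_{alpha, k-1} / k)."""
    return sqrt(2 * chi2_upper_quantile(alpha, k - 1) / k)

for k in (2, 4, 10, 100, 1000, 10000):
    t = t_min(k)
    # In the n -> infinity limit the t reference distribution is standard normal.
    print(k, round(t, 3), round(2 * NormalDist().cdf(-t), 4))
```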
A few loose ends...
1. When $k$ is odd: The maximum $F$ statistic still occurs when the $\bar y_i$ are all $\pm a$; however, there will be one more mean at one end of the range than at the other, making the grand mean $\pm a/k$, and you can show that the factor $k$ in the $F$ statistic is replaced by $k-\frac 1k$. The same replacement occurs in the denominator of the expression for $t$, making $t$ slightly larger and hence decreasing $p_t$.
2. Unequal $n$s: The maximum $F$ is still achieved with the $\bar y_i = \pm a$, with the signs arranged to balance the sample sizes as nearly equally as possible. Then the $F$ statistic for the same total sample size $N = \sum n_i$ will be the same as or smaller than it is for balanced data. Moreover, the maximum $t$ statistic will be larger because it will be the one with the largest $n_i$. So we can't obtain larger $p_t$ values by looking at unbalanced cases.
3. A slight correction: I was so focused on trying to find the minimum $t$ that I overlooked the fact that we are trying to maximize $p_t$, and it is less obvious that a larger $t$ with fewer df won't be less significant than a smaller one with more df. However, I verified that this is the case by computing the values for $n=2,3,4,\ldots$ until the df are high enough to make little difference. For the case $\alpha=.05$, $k\ge 3$, I did not see any cases where the $p_t$ values did not increase with $n$. Note that $df=k(n-1)$, so the possible df are $k,2k,3k,\ldots$, which get large fast when $k$ is large. So I'm still on safe ground with the claim above. I also tested $\alpha=.25$, and the only case I observed where the $.1573$ threshold was exceeded was $k=3,n=2$.
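The odd-$k$ claim in the loose ends (grand mean $a/k$ and the factor $k-\frac1k$) can be checked numerically. This is a small sketch under the stated configuration; the function name `between_ss_odd` and the example values are hypothetical.

```python
from math import isclose

def between_ss_odd(k, a):
    """Grand mean and sum of squared deviations of k group means (k odd),
    with (k+1)/2 means at +a and (k-1)/2 means at -a."""
    means = [a] * ((k + 1) // 2) + [-a] * ((k - 1) // 2)
    grand = sum(means) / k
    return grand, sum((m - grand) ** 2 for m in means)

grand, ss = between_ss_odd(k=7, a=1.5)
assert isclose(grand, 1.5 / 7)               # grand mean is a / k
assert isclose(ss, 1.5 ** 2 * (7 - 1 / 7))   # a^2 (k - 1/k) instead of a^2 k
```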
Best Answer
Penalized maximum likelihood estimation is a good approach: it makes a point estimate that is taken out of context less likely to be overstated. For example, if one selects the group whose mean is farthest from the others, the estimate for that group will be substantially biased, and penalization reduces this bias. James-Stein estimators also work, and the best of all approaches is a Bayesian hierarchical model because that is one of the few methods that allows full statistical inference to be carried out in the presence of shrinkage.