The answer to this question is actually straightforward. The power advantage of creating composite samples is gained by reducing the variability of sample estimates and improving the small-sample approximation to the normal. For $X_1, X_2, \ldots, X_n$ having $\sqrt{n} \left( \bar{X} - \mu_x \right) \rightarrow_d \mathcal{N} \left( 0, \sigma^2 \right)$, the vector of $m = n/d$ composites of size $d$ is defined by $U_i = \sum_{j=(i-1)d+1}^{id} X_j / d$, and likewise for the second sample, $\vec{Y}$, with composites $\vec{V}$. We then have that the composites are independent samples with $\sqrt{m} \left( \bar{U} - \mu_x \right) \rightarrow_d \mathcal{N} \left( 0, \sigma^2/d \right)$, and likewise for $\vec{V}$.
So you can just calculate the pooled variance and the reduced effective degrees of freedom based on the smaller number of composite observations, and that's all there is to it.
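The compositing step can be sketched as follows (a minimal sketch with hypothetical sample sizes; the block-averaging helper and the choice of a pooled two-sample t-test on the composites follow the description above):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def composite(x, d):
    """Average non-overlapping blocks of d observations into composites."""
    x = np.asarray(x)
    m = len(x) // d
    return x[:m * d].reshape(m, d).mean(axis=1)

# Hypothetical example: two samples of n = 60, composited in groups of d = 4,
# giving m = 15 composites each, with composite variance about sigma^2 / d.
x = rng.normal(0.0, 2.0, size=60)
y = rng.normal(0.5, 2.0, size=60)
u, v = composite(x, 4), composite(y, 4)

# Pooled-variance t-test on the composites; the effective degrees of
# freedom drop to 2(m - 1) because only the composites are observed.
t, p = stats.ttest_ind(u, v)
```

Note that when $n$ is divisible by $d$, the mean of the composites equals the mean of the raw sample, so nothing is lost in the location estimate; only the count of independent observations changes.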
As far as the notched boxplot goes, the McGill et al [1] reference mentioned in your question contains pretty complete details (not everything I say here is explicitly mentioned there, but nevertheless it's sufficiently detailed to figure it out).
The interval is a robustified but Gaussian-based one.
The paper quotes the following interval for notches (where $M$ is the sample median and $R$ is the sample interquartile range):
$$M\pm 1.7 \times 1.25R/(1.35\sqrt{N})$$
where:
$1.35$ is an asymptotic conversion factor to turn IQRs into estimates of $\sigma$ -- specifically, it's approximately the difference between the 0.75 quantile and the 0.25 quantile of a standard normal; the population quartiles are about 1.35 $\sigma$ apart, so a value of around $R/1.35$ should be a consistent (asymptotically unbiased) estimate of $\sigma$ (more accurately, about 1.349).
$1.25$ comes in because we're dealing with the asymptotic standard error of the median rather than the mean. Specifically, the asymptotic variance of the sample median is $\frac{1}{4nf_0^2}$ where $f_0$ is the density-height at the median. For a normal distribution, $f_0$ is $\frac{1}{\sqrt{2\pi}\sigma}\approx \frac{0.3989}{\sigma}$, so the asymptotic standard error of the sample median is $\frac{1}{2\sqrt{N}f_0}= \sqrt{\pi/2}\sigma/\sqrt{N}\approx 1.253\sigma/\sqrt{N}$.
As StasK mentions here, the smaller $N$ is, the more dubious this becomes (replacing his third reason with one about the reasonableness of assuming a normal distribution in the first place).
Combining the above two, we obtain an asymptotic estimate of the standard error of the median of about $1.25R/(1.35\sqrt{N})$. McGill et al credit this to Kendall and Stuart (I don't recall whether the particular formula occurs there or not, but the components will be).
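As a quick numeric check of the two constants and the resulting standard error (a sketch, not something from the paper; the simulation settings are my own choice):

```python
import numpy as np
from scipy import stats

# 1.35: distance between the 0.25 and 0.75 quantiles of a standard normal
c_iqr = stats.norm.ppf(0.75) - stats.norm.ppf(0.25)   # about 1.349

# 1.25: sqrt(pi/2), the asymptotic SE inflation of the median vs the mean
c_med = np.sqrt(np.pi / 2)                            # about 1.253

# Simulation check: SD of sample medians for N(0,1) samples of size N
# should be close to the asymptotic value 1.253 / sqrt(N)
rng = np.random.default_rng(1)
N = 100
medians = np.median(rng.normal(size=(20000, N)), axis=1)
approx_se = c_med / np.sqrt(N)   # about 0.1253
```

The simulated standard deviation of the 20,000 sample medians lands within a few percent of the asymptotic value, which is the sense in which $1.25R/(1.35\sqrt{N})$ estimates the standard error of the median.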
So all that's left to discuss is the factor of 1.7.
Note that if we were comparing one sample to a fixed value (say a hypothesized median) we'd use 1.96 for a 5% test; consequently, if we had two very different standard errors (one relatively large, one very small), that would be about the factor to use (since if the null were true, the difference would be almost entirely due to variation in the one with larger standard error, and the small one could - approximately - be treated as effectively fixed).
On the other hand, if the two standard errors were the same, 1.96 would be much too large a factor, since both sets of notches come into it -- for the two sets of notches to fail to overlap we are adding one of each. This would make the right factor $1.96/\sqrt{2}\approx 1.386$ asymptotically.
Somewhere in between, we have 1.7 as a rough compromise factor. McGill et al describe it as "empirically selected". It does come quite close to assuming a particular ratio of variances, so my guess (and it's nothing more than that) is that the empirical selection (presumably based on some simulation) was among a set of round-value ratios for the variances (like 1:1, 2:1, 3:1, ...), with the "best compromise" $r$ from the $r:1$ ratio then plugged into $1.96/\sqrt{1+1/r}$ and rounded to two figures. At least it's a plausible way to end up very close to 1.7.
Putting them all (1.35, 1.25 and 1.7) together gives about 1.57. Some sources get 1.58 by computing the 1.35 or the 1.25 (or both) more accurately, but as a compromise between 1.386 and 1.96, that 1.7 is not even accurate to two significant figures (it's just a ballpark compromise value), so the additional precision is pointless (they might as well have just rounded the whole thing to 1.6 and been done with it).
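The arithmetic above is easy to verify directly (the $1.96/\sqrt{1+1/r}$ form is the guessed one from the previous paragraph, not something McGill et al state):

```python
import math

def notch_factor(r):
    """Compromise factor 1.96/sqrt(1 + 1/r) for a variance ratio r:1
    (the guessed form discussed above, not a formula from the paper)."""
    return 1.96 / math.sqrt(1 + 1 / r)

equal_case = notch_factor(1)   # equal SEs: 1.96/sqrt(2), about 1.386
r3_case = notch_factor(3)      # a 3:1 ratio gives about 1.697 -> "1.7"

# Combining the three constants:
combined = 1.7 * 1.25 / 1.35                              # about 1.574
combined_precise = 1.7 * math.sqrt(math.pi / 2) / 1.349   # about 1.58
```

So a 3:1 variance ratio is one round-value choice that reproduces 1.7 almost exactly, consistent with the guess above.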
Note that there's no adjustment for multiple comparisons anywhere here.
There are some distinct analogies with the confidence limits for a difference in the Tukey-Kramer HSD:
$$\bar{y}_{i\bullet}-\bar{y}_{j\bullet} \pm \frac{q_{\alpha;k;N-k}}{\sqrt{2}}\widehat{\sigma}_\varepsilon \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$
But note that:

1. this is a combined interval, not two separate contributions to a difference (so we have a single term in $c\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$ rather than two separate contributions $k\sqrt{\frac{1}{n_i}}$ and $k\sqrt{\frac{1}{n_j}}$), and we assume constant variance (so we're not dealing with the compromise between the $1.96$ case, where the variances might be very different, and the asymptotic $1.96/\sqrt{2}$ case);

2. it's based on means, not medians (so no 1.35);

3. it's based on $q$, which is in turn based on the largest difference in means (so there isn't even a 1.96 part in this one, even one divided by $\sqrt{2}$). By contrast, in comparing multiple box plots there's no consideration of basing the notches on the largest difference in medians; it's all purely pairwise.
So while several of the ideas behind the form of components are somewhat analogous, they're actually quite different in what they're doing.
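For concreteness, the Tukey-Kramer interval above can be sketched numerically. This assumes `scipy.stats.studentized_range` (available in SciPy since 1.7); the group sizes and error SD here are hypothetical:

```python
import numpy as np
from scipy.stats import studentized_range

def tukey_kramer_halfwidth(sigma_hat, n_i, n_j, k, df, alpha=0.05):
    """Half-width of the Tukey-Kramer interval for one pairwise
    difference: (q / sqrt(2)) * sigma_hat * sqrt(1/n_i + 1/n_j)."""
    q = studentized_range.ppf(1 - alpha, k, df)
    return (q / np.sqrt(2)) * sigma_hat * np.sqrt(1 / n_i + 1 / n_j)

# Hypothetical setting: k = 3 groups of 10, pooled error SD 2.0,
# error df = N - k = 27
hw = tukey_kramer_halfwidth(2.0, 10, 10, k=3, df=27)
```

Since $q$ here is the $0.95$ quantile of the studentized range of $k$ means, the factor $q/\sqrt{2}$ plays the role that the $1.7$ plays for notches, but it is calibrated to the largest of all pairwise differences rather than to a single pair.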
[1] McGill, R., Tukey, J. W. and Larsen, W. A. (1978) Variations of box plots. The American Statistician 32, 12–16.
Since you have the sample means and your hypothesis relates to population means, I've assumed you'll definitely want to use the sample means in what follows.
With some distributional assumptions, you can certainly get somewhere.
e.g. if you assume normality, the population interquartile range is about $1.35\sigma$, so if the sample is large enough that the population IQR is estimated with little error, you can estimate $\sigma$ and have an effective z-test.
In this case, if you don't assume equal variances, you get $\tilde{\sigma}_i=\text{IQR}_i/1.35$, then calculate $\tilde{\sigma}_D^2 = \tilde{\sigma}_1^2/n_1+\tilde{\sigma}_2^2/n_2$, take $z^* = \frac{\bar{x}_1-\bar{x}_2}{\tilde{\sigma}_D}$, and look it up in z-tables.
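That calculation needs only the summary statistics; a minimal sketch (the summary values here are hypothetical):

```python
import numpy as np
from scipy.stats import norm

def iqr_z_test(xbar1, iqr1, n1, xbar2, iqr2, n2):
    """Two-sample z-test using IQR-based scale estimates,
    assuming normality and not assuming equal variances."""
    s1, s2 = iqr1 / 1.35, iqr2 / 1.35       # sigma estimates from IQRs
    se = np.sqrt(s1**2 / n1 + s2**2 / n2)   # SE of the mean difference
    z = (xbar1 - xbar2) / se
    p = 2 * norm.sf(abs(z))                 # two-sided p-value
    return z, p

# Hypothetical summary statistics for the two samples
z, p = iqr_z_test(10.3, 4.1, 50, 8.9, 3.8, 45)
```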
[By way of a check, I just did a simulation in which I generated normal samples of size 30 (with equal variance, though I didn't assume it in the calculation), and the test is anticonservative (i.e. the type I error rate is higher than nominal): when you attempt a 5% test, you actually get somewhere in the region of 6.8% (and the approximation will likely be a bit worse if the variances differ). If you can tolerate that, then that's probably fine. Of course you could lower the significance level to compensate for the anticonservatism, but I'd be inclined to bite the bullet and try option 2. Once sample sizes hit 200 or so, though, this works pretty well.]
If either sample size is not large, you can still do something, but the distribution of the statistic will depend on the exact method by which the quartiles were computed as well as the particular sample sizes.
In particular, you could either
a. assume equal variances and use a test statistic akin to an equal-variance t-statistic but with an estimate of $\sigma^2$ based on a weighted average of the squares of the two IQRs; or
b. not make an assumption of equal variance and use a test statistic more akin to a Welch-Satterthwaite type statistic.
In the first case the distribution of the test statistic could be obtained fairly simply by simulation from the assumed distribution. (In the second case things are a bit more complicated because the distribution will depend on the way the spreads differ -- but something could still be done.)
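A sketch of that simulation approach for option (a), assuming normality (the $n_i$-proportional weighting of the squared IQRs and the sample sizes are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)

def iqr_t_stat(x, y):
    """Equal-variance t-analog with sigma^2 estimated from a
    weighted average of the squared IQRs (weights n_i are a choice)."""
    n1, n2 = len(x), len(y)
    q1 = np.subtract(*np.percentile(x, [75, 25]))
    q2 = np.subtract(*np.percentile(y, [75, 25]))
    s2 = (n1 * (q1 / 1.35)**2 + n2 * (q2 / 1.35)**2) / (n1 + n2)
    return (x.mean() - y.mean()) / np.sqrt(s2 * (1 / n1 + 1 / n2))

# Null distribution of the statistic by simulation from the
# assumed (here standard normal) distribution
n1, n2, reps = 15, 15, 20000
stats_null = np.array([iqr_t_stat(rng.normal(size=n1), rng.normal(size=n2))
                       for _ in range(reps)])
crit = np.quantile(np.abs(stats_null), 0.95)  # two-sided 5% critical value
```

The simulated critical value comes out above 1.96, consistent with the anticonservatism noted earlier: the IQR-based scale estimate is noisier than the usual sample standard deviation, so the null distribution has heavier tails than the normal. Note also that the result depends on how `np.percentile` interpolates quartiles, as the text warns.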
If you're not prepared to make some distributional assumption, you can still bound the sample standard deviation and so get upper and lower bounds on the t-statistic; however, the bounds may not be very narrow.
If you hadn't had the sample means, you could use the medians in an analog of the t-test. If you're assuming normality (or even just symmetry and existence of means) then the medians will estimate the respective means; however, since we only need to deal with the difference in means, substantially weaker assumptions will suffice for this to work as a test.
In this case you can get critical values (or indeed, p-values) via simulation quite easily, but the null distribution under a normal assumption is pretty close to t-distributed; a quite decent approximation to the p-value can be obtained from t-tables, but suitable degrees of freedom are substantially lower than you'd have from a t-test (close to half!) -- and the test statistic should be scaled as well, since the variances don't exactly correspond.
This won't have especially good power at the normal, but it will have good robustness to deviations from normality.
As an example, for a statistic of this form:
$t^* = \frac{\tilde{x}_1-\tilde{x}_2}{\sqrt{q_1^2/n+q_2^2/n}}$
where $\tilde{x}_i$ is the median of sample $i$ and $q_i$ is the interquartile range of sample $i$ (analogous to a particular form of two-sample t-test with equal variance and equal $n$), I simulated 40,000 pairs of samples of size 30 and 30.
A Q-Q plot of absolute values of $t^*$ vs absolute values of quantiles of $c\cdot t_{40}$ (for $c=1.064$) is plotted below (grey), and the 45 degree line is drawn in green. The second plot shows detail in the region of typical significance levels (including, but not limited to values between 1% and 10%). The approximation is accurate to about 3 figures over most of that range.
[Similar plots are obtained for a variety of other degrees of freedom in the vicinity (with suitably chosen $c$) for each. Simulations at a variety of sample sizes suggest that t-distribution approximations work well across a wide range of $n$ for the equal-variance equal-sample-size case. I expect approximation via t-distributions will be adequate for the equal-variance unequal-sample-size case, but the simulations and analysis required would take a more substantial amount of time.]
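A sketch of the statistic and the scaled-t approximation described above (the constant $c=1.064$ and the 40 degrees of freedom are taken from the text for the $n_1=n_2=30$ case; the data here are hypothetical):

```python
import numpy as np
from scipy import stats

def median_iqr_t(x, y):
    """t* = (difference of medians) / sqrt(q1^2/n + q2^2/n), equal n."""
    n = len(x)  # assumes len(y) == n
    q1 = np.subtract(*np.percentile(x, [75, 25]))
    q2 = np.subtract(*np.percentile(y, [75, 25]))
    return (np.median(x) - np.median(y)) / np.sqrt(q1**2 / n + q2**2 / n)

def approx_p(tstar, c=1.064, df=40):
    """Approximate two-sided p-value via the scaled t approximation:
    t*/c is referred to a t distribution with the reduced df."""
    return 2 * stats.t.sf(abs(tstar) / c, df)

# Hypothetical data: two normal samples of size 30 with a location shift
rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, 30)
y = rng.normal(1.0, 1.0, 30)
tstar = median_iqr_t(x, y)
p = approx_p(tstar)
```

Both the scaling constant and the appropriate degrees of freedom would need to be re-derived (by simulation, as described) for other sample sizes.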