Two-Sample t-tests – Comparison of Manual and Computer-Calculated Methods


I want to calculate the p-value and confidence interval (CI) both manually and in R.
This is my data set:

ex0112 <- data.frame(
  BP   = c(8, 13, 10, 14, 2, 1, 0, -6, 1, 1, 2, -3, -4, 3),
  Diet = rep(c("FishOil", "RegularOil"), each = 7)
)

Manually:

(from the book The Statistical Sleuth)

1- Compute the sample standard deviations for each group separately. (Let's call them $s_1$ and $s_2$.)

s1 <- sd(ex0112[ex0112$Diet == "FishOil",]$BP)
s2 <- sd(ex0112[ex0112$Diet == "RegularOil",]$BP)

2- Compute the pooled estimate of standard deviation using the formula
$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}$$
where $n_1$ and $n_2$ are the group sizes.

n1 <- nrow(ex0112[ex0112$Diet == "FishOil",])
n2 <- nrow(ex0112[ex0112$Diet == "RegularOil",])
sp <- sqrt(((n1-1)*(s1^2)+(n2-1)*(s2^2))/(n1+n2-2))
3- Compute $\text{SE}(\overline{Y_2} - \overline{Y_1})$ using the formula:
$$\text{SE}(\overline{Y_2} - \overline{Y_1}) = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
se <- sp*sqrt((1/n1)+(1/n2))
4- Compute the 97.5th percentile of the t-distribution with $n_1 + n_2 - 2$ degrees of freedom ($t_{df}(.975)$).
df <- n2+n1-2
t_df <- qt(.975, df)
5- Construct a 95% confidence interval for $\mu_2 - \mu_1$ using the formula in Section 2.3.3:
$$\left( \overline{Y_2} - \overline{Y_1} \right) \pm t_{df}(1-\alpha/2) \, \text{SE}(\overline{Y_2} - \overline{Y_1}).$$
y1 <- mean(ex0112[ex0112$Diet == "FishOil",]$BP)
y2 <- mean(ex0112[ex0112$Diet == "RegularOil",]$BP)
LCI <- (y2-y1) - (t_df * se)
RCI <- (y2-y1) + (t_df * se)
6- Compute the t-statistic for testing equality:
$$\left( \overline{Y_2} - \overline{Y_1} \right) / \text{SE}(\overline{Y_2} - \overline{Y_1}).$$
t <- (y2-y1)/se
7- Find the one-sided p-value (for the alternative hypothesis $\mu_2 - \mu_1 > 0$) by comparing the t-statistic from step 6 to the percentiles of the appropriate t-distribution (reading the percentile from R).
p <- pt(t, df)
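A quick sketch (using the t and df values computed above, hard-coded here so the snippet is self-contained) of what `pt(t, df)` returns: it is the lower-tail probability $P(T \le t)$. With the ordering used above ($\overline{Y_2}$ = RegularOil), t is negative, so `pt(t, df)` is the small tail in the direction of the observed effect, and doubling the smaller tail gives the two-sided p-value that `t.test` reports by default.

```r
# Sketch: tail areas of the t distribution.  Values are hard-coded from
# the manual steps above: t ~ -3.0109 with 12 degrees of freedom.
t_stat <- -3.0109
df     <- 12
p_lower <- pt(t_stat, df)            # lower tail, ~0.00542
p_two   <- 2 * pt(-abs(t_stat), df)  # two-sided,  ~0.01085
c(p_lower, p_two)
```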

Results:

LCI = -13.29674 
RCI = -2.131835
p = 0.00542278

Computer t-test:

t.test(BP ~ Diet, data = ex0112, alternative = "greater",
       var.equal = TRUE, conf.level = 0.95)

Results:


    Two Sample t-test

data:  BP by Diet
t = 3.0109, df = 12, p-value = 0.01085
alternative hypothesis: true difference in means between group FishOil and group RegularOil is not equal to 0
95 percent confidence interval:
  2.131835 13.296737
sample estimates:
   mean in group FishOil mean in group RegularOil 
               6.8571429               -0.8571429 

As you see, the CI endpoints agree in absolute value between the two methods (the signs differ because R computes FishOil $-$ RegularOil, the reverse of my $\overline{Y_2} - \overline{Y_1}$ ordering), and my one-sided p-value (0.00542) is half of R's two-sided p-value (0.01085).
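The sign difference comes from factor-level ordering: R takes the first factor level as the first group, and levels default to alphabetical order, so `t.test` computes FishOil $-$ RegularOil while the manual steps computed RegularOil $-$ FishOil. A sketch (the data frame is rebuilt here so the snippet is self-contained) showing that releveling reproduces the manual signed interval:

```r
# Rebuild the data frame from the question:
ex0112 <- data.frame(
  BP   = c(8, 13, 10, 14, 2, 1, 0, -6, 1, 1, 2, -3, -4, 3),
  Diet = rep(c("FishOil", "RegularOil"), each = 7)
)
# Put RegularOil first so t.test computes RegularOil - FishOil,
# matching the manual y2 - y1 ordering:
ex0112$Diet <- factor(ex0112$Diet, levels = c("RegularOil", "FishOil"))
tt <- t.test(BP ~ Diet, data = ex0112, var.equal = TRUE)
tt$conf.int  # ~ (-13.2967, -2.1318), the manual LCI and RCI
```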

I found this question, which seems to ask the same thing, but they appear to use a different formula to compute t.

Best Answer

I don't see any difficulties or discrepancies in your comparison of 'manual' and 'computer' results.

Here is a comparison of a pooled 2-sample t test and a Welch 2-sample t test for your oil data. You may want to try 'manual' computation of some parts.

x1 = c(8, 13, 10, 14, 2, 1, 0)
x2 = c(-6, 1, 1, 2, -3, -4, 3)

Repeat of pooled test: same as in your Question:

t.test(x1, x2, var.eq=T)  # pooled

        Two Sample t-test

data:  x1 and x2
t = 3.0109, df = 12, p-value = 0.01085
alternative hypothesis: 
 true difference in means is not equal to 0
95 percent confidence interval:
  2.131835 13.296737
sample estimates:
 mean of x  mean of y 
 6.8571429 -0.8571429 

Welch 2-sample t test:

t.test(x1, x2)

        Welch Two Sample t-test

data:  x1 and x2
t = 3.0109, df = 9.7071, p-value = 0.01352
alternative hypothesis: 
 true difference in means is not equal to 0
95 percent confidence interval:
  1.98202 13.44655
sample estimates:
 mean of x  mean of y 
 6.8571429 -0.8571429 

Sample variances (not shown in the output above):

var(x1); var(x2)
[1] 34.14286
[1] 11.80952

Differences are as follows:

  • Welch DF $\le$ Pooled DF because sample variances differ, as shown just above. R typically shows non-integer DF, which are actually used in computation. (Some software programs round DF down to the nearest integer, which may make a trivial difference in P-value.)

  • Welch P-value $\ge$ Pooled P-value. A consequence of the difference in degrees of freedom.

  • t statistics both $3.0109.$ Formulas look different, but reduce to equality for equal sample sizes $n_1=n_2= 7.$

  • Sample means are the same for both tests.

  • The 95% confidence interval $(2.13, 13.30)$ for the pooled test is narrower than the Welch interval $(1.98, 13.45)$.

  • Not shown in R output: The standard error (denominator of t statistic) for the pooled t test is $S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}},$ where $S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}.$ For the Welch t test it is $\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}.$
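The two standard-error formulas in the last bullet can be checked numerically (a sketch, with the variances hard-coded from the `var(x1); var(x2)` output above). Because $n_1 = n_2$, the two formulas give the same value, which is why both t statistics come out as 3.0109:

```r
v1 <- 34.14286; v2 <- 11.80952   # var(x1), var(x2) from above
n1 <- 7; n2 <- 7
sp2 <- ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)  # pooled variance
se_pooled <- sqrt(sp2 * (1/n1 + 1/n2))                  # pooled-test SE
se_welch  <- sqrt(v1/n1 + v2/n2)                        # Welch-test SE
c(se_pooled, se_welch)                   # equal here because n1 == n2
(6.8571429 - (-0.8571429)) / se_pooled   # ~ 3.0109, the common t statistic
```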

Notes:

It is not a good idea to do a formal test for equality of variances and to let the outcome decide whether to use a pooled or Welch test. Instead, the best practice is always to use the Welch test, unless you have solid prior evidence that the population variances should be the same. In R, notice that the default test is Welch; in order to get a pooled test you have to pass var.equal=TRUE (abbreviated var.eq=T above) to the t.test procedure.

The formula for Welch DF is a little messier than for the pooled test.

Finally, I will let you find the formula for the degrees of freedom of the Welch test in your textbook, class notes, or online. It takes into account sample sizes and sample variances. Different references have formulas that may look quite different, but which can be shown to give the same answers.
