[Math] Help Understanding Difference in P-Value & Critical Value Results

hypothesis-testing, statistics

I'd appreciate help in understanding how changing the significance level affects the results of the t-test.

I have conducted an experiment where a group of 15 participants took a test, played a game, and took the original test again. The data set follows:

  • Round 1 (Before Game) Scores: 6, 4, 7, 8, 12, 6, 7, 5, 11, 4, 7, 1, 6, 10, 4

  • Round 2 (After Game) Scores: 2, 3, 7, 11, 11, 9, 7, 12, 5, 15, 11, 11, 7, 4, 7

  • mean test score before game play: 6.53

  • mean test score after game play: 8.13
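The two means can be reproduced from the raw scores. This is only a quick sanity check, sketched in Python rather than R; the variable names are illustrative.

```python
from statistics import mean

# Scores as listed above, in participant order.
round1 = [6, 4, 7, 8, 12, 6, 7, 5, 11, 4, 7, 1, 6, 10, 4]     # before the game
round2 = [2, 3, 7, 11, 11, 9, 7, 12, 5, 15, 11, 11, 7, 4, 7]  # after the game

mean_before = mean(round1)
mean_after = mean(round2)

print(len(round1), len(round2))                       # sample sizes
print(round(mean_before, 2), round(mean_after, 2))    # rounded means
```

Both lists have 15 entries, and the rounded means agree with the 6.53 and 8.13 quoted above.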

Accordingly, I formulated a null hypothesis that game play does not affect test scores and an alternative hypothesis that game play increases scores (see below). Using the data and R, I calculated the t-statistic, critical value, and p-value.

$H_0: \mu_0 = 6.53$ and $H_1: \mu_1 > 6.53$

$\alpha = 0.05, \mu_0 = 6.53, \overline x = 8.13, \sigma = 3.70, n = 15$

$$ t = \frac{8.13 - 6.53}{\frac{3.70}{\sqrt{15}}} = 1.67 $$
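The arithmetic in this formula can be checked with a few lines of Python. This is a sketch that plugs in the rounded summary values from above; the variable names are illustrative.

```python
import math

x_bar = 8.13   # mean score after the game
mu_0 = 6.53    # hypothesized mean (the before-game mean)
s = 3.70       # standard deviation used in the formula above
n = 15         # number of participants

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / standard_error

print(round(standard_error, 3), round(t_stat, 2))
```

The statistic itself does not involve $\alpha$; the significance level only enters when the statistic is compared with a cutoff.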

Critical value = 1.76 and p-value = 0.94

T-value < Critical Value $ \to $ $1.67 < 1.76 \therefore$ accept $H_0$

$p\text{-value} > \alpha$ $\to 0.94 > 0.05 \therefore$ accept $H_0$

But when I recalculate with an $\alpha$ of 0.1, the critical value changes to 1.35, while the p-value stays the same at 0.94. At this point, the accept/reject decision differs depending on which comparison is made. Did I make a mistake in the calculation or am I misunderstanding some other factor(s)? Thanks.

Best Answer

You have a paired design. It is the same $n = 15$ students taking the test both times. Let's call the first score for the $i$th subject $X_i$ and the second score $Y_i.$ You want to do a one-sample z-test of the differences $D_i = Y_i - X_i,$ so that a positive difference means a better score on the second test. Working from the summary values, the mean difference is $\bar D = \bar Y - \bar X = 8.13 - 6.53 = 1.60.$
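As a small illustration of the paired setup, here is a sketch in Python that forms the per-student differences from the scores listed in the question. It takes each difference as second score minus first score, so that a positive mean corresponds to improvement; the variable names are my own.

```python
from statistics import mean

x = [6, 4, 7, 8, 12, 6, 7, 5, 11, 4, 7, 1, 6, 10, 4]     # first test, X_i
y = [2, 3, 7, 11, 11, 9, 7, 12, 5, 15, 11, 11, 7, 4, 7]  # second test, Y_i

# One difference per student: second score minus first score.
d = [yi - xi for xi, yi in zip(x, y)]
d_bar = mean(d)

print(d)
print(round(d_bar, 2), round(mean(y) - mean(x), 2))
```

The mean of the differences equals the difference of the two means, which is why the summary figures alone are enough to obtain $\bar D.$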

The null hypothesis is $H_0: \mu_D = 0$ (no difference after playing the game) and the alternative is $H_a: \mu_D > 0$ (better scores after playing the game).

The test statistic is $$Z = \frac{\bar D - 0}{\sigma/\sqrt{n}} = 1.67.$$ The critical value at the 5% level is the value $c = 1.645$ that cuts 5% from the upper tail of the standard normal curve. Because $Z = 1.67 > c = 1.645,$ you reject the null hypothesis and conclude that the game might have enabled the students to get better scores on the second test. (Or maybe they learned something from taking the first test!)

However, $Z$ exceeds $c$ by only a little, and the evidence is not 'strong'. If you subject the findings to a more stringent standard and test at the 1% level, then the new critical value is $c^\prime = 2.326,$ which cuts 1% from the upper tail of the standard normal distribution. According to this more stringent standard, you do not reject the null hypothesis.
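The critical-value comparison can be sketched in Python with the standard library's `statistics.NormalDist`. The inputs are the summary values used in this answer, with 3.70 treated as the known standard deviation; the variable names are illustrative.

```python
import math
from statistics import NormalDist

d_bar, sigma, n = 1.60, 3.70, 15
z = d_bar / (sigma / math.sqrt(n))   # one-sample z statistic

std_normal = NormalDist()            # mean 0, standard deviation 1
crit = {}
for alpha in (0.05, 0.01):
    # Cutoff that leaves probability alpha in the upper tail.
    crit[alpha] = std_normal.inv_cdf(1 - alpha)
    decision = "reject H0" if z > crit[alpha] else "do not reject H0"
    print(f"alpha={alpha}: c={crit[alpha]:.3f}, Z={z:.2f} -> {decision}")
```

The cutoffs come out near 1.645 and 2.326, so the same statistic leads to rejection at the 5% level but not at the 1% level. Only the cutoff moves with $\alpha$; the statistic does not.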

The P-value is the probability to the right of $Z = 1.67$ under the standard normal curve. That probability is 0.047. With the P-value, we can test at any desired level of significance. In particular, at the 5% level, we reject because $.047 < .05 = 5\%$. However, at the 1% level, we do not reject because $.047 > .01 = 1\%.$
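To make the link between the two decision rules concrete, here is a sketch in Python that computes the upper-tail P-value and checks it against the critical-value rule at both levels. It again assumes the summary values used in this answer; the names are illustrative.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()
z = 1.60 / (3.70 / math.sqrt(15))

# Upper-tail probability: area to the right of z under the standard normal curve.
p_value = 1 - std_normal.cdf(z)
print(f"P-value = {p_value:.3f}")

agree = []
for alpha in (0.05, 0.01):
    reject_by_p = p_value < alpha                      # P-value rule
    reject_by_c = z > std_normal.inv_cdf(1 - alpha)    # critical-value rule
    agree.append(reject_by_p == reject_by_c)
    print(f"alpha={alpha}: P-value rule={reject_by_p}, critical-value rule={reject_by_c}")
```

For an upper-tailed test the two rules always give the same decision, provided the P-value is the area in the upper tail. Using the lower-tail area instead (`std_normal.cdf(z)`, roughly 0.95 here) would make them appear to disagree.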

In case it is useful, I pasted output below (somewhat abridged) from doing this test in Minitab statistical software:

 One-Sample Z 

 Test of mu = 0 vs > 0
 The assumed standard deviation = 3.7


  N   Mean  SE Mean      Z      P
 15  1.600    0.955   1.67  0.047