Calculate the p-value

statistics

I'm taking a statistics course for the first time and I'm wondering how I can find the difference between two independent samples. I read the explanation but still couldn't understand how it works, and I'm stuck at finding the p-value. Here is the data:

$$\bar x_1=5.9$$
$$\bar x_2=4.1$$
$$\sigma_1=4.1$$
$$\sigma_2=3.7$$
$$n_1=42$$
$$n_2=47$$
And I want to test:
$$H_0:\mu_1-\mu_2>0$$
$$H_a:\mu_1-\mu_2<0$$

Let's say at the 0.05 level. So I find the SE:
$$SE^2=\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}$$
This gives SE = 0.83.
So I calculate my t statistic:
$$t=\frac{\bar x_1-\bar x_2}{SE}=2.16458$$

How can I calculate the p-value and what would it mean?

Best Answer

The notation $\sigma_i$ rather than $s_i$ suggests you are treating the standard deviations as known rather than as being estimated based on the data. Furthermore, the way you find your standard error makes sense only if that is what you are doing. That is not realistic but is often done in early exercises. Therefore you should treat your test statistic $Z = (\overline x_1-\overline x_2)/\text{SE}$ as being normally distributed rather than as having a t-distribution.

Suppose your null hypothesis is that the means of the two populations are equal and the alternative hypothesis says only that they are not equal, without saying which direction the inequality goes. Then the p-value is the probability that $|Z|,$ the absolute value of $Z,$ exceeds the value actually observed, given that the null hypothesis is true. So, for example, suppose you have $|Z| = 2.3.$ Then you need $\Pr(|Z|>2.3) = \Pr(Z>2.3) + \Pr(Z<-2.3) = 2\Pr(Z< -2.3).$ Normally you would find that probability by using either a table or a software package.
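A minimal sketch in R of the calculation described above, using the summary figures from the question and assuming, as this answer does, that the standard deviations are known, so the statistic is standard normal under the null hypothesis. The variable names are illustrative. The one-sided tail areas are included only for comparison, since which tail applies depends on how the alternative is stated.

```r
# Summary data from the question (standard deviations treated as known)
xbar1 <- 5.9; xbar2 <- 4.1
sigma1 <- 4.1; sigma2 <- 3.7
n1 <- 42; n2 <- 47

# Standard error of the difference in sample means
se <- sqrt(sigma1^2 / n1 + sigma2^2 / n2)

# Test statistic, standard normal under H0: mu1 = mu2
z <- (xbar1 - xbar2) / se

# Two-sided p-value: Pr(|Z| > |z|) = 2 * Pr(Z < -|z|)
p_two_sided <- 2 * pnorm(-abs(z))

# One-sided tail areas
p_upper <- pnorm(z, lower.tail = FALSE)  # Pr(Z > z), for Ha: mu1 - mu2 > 0
p_lower <- pnorm(z)                      # Pr(Z < z), for Ha: mu1 - mu2 < 0

print(round(c(se = se, z = z, p_two_sided = p_two_sided,
              p_upper = p_upper, p_lower = p_lower), 4))
```

With these numbers the two-sided p-value comes out at roughly 0.03, which is below the 0.05 level, so a two-sided test would reject the hypothesis of equal means.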

Related Solutions

[Math] Getting P-value While Using Variance

For part a), you've set up the wrong null hypothesis. You want to test for evidence that $\sigma_1\gt{1}\to\sigma_1^2\gt{1}$. Thus, your hypotheses should be: $$H_0:\sigma_1^2=1$$ $$H_1:\sigma_1^2\gt1$$ We need $n_1,$ which is 5. We also need the sample variance, of which you have the square root. Thus, $s^2=5.6$ and now we calculate our chi-square statistic, which is $$\chi^2=\frac{(n-1)s^2}{\sigma_0^2}=\frac{(5-1)(5.6)}{1}=22.4$$ We know our $\alpha=.05$, so we need to find $\chi_{\alpha}^2(n-1)=\chi_{.05}^2(4)=9.488.$ Since $22.4\ge9.488$, we reject $H_0.$ As for the p-value, tables give $P(W\ge{13.28})=.01$ for $\alpha=.01$, so our p-value is less than .01. Also, $P(W\ge{14.86})=.005$, so the p-value for this problem is below .005, which is very small indeed.
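As a rough check of the table lookups above, the same quantities can be computed in R. This sketch takes $n=5$, the sample variance 5.6 and $\sigma_0^2=1$ directly from the answer; the variable names are illustrative.

```r
# Values taken from the answer above
n <- 5            # sample size
s2 <- 5.6         # sample variance
sigma0_sq <- 1    # variance under H0
alpha <- 0.05

# Chi-square statistic with n - 1 degrees of freedom
chi_sq <- (n - 1) * s2 / sigma0_sq

# Upper-tail critical value and exact upper-tail p-value
crit <- qchisq(1 - alpha, df = n - 1)
p_value <- pchisq(chi_sq, df = n - 1, lower.tail = FALSE)

print(c(chi_sq = chi_sq, crit = crit, p_value = p_value))
```

The exact p-value is on the order of $10^{-4}$, consistent with the table bound of less than .005.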

For part b), we know that $s_1^2=5.6$ and $s_2^2=\frac{(13-10)^2+(7-10)^2+(9-10)^2+(11-10)^2}{4}=5.$ Also, let $m$ be the number of measurements in the first random sample and let $n$ be the number of measurements in the second random sample. Given the work involved in deriving the confidence interval, I give it to you freely here: $$.95CI\left(\frac{\sigma_1^2}{\sigma_2^2}\right)=\left[\frac{1}{F_{\frac{\alpha}{2}}(m-1)(n-1)}\frac{s_1^2}{s_2^2},F_{\frac{\alpha}{2}}(n-1)(m-1)\frac{s_1^2}{s_2^2}\right]=\left[\frac{1}{F_{.025}(4)(3)}\frac{5.6}{5},F_{.025}(3)(4)\frac{5.6}{5}\right]$$ Since I gave you the freebie, you should try to finish the hypothesis test for the ratio using what you know of the F-statistic and the information provided.
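The two numeric endpoints in the display above can be evaluated with `qf`. This is only a sketch: it assumes that $F_{.025}(a)(b)$ denotes the upper 2.5% point of the F distribution with $a$ and $b$ degrees of freedom, and it takes the variances 5.6 and 5 and the degrees of freedom 4 and 3 exactly as stated. The variable names are illustrative.

```r
# Values taken from the answer above; m and n are the two sample sizes
s1_sq <- 5.6; m <- 5
s2_sq <- 5;   n <- 4
alpha <- 0.05

ratio <- s1_sq / s2_sq

# Upper alpha/2 critical values of the F distribution
f_43 <- qf(1 - alpha / 2, m - 1, n - 1)  # F_{.025}(4, 3)
f_34 <- qf(1 - alpha / 2, n - 1, m - 1)  # F_{.025}(3, 4)

# Endpoints as written in the numeric form of the interval
ci <- c(lower = ratio / f_43, upper = ratio * f_34)
print(ci)
```

Checking whether the printed interval contains 1 is one way to finish the test of equal variances at the 5% level.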

[Math] How to find the Type 2 Error of an F Test for equality of variances

You have a sample $X_1, X_2, \dots, X_{12}$ sampled at random from $\mathsf{Norm}(\mu_x, \sigma_x)$ and an independent sample $Y_1, Y_2, \dots, Y_{12}$ sampled at random from $\mathsf{Norm}(\mu_y, \sigma_y)$.

Let $\psi = \sigma_x^2/\sigma_y^2.$ You wish to test $H_0: \psi=1$ against $H_a: \psi > 1$ at level $\alpha = 0.01.$

Under $H_0: \psi = 1,$ you have the ratio of the sample variances $R = S_x^2/S_y^2 \sim \mathsf{F}(11,11)$ and you will reject $H_0$ if $R > c,$ where the critical value $c$ cuts 1% of the probability from the upper tail of $\mathsf{F}(11,11)$. You can find $c = 4.462$ from printed tables of the F-distribution or using software; the computation in R statistical software is shown below.

c = qf(.99, 11, 11);  c
## 4.462436

Roughly and intuitively, the observed variance ratio $R = S_x^2/S_y^2$ has to be above 4 in order to reject $H_0.$ This will happen rarely if $\psi = \sigma_x^2/\sigma_y^2 = 1.$ But you want to know the probability of rejection if $\sigma_x = 2\sigma_y$ so that $\psi = 4.$ In that case there should be a reasonable chance that the variance ratio exceeds $c$ and you can reject $H_0$. The probability of Type II error is the probability that you do not reject $H_0$ in these circumstances.

In general, $\frac{S_x^2/\sigma_x^2}{S_y^2/\sigma_y^2} = \frac{S_x^2}{S_y^2}/\psi \sim \mathsf{F}(n_x-1,n_y - 1).$ Thus, if $\psi = 4,$ then $R/4 \sim \mathsf{F}(11,11),$ so the probability of rejection is $P(R \ge c) = P(R/4 \ge c/4) = 0.4296$ and the probability of Type II Error is $P(R < c) = P(R/4 < c/4) = 0.5704.$

1 - pf(c/4, 11, 11)
## 0.4296341
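The same computation can be wrapped in a small helper so that the power and the Type II error probability are available for any $\psi$, sample sizes and level. This is a sketch; `f_test_power` and its argument names are made up for illustration and are not part of base R.

```r
# Power of the right-sided F test of H0: psi = 1 against a specific psi > 1.
# Function and argument names are illustrative.
f_test_power <- function(psi, n_x, n_y, alpha) {
  crit <- qf(1 - alpha, n_x - 1, n_y - 1)   # reject H0 when R > crit
  # R / psi ~ F(n_x - 1, n_y - 1), so P(R > crit) = P(F > crit / psi)
  pf(crit / psi, n_x - 1, n_y - 1, lower.tail = FALSE)
}

power <- f_test_power(psi = 4, n_x = 12, n_y = 12, alpha = 0.01)
type2 <- 1 - power   # probability of failing to reject H0 when psi = 4
print(c(power = power, type2 = type2))
```

Setting `psi = 1` returns the significance level itself, which is a quick sanity check on the helper.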

In R, it is easy to make a 'power curve', plotting the probability of rejection against $\psi.$ Notice that the power of rejection increases as $\psi$ increases (that is, as the population variances become more different). In the plot below, red lines emphasize the power against the alternative $H_a: \psi = 4.$

psi = seq(1, 6, by=.01);  c = qf(.99,11,11)
p.rej = 1 - pf(c/psi,11,11)
plot(psi, p.rej, type="l", lwd=2, main="Power Curve for Right-Sided F Test")
  abline(v=1, col="green2"); abline(h=0, col="green2")
  lines(c(1,4,4), c(.43, .43, 0), col="red")

Finally, we simulate the power for $m = 10^6$ pairs of samples, each of size $n = 12$ from populations $\mathsf{Norm}(20, 2)$ and $\mathsf{Norm}(25, 1),$ respectively. [The means are not relevant, and $\psi = 2^2 = 4.$] For each of the $m$ pairs we determine whether $H_0$ is rejected at the 1% level, using $c$ as the critical value. As anticipated from our power computation above, the incorrect $H_0$ was rejected for about 43% of the simulated pairs. (A second run gave essentially the same result.)

set.seed(1234)  # use your own seed (or none) for a different simulation
m = 10^6;  n = 12;  df = 11;  c = qf(.99, df, df)
v1 = replicate(m, var(rnorm(n,20,2)))
v2 = replicate(m, var(rnorm(n,25,1)))
r = v1/v2;  mean(r > c)
## 0.430304

The histogram below shows all but a few of the $m$ simulated values of $R$ (with $\psi = 4$). [The maximum variance ratio was above 100.] The density curve of $\mathsf{F}(11,11)$ is shown; it runs off the top of the graph. The critical value is indicated by a vertical red line.
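A sketch of how a similar histogram might be drawn in base R follows. The smaller number of simulated pairs, the truncation point of 30, the number of bins, the colors and the variable names are all choices made for this illustration rather than details of the original plot.

```r
set.seed(1234)
m <- 20000; n <- 12; dof <- n - 1
crit <- qf(0.99, dof, dof)

# Simulated variance ratios with psi = 4 (population SDs 2 and 1)
r <- replicate(m, var(rnorm(n, 20, 2)) / var(rnorm(n, 25, 1)))

# Histogram of the ratios below 30; the few larger values are left off the plot
hist(r[r < 30], breaks = 60, prob = TRUE, col = "skyblue2",
     xlab = "Variance ratio R", main = "Simulated Variance Ratios, psi = 4")

# Density of F(11, 11), the distribution of R under H0
curve(df(x, dof, dof), from = 0, to = 30, n = 1001, add = TRUE, lwd = 2)

# Critical value of the 1% test
abline(v = crit, col = "red", lwd = 2)

# Proportion of simulated ratios beyond the critical value
mean(r > crit)
```

Because the vertical axis is set by the histogram, the null density is clipped near its peak, as described above.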

Addendum: Minitab 17 has a number of procedures for making power curves, and one of them is for this two-sample test. It looks at the ratio of standard deviations. Here is relevant printout and a power curve from Minitab.

Power and Sample Size 

Test for Two Standard Deviations

Testing (StDev 1 / StDev 2) = 1 (versus >)
Calculating power for (StDev 1 / StDev 2) = ratio
α = 0.01
Method:  F Test


       Sample
Ratio    Size     Power
    2      12  0.429634
