This is a question that I've always wondered about in statistics, but never had the guts to ask the professor. The professor would say that if the p-value is less than or equal to the level of significance (denoted by alpha), we reject the null hypothesis because the test statistic falls in the rejection region. When I first learned this, I did not understand why we were comparing the p-values to the alpha values. After all, the alpha values seemed to be chosen arbitrarily. What is the reason for comparing p-values to alpha, and where do the alpha values of 0.05 and 0.10 come from? Why does the statement $ p_\text{value} \leq \alpha$ allow you to reject $H_0$?
[Math] In statistics, why do you reject the null hypothesis when the p-value is less than the alpha value (the level of significance)
hypothesis testing, statistical-inference, statistics
Related Solutions
We can eliminate out of hand the obviously incorrect choices A, C, and D, leaving only B and E as possibly correct.
A is incorrect, because we know that just because the point estimate is above the hypothesized proportion, that does not imply that the variability or uncertainty in that estimate is small enough to state with a high degree of confidence that the true proportion exceeds the hypothesized proportion. Another way to think about it is that if you flip a fair coin $10$ times, there's quite a reasonable chance you may get at least $6$ heads (more than $1$ in $3$ chance), despite the coin being perfectly fair. Yet, if you flip the same coin $1000$ times and get $600$ or more heads, this is extremely improbable (odds less than $1$ in $7$ billion), despite both sample proportions being $0.6 = 6/10 = 600/1000$.
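The coin-flip comparison can be checked directly from the binomial distribution. The following is a minimal Python sketch, not part of the original answer; the helper name `upper_tail` is my own choice.

```python
from math import comb

def upper_tail(n: int, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 1/2), summed exactly with integers."""
    favourable = sum(comb(n, j) for j in range(k, n + 1))
    return favourable / 2**n

p_10 = upper_tail(10, 6)        # at least 6 heads in 10 flips (386/1024)
p_1000 = upper_tail(1000, 600)  # at least 600 heads in 1000 flips

print(f"n = 10:   {p_10:.4f}")
print(f"n = 1000: {p_1000:.3e}")
```

Both samples have the same proportion of heads, $0.6$, yet the tail probability shrinks dramatically as $n$ grows, which is why the point estimate alone cannot settle the question.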
C is incorrect because the conclusion is opposite of what is required by the $p$-value; i.e., if $p > \alpha$, you fail to reject the null hypothesis, whereas the answer choice states that you would reject (Yes) when $0.064 > 0.05$.
D is incorrect for the same reason as C: the conclusion is opposite of what is required. If $p < \alpha$, you would reject the null hypothesis, whereas the answer choice states that you would fail to reject (No) when $0.032 < 0.05$.
This leaves only B and E as choices. Which to choose depends on which $p$-value is correct. Our hypothesis is $$H_0 : p = p_0 = 0.03 \quad \text{vs.} \quad H_a : p > p_0 = 0.03.$$ The test statistic is $$Z \mid H_0 = \frac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}} \sim \operatorname{Normal}(0,1).$$ This much you have stated correctly, but the calculation should be $$Z \mid H_0 = \frac{0.04 - 0.03}{\sqrt{0.03 (0.97)/1000}} \approx \frac{0.01}{0.00539444} \approx 1.85376.$$ The $p$-value for this one-sided test is the probability $$\Pr[Z > 1.85376] \approx 0.0318868,$$ which matches the $0.032$ $p$-value in answer choice B.
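The arithmetic above can be reproduced with the Python standard library. This is only a sketch of the calculation as stated in this answer; the variable names are mine, and the upper tail of the standard normal is obtained from `math.erfc`.

```python
from math import erfc, sqrt

p0 = 0.03      # hypothesized proportion under H_0
p_hat = 0.04   # observed sample proportion
n = 1000       # sample size

# One-sample z statistic for a proportion, using the null standard error.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# One-sided p-value Pr[Z > z] for Z ~ Normal(0, 1).
p_value = 0.5 * erfc(z / sqrt(2))

alpha = 0.05
print(f"z = {z:.5f}, p-value = {p_value:.5f}, reject H_0: {p_value <= alpha}")
```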
For your second question, you have insufficient information to compute a $p$-value. What you must do is make a decision based on the information provided. Since the two-sided $90\%$ confidence interval contains the null mean $100$, you know that the two-sided hypothesis test at $\alpha = 1 - 0.9 = 0.10$ must not be able to reject $H_0$ in favor of $H_a$: the uncertainty in your point estimate is too large to conclude with $90\%$ confidence that the true mean differs from $100$, because the CI contains $100$. This also means that the $p$-value cannot be less than $\alpha = 0.10$; otherwise you would reject $H_0$. So the correct answer is D.
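The link between a confidence interval and a two-sided test can be illustrated numerically. The sketch below assumes a $z$-based interval with a known standard error, and every number in it (an estimate of `103.0` with standard error `2.5`) is hypothetical, since the original problem does not provide these values.

```python
from statistics import NormalDist

def ci_and_p_value(mean_hat, se, mu0, alpha):
    """Two-sided z interval at level 1 - alpha and the matching two-sided p-value."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    ci = (mean_hat - z_crit * se, mean_hat + z_crit * se)
    z = (mean_hat - mu0) / se
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return ci, p_value

# Hypothetical estimate and standard error, chosen only for illustration.
ci, p = ci_and_p_value(mean_hat=103.0, se=2.5, mu0=100.0, alpha=0.10)
contains_null = ci[0] <= 100.0 <= ci[1]
print(f"90% CI = ({ci[0]:.2f}, {ci[1]:.2f}), contains 100: {contains_null}")
print(f"p-value = {p:.4f}, p >= alpha: {p >= 0.10}")
```

In this setup the interval contains the null mean exactly when the $p$-value is at least $\alpha$, which is the reasoning used to pick the answer without computing a $p$-value.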
After doing some research on the topic, I managed to clear my doubts.
Here are some notes/examples which helped me a lot:
https://online.stat.psu.edu/stat415/lesson/20/20.1
https://en.wikipedia.org/wiki/Sign_test
For question 1, my answer/method is correct. However, there was no need to carry out calculations. If the null hypothesis is true, we expect half of the values to be less than the median and half to be greater than the median. If $H_1$ were true, we would expect $S^+>n/2$. The observed $S^+$ is below $n/2$, so $P(X\ge S^+)>0.5 > 0.05.$ Graphically, this means that $S^+$ lies before $n/2$ and outside the rejection region.
$\therefore$ the data give no evidence for $H_1$ $\implies$ do not reject $H_0$
For question 2, I still think that my answer/method is correct.
Additional question 1
$H_0$: median = k
$H_1$: median < k
$n=8$ and $S^-=3$.
Again, no calculations are required here. If $H_1$ were true, we would expect more values smaller than the hypothesized median, i.e., $S^->n/2$. Here $S^- = 3 < n/2 = 4$, so $P(X\ge S^-)>0.5 > 0.05.$ Graphically, this means that $S^-$ lies before $n/2$ and outside the rejection region.
Let's try a 'normal' question which requires calculation.
Additional question 1.1
$H_0$: median = k
$H_1$: median < k
$n=8$ and $S^-=5$. Significance level 5%
$X$: number of values, out of 8, less than the hypothesized median
$P(X\ge 5 \mid X \sim B(8,0.5))=0.36328>0.05$
Do not reject $H_0$
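As a check on Additional questions 1 and 1.1, the one-sided sign-test probabilities can be computed exactly. This is a small Python sketch under the same $B(8, 0.5)$ model; the function name `sign_test_upper_p` is my own.

```python
from math import comb

def sign_test_upper_p(n: int, s: int) -> float:
    """P(X >= s) for X ~ B(n, 0.5), the one-sided sign-test p-value."""
    return sum(comb(n, j) for j in range(s, n + 1)) / 2**n

p_q1 = sign_test_upper_p(8, 3)    # Additional question 1:   S^- = 3
p_q11 = sign_test_upper_p(8, 5)   # Additional question 1.1: S^- = 5

print(f"S^- = 3: p = {p_q1:.5f}")   # 219/256, well above 0.5
print(f"S^- = 5: p = {p_q11:.5f}")  # 93/256 = 0.36328
```

Both values exceed $0.05$, so $H_0$ is not rejected in either case; the first also confirms that whenever the statistic is below $n/2$ the probability is above $0.5$.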
Here's a diagram that illustrates this question.
If $H_1$: median $> k$ and $S^+<n/2$, on which tail do I place the rejection region?
Rejection region is on the right. But here also no calculation is required because $S^+$ will be on the left.
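To see where the right-tail rejection region sits, it can be listed explicitly. The sketch below reuses $n = 8$ and a 5% level from the earlier examples as an assumption, since this question does not state them; the function name is hypothetical.

```python
from math import comb

def upper_rejection_region(n: int, alpha: float) -> list[int]:
    """All s with P(X >= s) <= alpha for X ~ B(n, 0.5)."""
    region = []
    tail = 0.0
    for s in range(n, -1, -1):       # walk inward from the right tail
        tail += comb(n, s) / 2**n    # tail now equals P(X >= s)
        if tail > alpha:
            break
        region.append(s)
    return sorted(region)

region = upper_rejection_region(8, 0.05)
print(region)  # [7, 8]
```

Every value in the region is above $n/2 = 4$, so an observed $S^+$ below $n/2$ can never fall inside it.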
Notes to myself:
- Ask yourself what is expected if $H_1$ is true?
- Draw the binomial distribution of $X$, the number of values less/greater than median, to better understand what is happening.
If I made any mistakes, please let me know as I am still learning.
Best Answer
Here's the idea: you have a hypothesis you want to test about a given population. How do you test it? You take data from a random sample, and then you determine how likely it is that a population satisfying that hypothesis, with an assumed distribution, would produce data at least as extreme as what you observed. That probability is the $p$-value. You decide in advance how small this probability must be before you stop believing the hypothesis: that threshold is the significance level $\alpha$, and $1-\alpha$ is the confidence level. With $\alpha = 0.05$ (a $95$% confidence level), you reject when the $p$-value is at most $0.05$. If the hypothesis is true, data this extreme occur at most $5$% of the time, so this rule rejects a true hypothesis at most $5$% of the time. That is why comparing $p$ to $\alpha$ makes sense; values such as $0.05$ and $0.10$ are conventions for how much of that risk you tolerate, not derived quantities. How do you compute the probability? You use a certain assumed distribution of the data, together with any parameters of the population that you may know.
A concrete example: You want to test the claim that the average adult male weight is $170$ lbs. Suppose you know that adult weight is normally distributed, with standard deviation, say, $10$ pounds. You say: I will reject this hypothesis if data as extreme as mine would come from this population with probability less than $5$%. How do you decide how likely the sample data is? You use the fact that the data is normally distributed, with (population) standard deviation $10$, and you assume the mean is $170$. You then compute the $z$-value of your observation, and a table of the standard normal distribution converts it into a probability.
So, say the average of the random sample of adult male weights is $188$ lbs. Do you accept the claim that the population mean is $170$? Well, the decision comes down to: how likely (how probable) is it that a normally distributed variable with mean $170$ and standard deviation $10$ would produce a value at least as far from $170$ as $188$? Since you have the necessary values for the distribution, you can measure this by finding the $z$-value of $188$ in a population $N(170,10)$. (For the mean of $n$ observations, the standard deviation of the sample mean is $10/\sqrt{n}$ rather than $10$.) If the absolute value of this $z$-value exceeds the critical value for your $\alpha$, which is equivalent to the $p$-value being below $\alpha$, then the value you obtained is less likely than you are willing to accept, and you reject. Otherwise, you do not reject.
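The weight example can be worked through numerically. The sketch below is mine, not the answerer's, and rests on stated assumptions: the test is two-sided, the sample size defaults to $n = 1$ because the answer does not give one, and $n = 4$ is a hypothetical alternative. It also simulates data for which $H_0$ is true, to show how often the rule "reject when $p \le \alpha$" rejects by mistake.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def two_sided_z_test(sample_mean, mu0, sigma, n=1):
    """z statistic and two-sided p-value for a mean with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p_value

# n is not stated in the answer: n = 1 and n = 4 are assumptions.
z1, p1 = two_sided_z_test(188, 170, 10)
z4, p4 = two_sided_z_test(188, 170, 10, n=4)
print(f"n = 1: z = {z1:.2f}, p = {p1:.4f}, reject at 0.05: {p1 <= 0.05}")
print(f"n = 4: z = {z4:.2f}, p = {p4:.4f}, reject at 0.05: {p4 <= 0.05}")

def false_rejection_rate(alpha, trials=20_000, seed=0):
    """Fraction of rejections when the data really come from N(170, 10)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        x = rng.gauss(170, 10)                    # H_0 is true by construction
        _, p_val = two_sided_z_test(x, 170, 10)
        if p_val <= alpha:
            rejections += 1
    return rejections / trials

rate = false_rejection_rate(0.05)
print(f"false rejection rate at alpha = 0.05: {rate:.3f}")
```

The simulated rate should land close to $0.05$: when $H_0$ holds, the rule $p \le \alpha$ rejects about a fraction $\alpha$ of the time, so $\alpha$ is the error rate you agree to tolerate rather than a derived constant.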