Solved – How to choose the null and alternative hypothesis

hypothesis-testing, self-study

I'm practicing hypothesis testing and I have trouble deciding how to set the null and the alternative hypothesis. My main issue is finding a "general rule" that tells me, in any situation, which statement should be the null and which should be the alternative. Can someone help me?

Here is an example:
As an established scholar, you are asked to evaluate whether Customer Relationship Management (CRM) affects the financial performance of firms. The question will be settled by means of a hypothesis test. Two hypotheses will be tested against each other: CRM is related to performance, and CRM is not related to performance.

Thanks.

Best Answer

The rule for properly formulating a hypothesis test is that the alternative (or research) hypothesis is the statement that, if true, should be strongly supported by the evidence the data provide.

The null hypothesis is generally the complement of the alternative hypothesis. Frequently, it is (or contains) the assumption that you are making about how the data are distributed in order to calculate the test statistic.

Here are a few examples to help you understand how these are properly chosen.

  1. Suppose I am an epidemiologist in public health, and I'm investigating whether the incidence of smoking among a certain ethnic group is greater than in the population as a whole, in which case there is a need to target anti-smoking campaigns at this sub-population through greater community outreach and education. From previous studies published in the literature, I find that the incidence among the general population is $p_0$. I can then go about collecting sample data (that's actually the hard part!) to test $$H_0 : p = p_0 \quad \mathrm{vs.} \quad H_a : p > p_0.$$ This is a one-sided binomial proportion test. $H_a$ is the statement that, if it were true, would need to be strongly supported by the data we collected. It is the statement that carries the burden of proof. This is because any conclusion we draw from the test is conditional upon assuming that the null is true: either $H_a$ is accepted, or the test is inconclusive and there is insufficient evidence from the data to suggest $H_a$ is true. The choice of $H_0$ reflects the underlying assumption that there is no difference between the smoking rate of the sub-population and that of the whole.

  2. Now suppose I am a researcher investigating a new drug that I believe to be as effective as an existing standard of treatment, but with fewer side effects and therefore a more desirable safety profile. I would like to demonstrate the equal efficacy by conducting a bioequivalence test. If $\mu_0$ is the mean effect of the existing standard treatment, then my hypotheses might look like this: $$H_0 : |\mu - \mu_0| \ge \Delta \quad \mathrm{vs.} \quad H_a : |\mu - \mu_0| < \Delta,$$ for some choice of margin $\Delta$ that I consider to be clinically significant. For example, a clinician might say that two treatments are sufficiently bioequivalent if there is less than a $\Delta = 10\%$ difference in treatment effect. Note again that $H_a$ is the statement that carries the burden of proof: the data we collect must strongly support it in order for us to accept it; otherwise, it could still be true, but we don't have the evidence to support the claim.

  3. Now suppose I am doing an analysis for a small business owner who sells three products $A$, $B$, $C$. They suspect that customers prefer some of these products over others. Then my hypotheses are $$H_0 : \mu_A = \mu_B = \mu_C \quad \mathrm{vs.} \quad H_a : \exists i \ne j \text{ such that } \mu_i \ne \mu_j,$$ where $\mu_i$ is the mean of the chosen preference measure for product $i$. Really, all that $H_a$ is saying is that at least two of the means are not equal to each other, which would then suggest that some difference in preference exists.
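To make the "burden of proof" idea concrete, here is a minimal sketch of the one-sided binomial test from example 1, using only the Python standard library. The numbers are assumptions chosen purely for illustration (a population rate of $p_0 = 0.2$, a significance level of 0.05, and two made-up samples that both show a 30% smoking rate), and the function names are hypothetical, not part of any library.

```python
from math import comb


def upper_tail_p_value(successes: int, n: int, p0: float) -> float:
    """Exact one-sided p-value: P(X >= successes) when X ~ Binomial(n, p0).

    The probability is computed under H0 (p = p0), which is why the null
    hypothesis has to pin down the distribution of the data.
    """
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )


def decide(successes: int, n: int, p0: float, alpha: float = 0.05) -> str:
    """Return the test conclusion for H0: p = p0 vs. Ha: p > p0."""
    if upper_tail_p_value(successes, n, p0) <= alpha:
        return "reject H0 in favor of Ha"
    # Failing to reject is not evidence for H0; the test is just inconclusive.
    return "inconclusive"


if __name__ == "__main__":
    p0 = 0.2  # assumed population rate, for illustration only
    # Both hypothetical samples have an observed rate of 30%.
    for smokers, n in [(6, 20), (60, 200)]:
        p_value = upper_tail_p_value(smokers, n, p0)
        print(f"n={n}: p-value={p_value:.4f} -> {decide(smokers, n, p0)}")
```

Both samples have the same observed rate, but only the larger one supports $H_a$ strongly enough to reject $H_0$. With the small sample, $H_a$ may well be true, yet the data do not carry the burden of proof.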
