Hypothesis testing for reaction time activity

hypothesis-testing, statistics

I need help with completing a hypothesis test with the following data:

Table 1: Experiment 1 – Reaction Time Activity
\begin{array}{|l|c|c|c|}
\hline
 & \text{Completion time} & \text{Completion time} & \text{Difference} \\
 & \text{(Dominant hand)} & \text{(Non-dominant hand)} & \text{(Dominant $-$ Non-dominant)} \\ \hline
\text{Number of observations} & 1491 & 1491 & 1491 \\ \hline
\text{Sample mean} & 0.4259 & 0.482 & -0.056 \\ \hline
\text{Sample standard deviation} & 0.153 & 0.191 & 0.140 \\ \hline
\end{array}

"Perform a hypothesis test to assess whether there is a difference in completion time, on average, when undergraduate students use their dominant vs. non-dominant hand, using a $5\%$ significance level. Be sure to include:

  • Definition of the parameter of interest in context
  • Your hypotheses and results of your test. You can include either a screenshot from R or a table of important values (test statistic, distribution, and $p$-value) calculated by hand.
  • An evaluation of the $p$-value and conclusion in context
  • Define alpha and power in context
  • Explain two ways in which you could increase the power of this test

My work so far:

We need to conduct a two-sided hypothesis test to assess whether there is a difference in completion time, on average, when undergraduate students use their dominant vs. non-dominant hand.

$H_0$: There is no difference in average completion time on the reaction time assessment between a student's dominant and non-dominant hand in the population of undergraduates: $\mu_1 - \mu_2 = 0$.

$H_a$: There is a difference in average completion time on the reaction time assessment between a student's dominant and non-dominant hand in the population of undergraduates: $\mu_1 - \mu_2 \neq 0$.

I am unsure of what to do next. Do I conduct a $t$-test? (I would be using RStudio, so using the t.test() function.)

To increase the power of this test, would I just decrease the significance level?

Thank you!

Best Answer

The choice of test statistic depends on the sampling methodology. In other words, how is the data collected? What does the data mean?

To illustrate, we could sample the data like this: we recruit a random sample of $n_1$ students and each student does one reaction time assessment with their dominant hand. Then we recruit a random sample of $n_2$ students and each of these does the reaction time assessment with their non-dominant hand. This would be a two-sample independent $t$-test.
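For example (a quick sketch, with hypothetical vector names: dominant_times holding the $n_1$ dominant-hand times and nondominant_times holding the $n_2$ non-dominant-hand times from the two separate samples):

# Welch two-sample t-test on two independent samples (R's default)
t.test(dominant_times, nondominant_times, alternative = "two.sided")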

Or we could recruit $n$ students, and each student does the reaction time assessment twice, once with the non-dominant hand, and once with the dominant hand. This would be a paired $t$-test.

Or we could ask each student to perform the test $2m$ times: $m$ times with the dominant hand and $m$ times with the non-dominant hand. Now we have a repeated-measures paired $t$-test. You could even design it as a crossover study.

Without specifying how the data is collected, it is not clear how the test should be performed.


Since you specified that the data are collected in a paired manner, the appropriate R syntax is

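# x = dominant-hand times, y = non-dominant-hand times, kept in the same
# student order so that x[i] and y[i] come from the same student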
t.test(x, y, paired = TRUE, alternative = "two.sided")

Note that the alpha level is the maximum probability of a Type I error you are willing to allow; in other words, the probability that you reject the null hypothesis, given that there is truly no difference in reaction times, is at most $\alpha$.

The power is the probability that you do correctly reject the null, given that there is a difference in reaction times.

To increase the power of the test, you would increase the sample size. Decreasing $\alpha$ actually reduces the power of the test: by lowering your tolerance for incorrectly rejecting the null, you raise the standard of evidence the data must meet before you reject. That also makes it harder to reject when a difference does in fact exist, which is exactly what a loss of power means.

Instead, what you can do is keep the $\alpha$ the same, and increase the sample size. This will increase the power because with a larger sample size of students, the variance of your estimate of mean reaction times will decrease, thus giving you more precision and ability to discriminate whether the mean reaction times are in fact truly different. Or, you could relax your criterion for rejecting the null hypothesis by increasing $\alpha$, but this also has the unfortunate effect of increasing the probability of concluding that the reaction times are different when they might not be.
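If you want to put numbers on this, base R's power.t.test() can approximate the power of a paired $t$-test from summary values. As a sketch, I treat the observed mean difference and its standard deviation from your table as if they were the true values (an assumption), and compare the full sample against a much smaller one:

# power at the observed effect size with the full sample of pairs
power.t.test(n = 1491, delta = 0.056, sd = 0.140, sig.level = 0.05,
             type = "paired", alternative = "two.sided")

# same effect size with only 30 pairs: the power drops noticeably
power.t.test(n = 30, delta = 0.056, sd = 0.140, sig.level = 0.05,
             type = "paired", alternative = "two.sided")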


Second edit. Now that I can see the table in which the data are presented, you cannot use t.test at all, because it requires the individual observations, whereas what you have are the relevant summary statistics. So you must perform the test manually, using the formulas for a paired test.

Since the sample size is very large, in the thousands, I would advise performing the paired test as a $z$-test rather than a $t$-test; the difference is negligible. The test statistic is simply $$z = \frac{\Delta}{s/\sqrt{n}},$$ where $\Delta$ is the sample mean of the paired differences and $s$ is the sample standard deviation of the paired differences. This ratio is approximately standard normal. You would compute this value and compare it against the critical value for a test of level $\alpha$ (e.g., if $\alpha = 0.05$, the critical value for a two-sided test is $z_{\alpha/2}^* \approx 1.96$). If $|z| > 1.96$, you reject $H_0$; if not, you fail to reject. The $p$-value of the test is $$2 \Pr[Z > |z|],$$ twice the probability that a standard normal random variable exceeds the absolute value of your test statistic.
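As a sketch of that calculation in R, plugging in the summary values from the "Difference" column of your table:

n <- 1491        # number of paired differences
Delta <- -0.056  # sample mean of the differences (dominant minus non-dominant)
s <- 0.140       # sample standard deviation of the differences
z <- Delta / (s / sqrt(n))      # test statistic, roughly -15.4
p_value <- 2 * pnorm(-abs(z))   # two-sided p-value, far below 0.05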
