Solved – Paired samples t test in Python

Tags: paired-data, python, statsmodels, t-test

I'm trying to conduct a paired samples t-test in Python (statsmodels package), but I don't see a function for it in the documentation. The closest I can find is ttost_paired, but I don't think it's correct: its null hypothesis is that the mean difference lies beyond some boundary value (above an upper bound or below a lower bound), whereas for my desired test the null is x1 - x2 = 0.

A few questions:

Is there a way to do a paired samples t-test in statsmodels that I'm missing?
Is there a way to use ttost_paired to do what I want?
They do also have an independent samples ttest. What could go wrong if I use an independent ttest on paired data?

I know I can do a paired t-test using scipy but I'm wondering specifically about statsmodels

Best Answer

The function ttost_paired is not an ordinary t-test and is therefore not suitable for your purposes. TOST is a test of (non-)equivalence: it employs two one-sided t-tests to check whether the two samples are equivalent, with non-equivalence as the null hypothesis. Please have a look at the function documentation.

The statsmodels package does have a ttest_mean function. However, its documentation does not say whether the test is meant for paired samples. I therefore recommend using the paired t-test in scipy.stats.
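As a minimal sketch of the scipy route recommended above: `scipy.stats.ttest_rel` runs the paired test, and it should match a one-sample t-test on the per-pair differences against zero. The measurements are made-up illustration values, not data from the question.

```python
import numpy as np
from scipy import stats

# Made-up paired measurements (hypothetical): the same 8 subjects
# measured under two conditions.
x1 = np.array([12.1, 14.3, 11.8, 15.2, 13.7, 12.9, 14.8, 13.1])
x2 = np.array([11.4, 13.9, 11.9, 14.1, 13.0, 12.2, 14.5, 12.6])

# Paired-samples t-test; H0: mean(x1 - x2) = 0.
paired = stats.ttest_rel(x1, x2)

# Equivalent formulation: one-sample t-test on the differences.
one_sample = stats.ttest_1samp(x1 - x2, popmean=0.0)

print(f"ttest_rel:   t = {paired.statistic:.4f}, p = {paired.pvalue:.4f}")
print(f"ttest_1samp: t = {one_sample.statistic:.4f}, p = {one_sample.pvalue:.4f}")
```

Because the paired test is a one-sample test on the differences, the same idea should carry over to statsmodels: `DescrStatsW(x1 - x2).ttest_mean(0)` from `statsmodels.stats.weightstats` is expected to give the same statistic and p-value, although that call is not exercised in this sketch.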

And about your last question:

They do also have an independent samples ttest. What could go wrong if I use an independent ttest on paired data?

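The paired t-test works on the within-pair differences, which removes intersubject variability. It is therefore theoretically more powerful than the unpaired t-test when the paired observations are positively correlated. An independent-samples test on paired data ignores the pairing, so that variability stays in the standard error and a real difference is easier to miss.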
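The loss of power can be illustrated with a small sketch. The data are made up so that subjects differ widely from one another while each subject shifts by only about one unit; the variable names are hypothetical.

```python
import numpy as np
from scipy import stats

# Made-up data (hypothetical): large spread between subjects,
# small and consistent change within each subject.
before = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0])
shift = np.array([1.0, 2.0, 1.5, 0.5, 1.2, 1.8, 0.9, 1.1])
after = before + shift

# Paired test: uses only the within-subject differences.
p_paired = stats.ttest_rel(after, before).pvalue

# Independent test: treats the two columns as unrelated groups, so the
# large between-subject spread ends up in the standard error.
p_indep = stats.ttest_ind(after, before).pvalue

print(f"paired p-value:      {p_paired:.5f}")
print(f"independent p-value: {p_indep:.5f}")
```

On these values the paired test detects the shift clearly, while the independent test does not come close to significance.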

Related Solutions

Solved – statistical significance in the paired sample data after performing Wilcoxon signed rank test

Common practice is to compare the p-value against three levels: 0.05, 0.01 and 0.001. Since your p-value is below all of them, report the smallest one: conclude that the differences are significant at p < 0.001. Roughly speaking, the smaller the p-value, the stronger the evidence against the null hypothesis.

Since we do not know the distribution of your data, we also do not know which test you should use. But you have quite a large sample, so there is a good chance that a parametric test (the t-test for paired data) is appropriate.
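For comparison, here is a rough sketch that runs both the Wilcoxon signed-rank test and the paired t-test on the same paired sample using scipy. The values are invented for illustration and are much smaller than the sample discussed above.

```python
import numpy as np
from scipy import stats

# Made-up paired sample (hypothetical values, 10 pairs).
first = np.array([5.1, 6.3, 4.8, 7.2, 5.9, 6.6, 5.4, 6.1, 7.0, 5.7])
change = np.array([0.4, 0.9, 0.3, 1.1, 0.6, 0.8, 0.2, 0.7, 1.0, 0.5])
second = first + change

# Non-parametric option: Wilcoxon signed-rank test on the pairs.
p_wilcoxon = stats.wilcoxon(second, first).pvalue

# Parametric option: paired t-test on the same pairs.
p_ttest = stats.ttest_rel(second, first).pvalue

print(f"Wilcoxon signed-rank p-value: {p_wilcoxon:.5f}")
print(f"paired t-test p-value:        {p_ttest:.5f}")
```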

T-Test – Critical Effect Sizes and Power for Paired T-Tests

Yes, this is possible and even fairly easy, but additional information is required. Specifically, we have to make an assumption about what the correlation between the observations from each pair is.

The effect size as a difference in standard deviation units is usually referred to as $d$. We can apply a correction factor to $d$ to incorporate the information about the aforementioned correlation, and then we can use our standard power formulae with this corrected $d$ (making sure to also mind the change in degrees of freedom associated with moving to the paired design) to compute power. The corrected $d$ is $$ d_o = \frac{d}{\sqrt{1-r}}, $$ where $r$ is the correlation. I have called this $d_o$ because this is sometimes referred to as the "operative effect size."
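To make the correction concrete, here is a small sketch that evaluates $d_o = d/\sqrt{1-r}$ exactly as stated above for a few assumed correlations. The helper name `operative_effect_size` is hypothetical, not part of any library.

```python
import math


def operative_effect_size(d: float, r: float) -> float:
    """Return d_o = d / sqrt(1 - r), the correction stated above.

    d: effect size in standard deviation units.
    r: assumed correlation between the two observations in a pair.
    """
    if not -1.0 <= r < 1.0:
        raise ValueError("r must satisfy -1 <= r < 1")
    return d / math.sqrt(1.0 - r)


# With d = 2, a higher assumed correlation gives a larger operative effect size.
for r in (0.0, 0.5, 0.75):
    print(f"r = {r:.2f} -> d_o = {operative_effect_size(2.0, r):.3f}")
```

With $r = 0$ the effect size is unchanged, and with $r = 0.75$ it doubles, which is why fewer pairs are needed as the assumed correlation grows.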

Here is a little R routine that computes a table of minimum number of PAIRS as a function of the assumed correlation and the desired power level, with $d=2$ assumed.

library(pwr) # package for pwr.t.test() function
             # may need to install first with install.packages()

# define a function to get the minimum number of pairs
# for a given correlation and desired power level
getN <- function(r,p){
  unlist(mapply(pwr.t.test, d=2/sqrt(1-r), power=p,
    MoreArgs=list(n=NULL, sig.level=.05, type="paired"))["n",])
}

# apply this function to all combinations of the parameters below
tab <- outer(seq(0,.95,.05), c(.7,.8,.9,.95,.99,.999), "getN")
dimnames(tab) <- list("Correlation"=seq(0,.95,.05),
                  "DesiredPower"=c(.7,.8,.9,.95,.99,.999))
tab

Which returns the following:

           DesiredPower
Correlation      0.7      0.8      0.9     0.95     0.99    0.999
       0    3.767546 4.220731 4.912411 5.544223 6.888820 8.656788
       0.05 3.691858 4.126240 4.787326 5.389850 6.669683 8.350091
       0.1  3.615930 4.031562 4.662220 5.235637 6.451021 8.044096
       0.15 3.539645 3.936653 4.537050 5.081483 6.232774 7.738792
       0.2  3.462940 3.841433 4.411750 4.927417 6.014903 7.434270
       0.25 3.385708 3.745774 4.286234 4.773338 5.797404 7.130529
       0.3  3.307922 3.649640 4.160447 4.619143 5.580267 6.827580
       0.35 3.229382 3.552889 4.034209 4.464751 5.363362 6.525430
       0.4  3.149970 3.455310 3.907393 4.309986 5.146613 6.224026
       0.45 3.069435 3.356743 3.779777 4.154653 4.929824 5.923282
       0.5  2.987581 3.256903 3.651065 3.998456 4.712773 5.623032
       0.55 2.904079 3.155423 3.520841 3.841020 4.495111 5.323066
       0.6  2.818472 3.051834 3.388672 3.681805 4.276260 5.022875
       0.65 2.730145 2.945449 3.253781 3.520048 4.055501 4.721751
       0.7  2.638237 2.835369 3.115118 3.354639 3.831565 4.418560
       0.75 2.541442 2.720152 2.971074 3.183823 3.602697 4.111397
       0.8  2.437713 2.597460 2.819127 3.004879 3.365682 3.796879
       0.85 2.323340 2.463226 2.654597 2.812710 3.114890 3.468745
       0.9  2.190677 2.309002 2.467897 2.596901 2.838233 3.113596
       0.95 2.018024 2.110699 2.231866 2.327720 2.501567 2.692358

Note that $d=2$ is considered quite a large effect size in many fields, so the resulting minimum numbers of pairs are all quite low.
