I've been trying to figure out the correct way to calculate the p-value for my data. I originally created a simulation that randomly selected numbers that were greater than or less than a certain number in a specific range given by a dataset. Let me give you an example of what my datasets looked like for clarity:
Expected dataset:
exon_number number_of_exons
4 20
5 16
2 31
4 15
15 20
Observed dataset:
exon_number number_of_exons
21 30
15 18
16 20
For each line in my datasets, I randomly selected, say, 100 numbers between 1 and 20 (for an example line from the expected dataset) and determined whether each randomly selected number was greater than or less than the exon_number. If it was greater, I added it to the "greater than" bin. I did this for all the lines in my datasets and built total "greater than" and "less than" bins for each entire dataset. However, since my datasets were of different sizes, far more greater-thans and less-thans were compiled for the "expected dataset". Is this problematic? Here are my real results:
              Expected  Observed
Less than       698402     11105
Greater than    918898     13573
I understand that Fisher's exact test is only for small numbers and should not be used here. Am I correct? I'm trying to test whether the observed data cluster more at the beginning or the end of a transcript compared to the expected results.
In that case, I've been using the chi-squared method as below:
import numpy
import scipy.stats
scipy.stats.chisquare([11105, 13573], f_exp=[698402, 918898])
However, my output gives me a p-value of 0. Am I doing something wrong? Am I running the test incorrectly? Is my data problematic? I'm new to programming and statistical testing. Any help (and explanations) would be greatly appreciated.
Best Answer
You are right that you don't want to use Fisher's exact test here. There isn't anything wrong with using it with large numbers, but it ends up being approximated then, so you lose the 'exact test' advantage that people sometimes want. In addition, Fisher's exact test assumes the marginals are fixed in advance, which isn't true here (and is in fact rarely true).
The reason your chi-squared test is not working properly is that the totals for the expected and observed counts are not similar. The expected counts sum to 1617300, whereas the observed counts sum to 24678. What you really want to compare your observed counts to is the expected proportions. Using your data (and R), here is an example:
I do not believe this is the right analysis for your question, though. I suspect you need something like a Mann-Whitney U-test or Wilcoxon signed rank test.
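As a rough Python counterpart to the points above (a sketch using scipy, as in the question, not the R code the answer refers to), the observed counts can be tested against the expected proportions by rescaling the expected counts to the observed total. The second part rests on an assumption of mine: it treats exon_number / number_of_exons as a relative position within the transcript and runs a Mann-Whitney U-test on the toy rows from the question, only to show the shape of the call. All variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Counts from the question's results table: [less than, greater than].
expected_counts = np.array([698402, 918898])
observed_counts = np.array([11105, 13573])

# Turn the expected counts into proportions, then rescale them so they
# sum to the observed total; chisquare expects the two totals to match.
expected_props = expected_counts / expected_counts.sum()
scaled_expected = expected_props * observed_counts.sum()

chi2_stat, chi2_p = stats.chisquare(observed_counts, f_exp=scaled_expected)
print(f"chi-squared = {chi2_stat:.2f}, p = {chi2_p:.3g}")

# Assumption (not from the original answer): summarise each row as a
# relative position, exon_number / number_of_exons, and compare the two
# samples directly. These rows are the toy examples from the question.
expected_pos = np.array([4 / 20, 5 / 16, 2 / 31, 4 / 15, 15 / 20])
observed_pos = np.array([21 / 30, 15 / 18, 16 / 20])

u_stat, u_p = stats.mannwhitneyu(
    observed_pos, expected_pos, alternative="two-sided"
)
print(f"U = {u_stat}, p = {u_p:.3g}")
```

With real data, the two position arrays would hold one value per row of each dataset, so the differing dataset sizes are handled by the test itself rather than by pooled bin counts.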