I have two samples and I would like to determine whether the difference between them is statistically significant or not:
Because this is clearly data of small counts that cannot be approximated by a normal distribution, I don't think a t-test is appropriate. Instead, I believe I may assume that these counts follow (two distinct) Poisson distributions. As stated here, each of these Poisson distributions is best summarized by the respective sample means, as follows:
$$
\overline{y} = \frac{1}{N} \sum_{i = 1}^n{y_i}
$$
This obviously yields non-integer means (0.4
and 2.6
, respectively), so I cannot use functions like poisson.test()
or any of the functions from the exactci
library directly.
I am aware of this answer regarding the C-test and E-test – but is there a straightforward implementation in R that would do this?
The only reasonable way to do this using poisson.test()
I can think of would be to sum the rates for each condition and use the numbers of counts as the respective T
parameters:
# Count data for each respective Condition
Cond1 <- c(0, 0, 0, 1, 1)
Cond2 <- c(1, 2, 3, 3, 4)
poisson.test(
x = c(sum(Cond1), sum(Cond2)),
T = c(length(Cond1), length(Cond2)),
alternative = "two.sided"
)
which yields the following result:
Comparison of Poisson rates
data: c(sum(Cond1), sum(Cond2)) time base: c(length(Cond1), length(Cond2))
count1 = 2, expected count1 = 7.5, p-value = 0.007385
alternative hypothesis: true rate ratio is not equal to 1
95 percent confidence interval:
0.01685531 0.67955077
sample estimates:
rate ratio
0.1538462
Is this valid or is there a better way to do this in R?
Best Answer
This is what I would do which I believe it is what @Dave is hinting at in comment.
Fit GLM with Poisson familiy distribution and see the effect of "condition" on the mean of each group.
Prepare the dataset:
Fit the model and assess the significance of "condition":
So the mean of Cond 1 is estimated as
exp(-0.9163) = 0.4
, the mean of Cond 2 isexp(-0.9163 + 1.8718) = 2.6
and the difference has p = 0.0137 for the null hypothesis of being zero.