Binomial Distribution & t-Test

binomial distribution, hypothesis testing, normal distribution, r

Given a binomial distribution with parameters $n=1000$, $p=0.5$ and 300 observed successes, I would like to test whether the numbers of successes and failures differ significantly.

The obvious solution (using R):

n1 <- 300
n2 <- 700
p <- 0.5
binom.test(n1, n1+n2, p, alternative='two.sided')

I also want to use the normal approximation to the binomial distribution, which holds for "large" numbers of observations. A trivial solution may be this:

t.test(c(rep(0, n1), rep(1, n2)), mu=p, alternative='two.sided')

The mean and variance of the approximating normal distribution can be calculated directly from the binomial parameters:

mu <- (n1+n2)*p
sig2 <- p*(1-p)*(n1+n2)

Therefore it should be possible to simply apply a one-sample t-test. After some trial-and-error I got this solution:

t <- (n2-mu)/sqrt(sig2)
p.value <- 2*pt(-abs(t), n1+n2-1)  # two-sided p-value; the lower tail avoids cancellation in 1-pt()

Luckily the results are rather similar.

I do not understand why the t-statistic as given, for example, on Wikipedia, which includes an additional factor of $\sqrt{n_1+n_2}$, does not produce the right result:

t.wrong <- sqrt(n1+n2)*(n2-mu)/sqrt(sig2)

Why do I have to omit this part of the test's formula?

Best Answer

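The Wikipedia formula is for a sample mean. Here, however, you have only a single observation: the count 700 from a binomial distribution with N=1000. Its standard deviation $\sqrt{Np(1-p)}$ already accounts for the N trials, so no extra factor of $\sqrt{N}$ is needed. Do not confuse the binomial parameter N with your sample size.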
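To make this concrete, here is a minimal sketch in R. The names `N`, `x`, `p0`, `z_count`, `z_mean` and `z_wrong` are my own and not from the question, and I use the normal tail (`pnorm`) for the p-value rather than the question's `pt`, which is an assumption on my part. It compares the statistic computed from the count with the one computed from the mean of the N Bernoulli trials, where the $\sqrt{N}$ factor does belong.

```r
# Values taken from the question
N  <- 1000   # binomial size parameter (n1 + n2)
x  <- 700    # observed count (n2 in the question)
p0 <- 0.5    # probability under the null hypothesis

# Form 1: the count is ONE draw from Binomial(N, p0),
# with mean N*p0 and standard deviation sqrt(N*p0*(1-p0))
z_count <- (x - N * p0) / sqrt(N * p0 * (1 - p0))

# Form 2: the data are N Bernoulli(p0) trials and we test their mean;
# here the sample size is N, so the sqrt(N) factor appears
p_hat  <- x / N
z_mean <- sqrt(N) * (p_hat - p0) / sqrt(p0 * (1 - p0))

# The two forms are algebraically identical
print(all.equal(z_count, z_mean))

# The question's t.wrong applies sqrt(N) to the count form,
# which inflates the statistic by exactly sqrt(N)
z_wrong <- sqrt(N) * (x - N * p0) / sqrt(N * p0 * (1 - p0))
print(z_wrong / z_count)

# Two-sided p-value under the normal approximation (assumption: pnorm, not pt)
p_value <- 2 * pnorm(-abs(z_count))
print(p_value)
```

Either form is fine on its own; the error comes from mixing them. Note also that `t.test` on the 0/1 vector estimates the standard deviation from the sample instead of using $\sqrt{p(1-p)}$ under the null, which is why its statistic is close to, but not identical with, the one computed by hand.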