Solved – Statistical significance of a survey/poll

sampling, statistical significance

Let's pretend I'm conducting a poll/survey. It's a simple yes/no poll (i.e. everyone only gives 1 of 2 answers). I have asked N people so far, and X of them have said "yes".

I would like to stop asking people (i.e. stop the poll) when I can be sure to some high level of statistical confidence (e.g. 95% sure) that I have something that's accurate.

Does this question even make sense?

This is not a proper stats problem, i.e. "good enough" answers are alright. I don't need a high level of mathematical rigour. Right now I have nothing and I would like something that would at least point me down the path of knowledge. What would I need to do/know/compute/provide to you to figure out what's going on?

Best Answer

This is a very basic (and essential!) question in statistics. The maths behind the answer is the central limit theorem. It tells you that, whatever the underlying probability law (as long as it has finite variance), the average of N independent samples behaves approximately like a Gaussian as N grows (how large N must be is not explicit unless you know the variance of your law).

In the problem you are mentioning you can be more explicit, since the law is rather simple: the law of one answer is called a Bernoulli, and the law of the sum is called a binomial. If p is the probability of "yes", then the variance of the number of "yes" answers X in an N-sample is N p (1-p), so the variance of the observed proportion X/N is p (1-p)/N. From this you can compute explicitly the N you need in order to make a mistake of less than, say, 5% with a 95% probability (you need both a margin of error and a confidence level for the question to be well-posed).
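As a rough sketch of that computation (not the only way to do it), the snippet below uses the Gaussian approximation to the binomial. The function names are made up for illustration, and p = 0.5 is assumed as the worst case when nothing is known about the true proportion, since p (1-p) is largest there.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal critical value (about 1.96 for 95%)."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def required_sample_size(margin, confidence=0.95, p=0.5):
    """Smallest N such that z * sqrt(p * (1 - p) / N) <= margin.

    p = 0.5 is an assumed worst case; a smaller or larger p needs fewer people.
    """
    z = z_value(confidence)
    return math.ceil(z * z * p * (1 - p) / margin ** 2)


def margin_of_error(n, x, confidence=0.95):
    """Half-width of the normal-approximation interval around x / n."""
    p_hat = x / n
    return z_value(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)


# People to ask for a 5% margin of error at 95% confidence.
print(required_sample_size(0.05))            # 385

# Margin of error so far, e.g. after N = 400 answers with X = 220 "yes".
print(round(margin_of_error(400, 220), 3))   # 0.049
```

So with no prior idea of p, roughly 385 answers are enough for "within 5%, 95% of the time". The approximation gets poor when N is small or when X/N is very close to 0 or 1, and recomputing the margin after every single answer and stopping the first time it looks good makes the stated 95% somewhat optimistic, so it is safer to fix N in advance.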
