Understanding Significance level in statistics

descriptive-statistics, hypothesis-testing, statistics

I was trying to understand the meaning of significance level. It is described as the probability of rejecting a null hypothesis when it is in fact true.
It is also stated that the lower the value of alpha, the stricter the test.

In an example graph, more area is shaded at the 5% significance level. Doesn't that mean the target is much narrower and we would get a more accurate answer?

Also, what does it mean when we reject a null hypothesis at, say, the 5% significance level but accept it at the 1% significance level? Does the percentage refer to a probability or to something else?

Best Answer

I made a random number generator to make sets of 30 whole numbers from 1 to 101, distributed evenly. If my random number generator is working properly, the mean should be about 50 (strictly, the midpoint of 1 to 101 is 51, but I'll use 50 as the target here). Suppose the numbers looked kind of high and I wanted to know if my random number generator was working right. I could ask myself: what is the probability of getting the numbers I did just by random chance? It's a little tough to do if all I have is one set of 30 numbers, but let's try anyway.

I figure if the numbers I got only had a 10% chance of being as extreme as they are, then something's probably up and there's a flaw with my random number generator. The mean could have been too high or too low, so I'll split that 10% into 5% for too high and 5% for too low. The z-score that cuts off 5% in one tail is about 1.64.
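As a rough check on where cutoffs like 1.64 come from, here is a minimal sketch using Python's standard library. The helper name `critical_z` is made up for illustration; it just splits the total significance level across both tails and looks up the matching point on the standard normal curve.

```python
from statistics import NormalDist

def critical_z(alpha, two_sided=True):
    """Standard-normal cutoff for a total significance level alpha.

    For a two-sided test, alpha is split evenly between the two tails.
    (Hypothetical helper, not from any library.)
    """
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

z_10 = critical_z(0.10)  # 5% in each tail
z_05 = critical_z(0.05)  # 2.5% in each tail
z_01 = critical_z(0.01)  # 0.5% in each tail

for alpha, z in [(0.10, z_10), (0.05, z_05), (0.01, z_01)]:
    print(f"alpha = {alpha:.2f}: reject if |z| > {z:.3f}")
```

The cutoff grows as alpha shrinks, which is the sense in which a lower alpha is a stricter test: the sample mean has to be further from the expected value before you call it a flaw.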

The null hypothesis is that the random number generator is OK and creates sets with a mean of 50. The alternative hypothesis is that the random number generator is flawed and creates sets with a mean that's not 50.

I calculate the mean and standard deviation of the sample, and get a mean of 57.73 and a standard deviation of 27.95. I divide the standard deviation by the square root of 30 (the number of items in the sample) and use that as the standard deviation of the sample mean (the standard error): 5.10. 57.73 is 7.73 above the expected mean of 50, so it is 7.73/5.10 = 1.52 standard errors above the mean. That's less than the critical value of 1.64 for my chosen level of significance, so I fail to reject the null hypothesis. Even though the mean is somewhat high, I guess it's just by chance.
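The arithmetic above can be sketched in a few lines. This only uses the summary numbers quoted in the answer (mean 57.73, standard deviation 27.95, n = 30); the variable names are my own, and the p-value line is an extra step that the answer does not compute.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 57.73   # from the answer
sample_sd = 27.95     # from the answer
n = 30
null_mean = 50        # value claimed by the null hypothesis

# Standard error: how much the mean of 30 numbers varies from sample to sample.
standard_error = sample_sd / sqrt(n)

# How many standard errors the observed mean sits from the expected mean.
z = (sample_mean - null_mean) / standard_error

# Two-sided p-value: chance of a mean at least this far off, in either direction,
# if the generator really is fine (normal approximation).
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

# Cutoff for a 10% total significance level (5% per tail).
critical = NormalDist().inv_cdf(0.95)
reject = abs(z) > critical

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
print("reject null" if reject else "fail to reject null")
```

The z value comes out near 1.5 (the second decimal depends on whether the standard error is rounded first), which is below the cutoff, so the conclusion matches the one above.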

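The level of significance is the chance that a sample from a properly working generator just happens to land far enough off to be flagged anyway. If a result is significant at the 5% level, then, assuming the null hypothesis is true, there was only a 5% chance of getting a sample that extreme (high or low). If you tighten it to 1%, there was only a 1-in-100 chance of random numbers ending up that way. That is why a result can be rejected at 5% but not at 1%: it is unusual enough to clear the looser cutoff but not the stricter one.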
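One way to see the significance level as a probability is to simulate a generator that really is fine and count how often the test flags it anyway. This is a sketch under assumptions of my own: it draws integers from 0 to 100 (so the true mean is exactly 50), uses a fixed seed, and the function names are invented for the example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def rejects_null(sample, null_mean, alpha):
    """Two-sided z-test: True if the sample mean is 'too far' from null_mean."""
    se = stdev(sample) / sqrt(len(sample))
    z = (mean(sample) - null_mean) / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def false_alarm_rate(alpha, trials=2000, n=30, seed=0):
    """Fraction of samples from a *working* generator that get rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: a fair generator over 0..100, whose true mean is 50.
        sample = [rng.randint(0, 100) for _ in range(n)]
        hits += rejects_null(sample, 50, alpha)
    return hits / trials

rate_10 = false_alarm_rate(0.10)
rate_01 = false_alarm_rate(0.01)
print(f"alpha = 0.10: rejected {rate_10:.1%} of fair samples")
print(f"alpha = 0.01: rejected {rate_01:.1%} of fair samples")
```

The rejection rates should land close to the chosen alpha (typically a little above it, since a z cutoff with an estimated standard deviation is only approximate at n = 30). So the percentage is a probability: the chance of wrongly rejecting a null hypothesis that is true.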

The problem with using too small a significance level is that you may miss a more subtle effect. Maybe there really is something going on, but the effect is not strong enough to push the sample into the 5% or 1% (or whatever) range.
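To put a rough number on that trade-off, here is a sketch of the approximate power of the two-sided z-test, meaning the chance of detecting a real shift of a given size. The 5-point shift is a hypothetical value picked for illustration, the standard deviation is borrowed from the sample above, and `approx_power` is not a library function.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(true_shift, sd, n, alpha):
    """Approximate chance that a two-sided z-test rejects the null
    when the true mean is really off by true_shift (normal approximation)."""
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1 - alpha / 2)
    shift_in_se = true_shift / (sd / sqrt(n))  # shift measured in standard errors
    upper = 1 - std_normal.cdf(cutoff - shift_in_se)  # lands above +cutoff
    lower = std_normal.cdf(-cutoff - shift_in_se)     # lands below -cutoff
    return upper + lower

# Hypothetical subtle flaw: the generator's true mean is 55 instead of 50.
power_10 = approx_power(true_shift=5, sd=27.95, n=30, alpha=0.10)
power_05 = approx_power(true_shift=5, sd=27.95, n=30, alpha=0.05)
power_01 = approx_power(true_shift=5, sd=27.95, n=30, alpha=0.01)

for alpha, power in [(0.10, power_10), (0.05, power_05), (0.01, power_01)]:
    print(f"alpha = {alpha:.2f}: chance of catching the flaw = {power:.1%}")
```

Under these assumptions the chance of catching the flaw drops as alpha gets smaller, so a stricter test gives fewer false alarms at the cost of more missed effects.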
