Solved – 95% confidence interval for mean of a large sample

confidence-interval, r

I have a large sample of experimental observations for different categories (specifically, the runtime of an algorithm in different scenarios). I want to plot the mean runtime for each category/scenario and also show the 95% confidence interval using R.

According to the central limit theorem, the mean of each category should be normally distributed (because it is based on a large number of independent observations).

I know how to plot the means as a scatter plot and how to add error bars. I'm just unsure about the 95% confidence interval. Is the 95% confidence interval the interval in which a new value lies with 95% probability? Or is it only the actual mean that lies in the interval with 95% probability?

I found this code on calculating the confidence interval:

error <- qnorm(0.975)*sd/sqrt(n)

Where n is the sample size and sd is the standard deviation. Unfortunately, it lacks further explanation. What exactly is qnorm(0.975) and why do we choose 0.975 to get the 95% confidence interval?

Best Answer

qnorm is the quantile function for the normal distribution. More details are available by typing ?qnorm. You pick 0.975 to get a two-sided confidence interval: a 95% interval leaves 5% of the probability outside it, split as 2.5% in the upper tail and 2.5% in the lower tail, so the upper cutoff is the 0.975 quantile (1 - 0.025).

[Figure: two-tailed normal distribution]
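To make the role of qnorm(0.975) concrete, here is a minimal sketch of the calculation from the question. The runtimes vector is simulated and purely hypothetical (it is not the asker's data), and the names alpha, z, xbar, s and ci are chosen only for illustration.

```r
# Hypothetical data: simulated runtimes for a single scenario (an assumption,
# not the asker's measurements).
set.seed(1)
runtimes <- rexp(500, rate = 1 / 20)

n    <- length(runtimes)   # sample size
xbar <- mean(runtimes)     # sample mean
s    <- sd(runtimes)       # sample standard deviation

# For a two-sided 95% interval, alpha = 0.05 is split evenly between both
# tails, so the upper cutoff is the 1 - alpha/2 = 0.975 quantile.
alpha <- 0.05
z <- qnorm(1 - alpha / 2)  # roughly 1.96

# Half-width of the interval, same formula as in the question.
error <- z * s / sqrt(n)
ci <- c(lower = xbar - error, upper = xbar + error)
print(ci)
```

The two values in ci can be used directly as the lower and upper ends of the error bars for that scenario. With a sample this large, the t-based cutoff qt(1 - alpha / 2, df = n - 1) would give almost the same half-width.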