Solved – How to construct a 90% confidence interval for the standard deviation of a set in R?

Tags: confidence interval, r, standard deviation

For a homework assignment we are supposed to calculate this in R. We are given a CSV with two columns, i.e.:


[Image: the given CSV file]


The exact question we are given is:

Construct a 90% confidence interval for the s.d. of birth weight for mothers who smoke. Do the same for mothers who don’t smoke.

My understanding of statistics (which is very limited) is that you can only compute a confidence interval from a set of numbers. Since the standard deviation of the set is a single number, how would one construct a confidence interval for it?

This was my attempt:

> with(data, t.test(sd(BirthWeight[Smoker=='Yes']), conf.level=.90))
Error in t.test.default(sd(BirthWeight[Smoker == "Yes"]), conf.level = 0.9) :
  not enough 'x' observations
>

Now this isn't working because I am only passing the t.test function a single number.

Note: data is the CSV file, read in with:

data <- read.table("BirthwtSmoke.csv",header=TRUE)

Best Answer

If you are doing this calculation by hand, there are several steps to follow.

First, split your data by the Smoker column/variable. This will give you the two data sets you will use (smokers and non-smokers).

Second, obtain the sample size $n$ and the (sample) standard deviation $s$ for each data set.
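
The first two steps might look like this in R. This is only a sketch on a small made-up data frame (the values are invented, not taken from BirthwtSmoke.csv); it assumes the column names BirthWeight and Smoker used in the question.

```r
# Toy data standing in for the real CSV (values are invented for illustration)
toy <- data.frame(
  BirthWeight = c(3.2, 2.9, 3.5, 3.1, 2.7, 3.8, 3.4, 3.0),
  Smoker      = c("Yes", "Yes", "No", "No", "Yes", "No", "No", "Yes")
)

# Step 1: split the birth weights into one vector per Smoker level
groups <- split(toy$BirthWeight, toy$Smoker)

# Step 2: sample size n and sample standard deviation s for each group
n <- sapply(groups, length)
s <- sapply(groups, sd)
```

With the real data, replacing toy by data should give the $n$ and $s$ needed in the next step.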

Third, calculate the critical values needed for your calculation using the $\chi^2$-distribution:

qchisq(alpha/2,n-1,lower.tail=TRUE)
qchisq(alpha/2,n-1,lower.tail=FALSE)

for your given values of $\alpha$ and $n$ (for a 90% confidence interval, $\alpha = 0.10$). The first value is the left-side critical value $\chi_{L,cv}^2$ (the smaller of the two); the second value is the right-side critical value, $\chi_{R,cv}^2$ (the larger of the two).

Finally, you can calculate the confidence interval for the variance: $$\frac{(n-1)s^2}{\chi_{R,cv}^2} < \sigma^2 < \frac{(n-1)s^2}{\chi_{L,cv}^2}$$ Well... almost finally. If you want the C.I. for the standard deviation, you need to take the square root of each part of the compound inequality: $$\sqrt{\frac{(n-1)s^2}{\chi_{R,cv}^2}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi_{L,cv}^2}}$$ Note that this interval assumes the data in each group are approximately normally distributed.
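
Putting the steps together, here is a minimal sketch of the whole calculation. The helper name sd_ci is hypothetical (it is not a base R function), and it relies on the usual assumption that the sample comes from an approximately normal population.

```r
# Hypothetical helper: chi-square confidence interval for a standard deviation.
# Assumes x is a numeric sample from an approximately normal population.
sd_ci <- function(x, conf.level = 0.90) {
  x     <- x[!is.na(x)]        # drop missing values
  n     <- length(x)
  s     <- sd(x)
  alpha <- 1 - conf.level

  # Critical values: chi_L is the lower-tail quantile, chi_R the upper-tail one
  chi_L <- qchisq(alpha / 2, df = n - 1, lower.tail = TRUE)
  chi_R <- qchisq(alpha / 2, df = n - 1, lower.tail = FALSE)

  # Dividing by the larger critical value (chi_R) gives the lower bound
  c(lower = sqrt((n - 1) * s^2 / chi_R),
    sd    = s,
    upper = sqrt((n - 1) * s^2 / chi_L))
}

# Usage with the data frame from the question:
# with(data, sd_ci(BirthWeight[Smoker == "Yes"]))
# with(data, sd_ci(BirthWeight[Smoker == "No"]))
```

The function returns the sample standard deviation together with the two bounds, so it is easy to check that the estimate falls inside its own interval.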

If there is an R function, maybe someone will add it in the comments.
