Solved – Determining the confidence interval for a non-normal distribution

bootstrap

I have a data set generated by a simulation in R: 1000 values (N = 1000) with a sample mean of 64.93 and a standard deviation of 27.61. The distribution is positively skewed, and a Shapiro-Wilk test run in JMP indicates that the data are not normally distributed. I need confidence intervals for these data so that I can test some experimental data against them.

I have learned that using the boot.ci() function in R can give me confidence intervals using bootstrapping.

As shown in http://www.statmethods.net/advstats/bootstrapping.html, the boot function requires:
bootobject <- boot(data= , statistic= , R=), where data is the data of interest, statistic is a function that produces the statistic to be bootstrapped, and R is the number of times to do it.

My data is simply a vector of 1000 samples. I am not sure which statistic function I should supply. All the examples I have read online use two variables to compute a correlation or a linear regression statistic.

The data can be downloaded from: http://ktdllc.com/data/simp2.csv

Any help on this issue would be greatly appreciated.

Best Answer

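If you just want a simple bootstrap CI for the median, all you need is a median function that accepts the data and a vector of indices, because boot() passes the resampled indices to the statistic as its second argument.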

med <- function(y, indices) median(y[indices])

Then you can just...

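library(boot)
b <- boot(dat, med, R = 1000)
# the default type = "all" also attempts studentized intervals, which need variance estimates
boot.ci(b, type = c("basic", "perc"))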

(You might want to plot(b) to examine your bootstrap.)
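
The same pattern works for the mean, which is the statistic quoted in the question. Below is a minimal, self-contained sketch that uses simulated data in place of the real vector. The exponential draw, the seed, the number of replicates, and the names dat_demo, mean_stat, b_demo and ci_demo are all assumptions made up for illustration; they are not the asker's data or code.

```r
library(boot)  # recommended package bundled with standard R installations

set.seed(42)                            # arbitrary seed, for reproducibility only
dat_demo <- rexp(1000, rate = 1 / 65)   # hypothetical positively skewed sample

# boot() calls the statistic with the data and a vector of resampled indices
mean_stat <- function(y, indices) mean(y[indices])

# Draw 2000 bootstrap resamples and compute the mean of each one
b_demo <- boot(dat_demo, mean_stat, R = 2000)

# Ask only for interval types that need no extra variance estimates
ci_demo <- boot.ci(b_demo, conf = 0.95, type = c("basic", "perc"))
print(ci_demo)

# The percentile limits are the last two columns of the $percent matrix
perc_limits <- ci_demo$percent[1, 4:5]
```

Swapping mean_stat for the med function above gives the median interval instead; nothing else in the call needs to change.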

But in your case, assuming your simple exponential model describes the data well, I would be tempted to let the entire simulation speak for itself in a figure. Plot your data distribution and overlay your simulation. Density curves are normalized to unit area, so the simulation can have a much higher N than the data and the two curves remain comparable.

densDat <- density(dat)
plot(densDat)
densSim <- density(sim)
lines(densSim, col = 'red')

(You might want to adjust the bw argument in density to smooth the curve.)
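
As a rough sketch of the bw adjustment, one option is to compute a single bandwidth and pass it to both density() calls, so that differences between the curves are not caused by different amounts of smoothing. Everything here is simulated for illustration: dat_demo and sim_demo are hypothetical stand-ins for the observed data and the simulation, and the choice of bw.nrd0 on the smaller sample is an assumption, not a recommendation from the answer.

```r
set.seed(1)                              # arbitrary seed
dat_demo <- rexp(1000, rate = 1 / 65)    # hypothetical stand-in for the observed data
sim_demo <- rexp(1e5, rate = 1 / 65)     # hypothetical stand-in for a much larger simulation

# Compute one bandwidth and reuse it so both curves are smoothed identically
bw_shared <- bw.nrd0(dat_demo)
dens_dat <- density(dat_demo, bw = bw_shared, from = 0)
dens_sim <- density(sim_demo, bw = bw_shared, from = 0)

# Set ylim from both curves so neither one is clipped
plot(dens_dat, main = "Data vs. simulation", xlab = "Value",
     ylim = range(dens_dat$y, dens_sim$y))
lines(dens_sim, col = "red")
legend("topright", legend = c("data", "simulation"),
       col = c("black", "red"), lty = 1)
```

Setting from = 0 starts the evaluation grid at zero, which suits non-negative data; drop it if the values can be negative.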
