I have a data set from a simulation I ran in R, with a population size (N) of 1000, a sample mean of 64.93, and a standard deviation of 27.61. The distribution is positively skewed, and a Shapiro-Wilk test done in JMP shows that the data is not normally distributed. I need confidence intervals for this data so that I can test some experimental data I have.
I have learned that the boot.ci() function in R can give me confidence intervals using bootstrapping.
As shown in http://www.statmethods.net/advstats/bootstrapping.html, the boot function requires:
bootobject <- boot(data= , statistic= , R=)
where data is the data of interest, statistic is a function that produces the statistic to be bootstrapped, and R is the number of bootstrap replicates.
My data is simply a vector of 1000 values. I am not sure what statistic I should use. All the examples I have read online need two data sets, to compute a correlation or a linear regression statistic.
The data can be downloaded from: http://ktdllc.com/data/simp2.csv
Any help on this issue would be greatly appreciated.
Best Answer
If you want to just do a simple bootstrap of a median CI then all you need is a median function that accepts indices.
Then you can just...
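As a minimal sketch of both steps (the median function and the bootstrap call), assuming the boot package is installed: the vector x below is a simulated placeholder rather than the real data, and the name med_fun, the seed, and R = 2000 are illustrative choices, not anything prescribed by the package.

```r
library(boot)  # recommended package, bundled with standard R installs

set.seed(1)
# Placeholder for the real data: a positively skewed vector of 1000 values.
# Replace with your own vector, e.g. a column read in with read.csv().
x <- rexp(1000, rate = 1 / 65)

# boot() calls the statistic as f(data, indices); the indices pick out
# the resampled observations for each bootstrap replicate.
med_fun <- function(d, i) median(d[i])

# Draw 2000 bootstrap replicates of the median.
b <- boot(data = x, statistic = med_fun, R = 2000)

# 95% percentile interval for the median.
ci <- boot.ci(b, conf = 0.95, type = "perc")
print(ci)
```

The same pattern should work for other single-sample statistics: swapping median for mean inside med_fun would bootstrap the mean instead.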
(You might want to plot(b) to examine your bootstrap.)
But in your case, assuming your simple exponential model describes the data well, I would be tempted to just let the entire simulation speak for itself in a figure. Plot your data distribution and overlay your simulation. If you use density curves, your simulation can have a much higher N than the data.
(You might want to adjust the bw argument in density() to smooth the curve.)
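One way the overlay might look, as a rough sketch: obs and sim are hypothetical stand-ins (both drawn here from an exponential distribution purely for illustration) for the experimental data and a large simulation run, and the sample sizes, labels, and line types are arbitrary.

```r
set.seed(2)
# Hypothetical stand-ins: `obs` for the experimental data,
# `sim` for a simulation run with a much larger N.
obs <- rexp(200, rate = 1 / 65)
sim <- rexp(1e5, rate = 1 / 65)

# Kernel density estimates; pass bw = ... (or adjust = ...) to density()
# to change how strongly each curve is smoothed.
d_sim <- density(sim, from = 0)
d_obs <- density(obs, from = 0)

# Draw the simulation first, then overlay the data as a dashed line.
plot(d_sim, main = "Data vs. simulation", xlab = "Value",
     ylim = range(0, d_sim$y, d_obs$y))
lines(d_obs, lty = 2)
legend("topright", legend = c("simulation", "data"), lty = c(1, 2))
```

Because density curves are normalized to integrate to one, the two samples can be compared directly even though their sizes differ.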