MATLAB: How to specify stderr when using bootci to compute bootstrap studentized confidence interval


I want to compute the studentized bootstrap interval for the mean. The quantity to be computed for each bootstrap sample should be:

(mean(x)-mean(data))./std(x)

where x is a bootstrap sample. I can program this on my own, but I am not sure how to specify the same thing using the function bootci:

ci = bootci(B,{function1,data},'type','stud','stderr',function2);

What should I write for function1 and function2? I have tried several things but the coverages I get from these intervals turn out to be wrong (compared to my own program which I know is correct). I am specifying something incorrectly. Do I specify the divisor std(bootstrapsample) in function1, or in function2 (or in both)? The documentation is sparse.

Best Answer

It is not exactly clear what you are trying to do, as you introduce x without explaining what it is. Assuming that you want to bootstrap samples and create a confidence interval around the mean, you do not need to specify function2: when 'stderr' is omitted, bootci computes its own bootstrap estimate of the standard error. function1 is simply a handle to the mean function, since that is the statistic you want the interval for.

ci = bootci(B,{@mean,data},'type','stud')

This will give you a confidence interval around the mean for the variables in each column of data.
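
If you do want to supply the standard error yourself, one reading of the bootci documentation is that the 'stderr' function receives the same arguments as the statistic function and returns the standard error of that statistic, so the centering by mean(data) is not written into either function. A minimal sketch under that assumption follows; the simulated data, the seed, and the name seMean are illustrative and not from the question.

```matlab
rng(1);                               % fix the seed so the run is repeatable
data = randn(200,1)*8.5 + 28.25;      % illustrative data (assumed, not from the question)
B = 2000;                             % number of bootstrap resamples

% Standard error of the sample mean, evaluated on whatever sample is passed in
seMean = @(x) std(x)/sqrt(numel(x));

% Studentized interval using the analytic standard error of the mean
ciAnalytic = bootci(B,{@mean,data},'type','stud','stderr',seMean);

% Studentized interval using bootci's default bootstrap standard error
ciDefault = bootci(B,{@mean,data},'type','stud');
```

Comparing the two intervals on your own data is a quick check that the 'stderr' function is being used as intended.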

Related Solutions

MATLAB: Bootstrap Confidence Interval 90%

"We were asked to calculate the 90% confidence interval [using bootci()]... Is this the correct way?"

To determine whether it's the correct way, compare the results with a lower-level computation of the confidence intervals. Not only will this confirm that you're using the bootci() function correctly, but you'll also get a better understanding of how those intervals are computed.

Let's say your data contain 1000 samples and you're bootstrapping the mean of your data 2000 times. bootci() resamples the data with replacement 2000 times and computes the mean on each iteration. That means on each of the 2000 iterations, it randomly chooses 1000 samples, many of which will be duplicates (it uses the randi() function), and computes the mean. It then uses the 2000 means that were computed to determine the CI. There are several methods of doing this (explained under "types" in the documentation), but with a large enough sample the distribution of bootstrap means should be approximately normal (thanks to the Central Limit Theorem), so the CI type shouldn't make too much of a difference (though I typically recommend the percentile method, which does not depend on the shape of the distribution).

Follow the brief tutorial below, where the CIs are computed for a random dataset using bootci() and using a lower-level, direct computation of the CIs. As you can see from the figure it produces, the results are nearly the same. The only differences are due to a different randomized resampling of the data between methods.

% Create random data from a normal distribution  
% with mean 28.25 and sd 8.5.  
data = (randn(1,100000)*8.5 + 28.25)'; 
% Run bootci (the percentile method is chosen since that's how the
% CI is computed in the lower-level method below).
nBoot = 2000; %number of bootstraps
[bci,bmeans] = bootci(nBoot,{@mean,data},'alpha',.1,'type','per'); % 90% confidence interval
% Compute bootstrap sample mean
bmu = mean(bmeans); 
% Now repeat that process with lower-level bootstrapping
% using the same sampling procedure and the same data.
bootMeans = nan(1,nBoot); 
for i = 1:nBoot
    bootMeans(i) = mean(data(randi(numel(data),size(data)))); 
end
CI = prctile(bootMeans,[5,95]); 
mu = mean(bootMeans); 
% Plot the bootci() results
figure()
ax1 = subplot(2,1,1);
histogram(bmeans); 
hold on
xline(bmu, 'k-', sprintf('mu = %.2f',bmu),'LineWidth',2,'FontSize',12)
xline(bci(1),'k-',sprintf('%.1f',bci(1)),'LineWidth',2,'FontSize',12)
xline(bci(2),'k-',sprintf('%.1f',bci(2)),'LineWidth',2,'FontSize',12)
title('bootci()')
% plot the lower-level, direct computation results
ax2 = subplot(2,1,2);
histogram(bootMeans); 
hold on
xline(mu, 'k-', sprintf('mu = %.2f',mu),'LineWidth',2,'FontSize',12)
xline(CI(1),'k-',sprintf('%.1f',CI(1)),'LineWidth',2,'FontSize',12)
xline(CI(2),'k-',sprintf('%.1f',CI(2)),'LineWidth',2,'FontSize',12)
title('Lower level')
linkaxes([ax1,ax2], 'xy')

"Next we were asked to use the bootstrap technique to estimate the 90% confidence interval for the probability that the mean of Pb exceeds the MCL (i.e., 50ppm)."

I'm not sure I follow this part. What is MCL? Is it a scalar value (like 50) or is it the mean of a second distribution?

MATLAB: Reproducible results using “bootci” matlab function

If you insert

reset(s)

before the second call to bootci, you should find that the two calls yield the same answer.
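
The definition of s is not shown here. A minimal sketch, assuming s is a RandStream object that is handed to bootci through the 'Options' argument (the data, the seed, and the variable names below are illustrative):

```matlab
data = randn(100,1);                     % illustrative data (assumed)
s = RandStream('mt19937ar','Seed',1);    % hypothetical stream standing in for s
opts = statset('Streams',s);             % ask bootci to draw its resamples from s

ci1 = bootci(1000,{@mean,data},'Options',opts);
reset(s)                                 % rewind s to its initial state
ci2 = bootci(1000,{@mean,data},'Options',opts);

same = isequal(ci1,ci2);                 % expected to be true when both calls draw from s
```

Without the reset, the second call continues from wherever the first call left the stream, so the resamples (and hence the interval) differ.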
