MATLAB: Bootstrap standard errors for nonlinear least squares regression

Dear All,
I am interested in obtaining the variance covariance matrix for my parameters – x (15 by 1) – which are the solution to the following nonlinear least squares minimization problem:
My data consist of 18026 rows and 15 columns:
X = [X1,X2,X3,X4,X5,X6,X7,X8,X9,X10,X11,X12,X13,X14];
y = eret2;
I need to estimate 15 parameters. My function is:
function F = myfun(x,y,X)
F = y - (1 + x(1)*(x(2)/0.0025662))*X(:,1) - (x(2)/x(3))*sqrt(1-x(1)^2)*X(:,2) ...
    - x(4)*X(:,3) - x(5)*X(:,4) - x(6)*X(:,5) - x(7)*X(:,6) - x(8)*X(:,7) ...
    - x(9)*X(:,8) - x(10)*X(:,9) - x(11)*X(:,10) - x(12)*X(:,11) ...
    - x(13)*X(:,12) - x(14)*X(:,13) - x(15)*X(:,14);
end
Constraints are:
lb = [-1,0,0,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1];
ub = [1,0.6,0.01,1,1,1,1,1,1,1,1,1,1,1,1];
Starting values are:
x0 = [0.2, 0.6, 0.10, 0.86,-0.05,-0.026,0.031,-0.017,0.002,-0.002,-0.003,0.003,0.025,0.013,-0.009];
I use the following statements where I submit 756 user defined starting values (sv = 1260 by 15)
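problem = createOptimProblem('lsqnonlin','objective',@(x) myfun(x, y, X),'x0',x0,'lb',lb,'ub',ub);
ms = MultiStart;                     % solver object; not shown in the original post, default options assumed
tpoints = CustomStartPointSet(sv);   % user-defined start points, one per row of sv
[x,f] = run(ms, problem, tpoints);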
I am able to obtain parameter estimates! Thanks to all the previous help on this group.
But now I need to obtain the variance-covariance matrix for the parameters (15 by 15).
To do so, I would like to use bootstrap methodology:
Step 1: Draw a random sample of 1000 observations from [y,X] and define this sub-matrix as [y_1,X_1]
Step 2: Estimate the nonlinear least squares problem using myfun on [y_1, X_1]
Step 3: Store the coefficients from Step 2 in a 15 by 1 matrix.
Step 4: Repeat steps 1,2, and 3, 1000 times.
Step 5: Compute standard errors as the standard deviation of the distribution of coefficient estimates.
I am aware of bootci in MATLAB, but it is not clear to me how I can use it in the above context.
best,
Srinivasan

Best Answer

The procedure you describe is not exactly using bootstrap methodology, at least as I understand it. In your Step 1 you say you will draw 1000 observations, but bootstrap methodology would say you should draw 18026 (with replacement), since that is how many rows you have in your original data set.
It matters whether you draw 1000 or 18026, because the parameter estimates you get from 1000 rows will vary more than the parameter estimates from 18026 rows. Thus, the standard errors that you estimate with your 1000-row procedure will be larger than is appropriate for estimating the standard errors of estimates based on 18026 rows.
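To put a rough number on this: under the usual assumption that standard errors shrink like 1/sqrt(n), estimates from 1000 rows should have standard errors about sqrt(18026/1000), or roughly 4.2 times, larger than estimates from 18026 rows. The sketch below illustrates the effect with a sample mean on synthetic data rather than with your model; the variable names and the choice of 500 replicates are just for illustration.

```matlab
rng(0);                          % reproducible illustration
n = 18026;                       % rows in the original data set
m = 1000;                        % subsample size proposed in the question
z = randn(n, 1);                 % synthetic data standing in for [y, X] (assumption)
B = 500;                         % number of bootstrap replicates (illustrative)
est_n = zeros(B, 1);
est_m = zeros(B, 1);
for b = 1:B
    est_n(b) = mean(z(randi(n, n, 1)));   % resample n rows with replacement
    est_m(b) = mean(z(randi(n, m, 1)));   % resample only m rows with replacement
end
se_n = std(est_n);               % bootstrap SE from full-size resamples
se_m = std(est_m);               % bootstrap SE from 1000-row resamples
ratio = se_m / se_n;             % should land near sqrt(n/m)
fprintf('SE ratio = %.2f, sqrt(n/m) = %.2f\n', ratio, sqrt(n/m));
```

If you do keep 1000-row draws for speed, one option is to rescale the resulting standard errors by sqrt(1000/18026), but that leans on the same 1/sqrt(n) assumption.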
I think you should be able to use bootci if you really want bootstrap samples with 18026 rows. You will have to write a bootfun that accepts X and y as arguments and returns a vector of the estimates for those X and y values (the function will specify your constraints, the starting values, etc., and run the problem). You will give bootci a function handle to your bootfun along with the original X and y values, and it should do the rest. It might take a long time, though.
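Here is a minimal sketch of what such a bootfun could look like. Since you want a covariance matrix rather than confidence intervals, it uses bootstrp (the companion of bootci in the Statistics and Machine Learning Toolbox), which returns the raw bootstrap estimates. To keep it runnable on its own, it uses a small synthetic two-parameter problem; resid, opts, V and se are names made up for this example, and for your case you would substitute myfun, lb, ub and x0 from your post.

```matlab
rng(1);                                   % reproducible illustration

% Small synthetic stand-in for the data in the question (assumption):
% two parameters instead of 15, 500 rows instead of 18026.
n = 500;
X = randn(n, 2);
y = X * [0.5; -0.3] + 0.1 * randn(n, 1);

% Residual function, bounds and start point. For the real problem these
% would be myfun, lb, ub and x0 from the question.
resid = @(x, y, X) y - X * x(:);
lb = [-1, -1];
ub = [ 1,  1];
x0 = [ 0,  0];
opts = optimoptions('lsqnonlin', 'Display', 'off');

% bootfun: refit on one resampled (y, X) pair and return the estimates
% as a row vector, so each replicate becomes one row of bootstat.
bootfun = @(yb, Xb) reshape( ...
    lsqnonlin(@(x) resid(x, yb, Xb), x0, lb, ub, opts), 1, []);

nboot = 200;                              % number of bootstrap replicates
bootstat = bootstrp(nboot, bootfun, y, X);   % rows of y and X are resampled together

V  = cov(bootstat);                       % bootstrap variance-covariance matrix
se = std(bootstat);                       % bootstrap standard errors
```

With your 15 parameters, V would be 15 by 15. Every replicate runs a full fit, so rerunning MultiStart with all start points inside bootfun would be very slow; starting each refit from the full-sample estimate is one possible shortcut, at the risk of missing a different minimum. If you do want intervals from bootci, note that its default interval type is, as far as I know, BCa, which adds a jackknife pass over all rows; passing 'type','per' for percentile intervals avoids that.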