Solved – Backward Stepwise Selection

regression

I need to implement backward stepwise regression. I read the relevant chapter of "The Elements of Statistical Learning", but the explanation there is too brief for me:

Backward-stepwise selection starts with the full model, and sequentially deletes the predictor that has the least impact on the fit. The candidate for dropping is the variable with the smallest Z-score.

This is taken from Chapter 3.3.2, p. 59.
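
For reference, the Z-score that section refers to (eq. 3.12 of the book, if I read it right) is $z_j = \hat{\beta}_j / (\hat{\sigma}\sqrt{v_j})$, where $v_j$ is the $j$th diagonal element of $(\mathbf{X}^T\mathbf{X})^{-1}$ and $\hat{\sigma}^2 = \frac{1}{N-p-1}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2$, so it seems to be just the usual t-statistic for $\hat{\beta}_j$.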

What is the "full model" mentioned here? What do I delete at each step according to the Z-score? I need to know how this algorithm works step by step.

EDIT: I tried to implement the code in MATLAB/Octave. I don't know whether I'm totally wrong or not :/ Please comment on this function.

function [] = BackwardStepWise(X, y, N, p)  % X is an N-by-p matrix, y is an N-by-1 vector

% vector holding the indices of the predictors still in the model
v = 1:p;

% start from the full model (all p predictors) and drop one predictor per pass
for k = p:-1:2

    % build the design matrix from an intercept plus the surviving columns
    T = [ones(N,1) X(:, v)];

    % evaluate beta by least squares
    beta = (T' * T) \ (T' * y);

    % estimate sigma^2 for the current model (k predictors plus intercept)
    sigmahat_sq = (y - T*beta)' * (y - T*beta) / (N - k - 1);
    TT = inv(T' * T);

    % find the predictor with the smallest absolute Z-score (skip the intercept, entry 1)
    Zmin = Inf;
    Zminindex = -1;
    for i = 2:length(beta)
        z = beta(i) / sqrt(sigmahat_sq * TT(i,i));
        if (abs(z) < Zmin)
            Zmin = abs(z);
            Zminindex = i - 1;   % position inside v (offset by the intercept column)
        endif
    endfor

    % drop the column with the smallest absolute Z-score (edit the v vector)
    printf("dropping predictor %d (|z| = %.3f)\n", v(Zminindex), Zmin);
    v(Zminindex) = [];

endfor

endfunction
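
To test it I simply call BackwardStepWise(X, y, size(X,1), size(X,2)) on some simulated X and y and watch which predictor index gets dropped at each pass.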

Best Answer

  1. Set up an exit criterion for the p-value. Any independent variable with a p-value higher than this criterion will be removed. There is no golden rule for what to use; for exploratory purposes you may see variables with p > 0.2 being removed, while others may use 0.05, etc.
  2. Set up your full model. Generally, this is the model that contains all the independent variables from which you wish to select the predictive subset.
  3. Fit the full model.
  4. Check the p-values (or t-statistics). If all p-values are below the exit criterion, you have the final model. If any of them exceeds the exit criterion, the variable with the highest p-value (that is, the smallest absolute t-statistic) is removed. This corresponds to your "dropping the variable with the smallest Z-score."
  5. Fit the model with the remaining independent variables and repeat steps 4 and 5 until either no independent variable is left or none has a p-value above the exit criterion. (A rough Octave sketch of this loop follows the list.)
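
To make the loop concrete, here is a rough sketch in Octave (chosen to match the code in the question, not Stata's internals); the function name backward_eliminate and the argument pexit are just illustrative, and it assumes tcdf is available (e.g. from Octave's statistics package):

function keep = backward_eliminate(X, y, pexit)
  % X: N-by-p predictor matrix, y: N-by-1 response, pexit: exit criterion (e.g. 0.2)
  [N, p] = size(X);
  keep = 1:p;                          % indices of predictors still in the model
  while (!isempty(keep))
    T = [ones(N,1) X(:, keep)];        % steps 2-3: fit the current model
    beta = (T' * T) \ (T' * y);
    k = numel(keep);
    resid = y - T * beta;
    sigma2 = (resid' * resid) / (N - k - 1);
    se = sqrt(sigma2 * diag(inv(T' * T)));
    t = beta ./ se;                    % t-statistics for the current fit
    pval = 2 * (1 - tcdf(abs(t(2:end)), N - k - 1));  % two-sided p-values, intercept skipped
    [pmax, imax] = max(pval);          % step 4: the worst remaining predictor
    if (pmax < pexit)
      break;                           % everything passes the exit criterion: final model
    endif
    printf("p = %.4f >= %.4f  removing predictor %d\n", pmax, pexit, keep(imax));
    keep(imax) = [];                   % step 5: drop it and refit
  endwhile
endfunction

Calling keep = backward_eliminate(X, y, 0.2) returns the column indices of X that survive, in the same spirit as the Stata log below.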

Here is an example using Stata:

. sysuse auto
. stepwise, pr(.2): reg mpg weight turn headroom foreign price
                      begin with full model
p = 0.9238 >= 0.2000  removing price
p = 0.7047 >= 0.2000  removing headroom
p = 0.2045 >= 0.2000  removing turn

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  2,    71) =   69.75
       Model |   1619.2877     2  809.643849           Prob > F      =  0.0000
    Residual |  824.171761    71   11.608053           R-squared     =  0.6627
-------------+------------------------------           Adj R-squared =  0.6532
       Total |  2443.45946    73  33.4720474           Root MSE      =  3.4071

------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |  -.0065879   .0006371   -10.34   0.000    -.0078583   -.0053175
     foreign |  -1.650029   1.075994    -1.53   0.130      -3.7955    .4954422
       _cons |    41.6797   2.165547    19.25   0.000     37.36172    45.99768
------------------------------------------------------------------------------

The independent variables are weight, turn, headroom, foreign, and price; the dependent variable is mpg. The exit criterion is p > 0.2. The first variable removed was price (p = 0.9238), followed by headroom (p = 0.7047) and turn (p = 0.2045). The remaining variables have p < 0.2, so they stay.