Solved – Multivariate Adaptive Regression Splines (MARS) equation interpretation

Tags: mars, nonlinear regression, r, regression, splines

I was curious about certain aspects of multivariate adaptive regression splines (MARS, or earth in R). Taking the MARS equation from the wiki page:

  ozone = 5.2
          + 0.93 max(0, temp - 58)
          - 0.64 max(0, temp - 68)
          - 0.046 max(0, 234 - ibt)
          - 0.016 max(0, wind - 7) max(0, 200 - vis)

My questions are:

1) How should I interpret the above equation? If I understand it correctly, for temp = 52, ibt = 230, wind = 8 and vis = 190, the equation becomes:

  ozone = 5.2
          + 0.93 max(0, 52 - 58)
          - 0.64 max(0, 52 - 68)
          - 0.046 max(0, 234 - 230)
          - 0.016 max(0, 8 - 7) max(0, 200 - 190)

  ozone = 5.2
          + 0.93 max(0, -6)
          - 0.64 max(0, -16)
          - 0.046 max(0, 4)
          - 0.016 max(0, 1) max(0, 10)

  ozone = 5.2
          + 0
          - 0
          - 0.184
          - 0.16

Is this correct?

2) Before running earth (MARS), do the predictor variables have to be uncorrelated? What is the way to go about it: remove the correlated predictors and then run MARS, or run MARS on all predictors and let it deal with the correlation somehow?

3) In R, how do I extract the final equation of my earth model, in the form shown in question 1?

4) Let's say I run two MARS models: in the first, predictors x1, x2 and x3 are retained in the final model, and in the second, only x1 and x2 are retained. Is there any way to combine the two MARS equations into a single equation, presumably one where the parameters of x1 and x2 are the average of the two models and x3 is kept on its own?

EDIT

To elaborate on (4), let's say I collect some data and develop a MARS model:

  ozone1 = 5.2 + 0.93 max(0, temp - 58) - 0.64 max(0, temp - 68) - 0.046 max(0, 234 - ibt) - 0.016 max(0, wind - 7) max(0, 200 - vis)

I go back and collect another set of data and develop a second model:

  ozone2 = 4.6 + 0.89 max(0, temp - 58) - 0.033 max(0, 234 - ibt) - 0.016 max(0, wind - 7) max(0, 200 - vis)

Can I combine the two equations into a single equation, something like this:

  ozone.fin = (5.2 + 4.6)/2 + ((0.93 + 0.89)/2) max(0, temp - 58) - 0.64 max(0, temp - 68) - ((0.046 + 0.033)/2) max(0, 234 - ibt) - 0.016 max(0, wind - 7) max(0, 200 - vis)

Thanks

Best Answer

1) This is correct. It would be best to interpret the functions locally, as different areas in the domain have different slopes.
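As a quick check of the arithmetic in (1), and of what "interpreting locally" means, here is a minimal Python sketch (Python rather than R only to keep it dependency-free). The function names `hinge`, `ozone` and `temp_slope` are made up for illustration; the coefficients are the ones quoted in the question.

```python
def hinge(x):
    """Hinge function max(0, x), the building block of each MARS term."""
    return max(0.0, x)


def ozone(temp, ibt, wind, vis):
    """Evaluate the MARS equation quoted in the question."""
    return (5.2
            + 0.93 * hinge(temp - 58)
            - 0.64 * hinge(temp - 68)
            - 0.046 * hinge(234 - ibt)
            - 0.016 * hinge(wind - 7) * hinge(200 - vis))


def temp_slope(temp, step=1.0, ibt=230, wind=8, vis=190):
    """Finite-difference slope of ozone with respect to temp,
    holding the other predictors at the values from the question."""
    return (ozone(temp + step, ibt, wind, vis)
            - ozone(temp, ibt, wind, vis)) / step


# Values from the question: temp = 52 is below both temp knots, so only
# the ibt term and the wind*vis interaction contribute.
print(round(ozone(temp=52, ibt=230, wind=8, vis=190), 3))  # 4.856

# The effect of temp changes from region to region:
# below 58, between 58 and 68, and above 68.
print([round(temp_slope(t), 2) for t in (50, 60, 70)])  # [0.0, 0.93, 0.29]
```

So the terms in the question sum to 4.856, and the slope in temp is 0 below 58, 0.93 between 58 and 68, and 0.93 - 0.64 = 0.29 above 68.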

2) MARS has trouble choosing among predictor variables when multicollinearity is present. To improve its ability to deal with multicollinearity, principal components can be used to reduce the dimensionality of the input variables before invoking MARS. However, slightly correlated predictors should not cause problems.
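One way to get a rough feel for whether the predictors are only "slightly" correlated is to screen the pairwise correlations before fitting. The sketch below is a plain-Python illustration, not part of earth: the helper names, the 0.9 cutoff and the toy values are all assumptions made for the example, and a pairwise screen will not catch multicollinearity that involves three or more variables.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_correlated(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    The threshold is an arbitrary illustrative choice, not a rule.
    """
    flagged = []
    for (name_a, xa), (name_b, xb) in combinations(columns.items(), 2):
        r = pearson(xa, xb)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Made-up toy values, purely for illustration (not real ozone data).
toy = {
    "temp": [50, 55, 60, 65, 70, 75],
    "ibt": [100, 112, 119, 131, 140, 152],  # almost linear in temp
    "wind": [8, 3, 9, 4, 7, 5],
}

for name_a, name_b, r in flag_correlated(toy):
    print(name_a, name_b, round(r, 3))
```

With these toy values only the temp/ibt pair is flagged. Pairs like that are candidates for dropping one variable or for the principal-components step mentioned above.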

3) Call summary() on the fitted earth model. It lists the retained terms with their coefficients, from which the equation can be written out.

4) I do not see exactly what you mean there. Can you explain it further?