Solved – What’s the substitute for MSE in GAMs

binary data, generalized-additive-model, logistic, mgcv, mse

In the case of linear regression, the mean squared error (MSE) is defined as:

$$
\frac{1}{n-k-1}\sum_{i=1}^{n}{(y_i-\hat{y}_i)^2},
$$

where $n$ is the number of observations, $k$ is the number of independent variables, $y_i$ and $\hat{y}_i$ are the $i$th observed value of the response variable and its prediction, respectively.
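For concreteness, here is a minimal R sketch of that definition on simulated data. The data, the variable names and the choice of $k = 2$ predictors are made up for illustration only.

```r
# Simulated data with k = 2 independent variables (illustrative only)
set.seed(1)
n <- 100
k <- 2
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = dat)

# MSE as defined above: residual sum of squares over n - k - 1
mse <- sum((dat$y - fitted(fit))^2) / (n - k - 1)

# lm's residual standard error squared is the same quantity
print(c(by_hand = mse, from_lm = sigma(fit)^2))
```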

Now, what is the equivalent of MSE in a generalized additive model (GAM), especially a binary GAM (that is, one with a binary response variable)?

Edit: Please also indicate the corresponding option in the mgcv package.

Best Answer

Typically, the model deviance or the Pearson statistic is substituted for the residual sum of squares (RSS). Because criteria based on the Pearson statistic can under-smooth heavily, deviance-based variants are preferred (Wood, 2017, p. 261).

The unbiased risk estimator (UBRE) is one way to estimate MSE in GAMs with a known scale parameter $\phi$. For a binary GAM the UBRE is (using the notation of Wood, 2017)

$$\mathcal{V}_a(\boldsymbol{\lambda}) = D(\hat{\beta}) + 2 \gamma \phi \tau$$

where $D(\hat{\beta})$ is the deviance of the model at the parameter estimates, $\tau$ is the model's effective degrees of freedom, $\phi \equiv 1$ for the binomial distribution, and $\boldsymbol{\lambda}$ is the vector of smoothing parameters. The factor $\gamma$ is usually 1 but can be increased to penalise degrees of freedom more heavily (1.4 is commonly used in the smoothing literature, and 1.5 has special justification from the viewpoint of double cross validation). Hence $\mathcal{V}_a(\boldsymbol{\lambda})$ is the UBRE at the current values of the smoothing parameters.

The $D(\hat{\beta})$ component replaces the $\sum_{i=1}^n(y_i - \hat{y}_i)^2$ part of your equation.
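As a rough sketch, $\mathcal{V}_a$ can be computed by hand from a fitted mgcv model. The simulated data and the object names (`dat`, `m`, `gamma_pen`) are assumptions for illustration; `deviance()` and the `edf` component are part of a fitted `gam` object.

```r
library(mgcv)

# Simulated binary response with a smooth effect of x (illustrative only)
set.seed(42)
n <- 500
dat <- data.frame(x = runif(n))
dat$y <- rbinom(n, 1, plogis(2 * sin(2 * pi * dat$x)))

m <- gam(y ~ s(x), family = binomial, data = dat, method = "GCV.Cp")

D <- deviance(m)      # D(beta-hat): model deviance
tau <- sum(m$edf)     # tau: effective degrees of freedom
gamma_pen <- 1        # gamma: extra penalty on degrees of freedom
phi <- 1              # phi: known scale for the binomial family

V_a <- D + 2 * gamma_pen * phi * tau
print(c(deviance = D, edf = tau, V_a = V_a))

# mgcv stores its own minimised criterion in m$gcv.ubre. It may be reported
# on a different scale (for example divided by n), so it is printed here
# for reference only and is not expected to equal V_a.
print(m$gcv.ubre)
```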

If $\phi$ is unknown and has to be estimated, then generalised cross validation (GCV) can be used in place of UBRE. The corresponding GCV score is defined as:

$$\mathcal{V}_g(\boldsymbol{\lambda}) = nD(\hat{\beta}) / (n - \gamma \tau)^2$$

where $n$ is the number of observations.
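The GCV score can be computed by hand in the same way. The sketch below assumes a quasibinomial family purely so that $\phi$ is treated as unknown; the data and object names are again made up for illustration.

```r
library(mgcv)

# Simulated binary response (illustrative only)
set.seed(42)
n <- 500
dat <- data.frame(x = runif(n))
dat$y <- rbinom(n, 1, plogis(2 * sin(2 * pi * dat$x)))

# quasibinomial: scale parameter is estimated, so "GCV.Cp" should use GCV
m_q <- gam(y ~ s(x), family = quasibinomial, data = dat, method = "GCV.Cp")

D <- deviance(m_q)    # D(beta-hat)
tau <- sum(m_q$edf)   # effective degrees of freedom
gamma_pen <- 1        # gamma

V_g <- n * D / (n - gamma_pen * tau)^2

# Printed side by side with the score mgcv stored; the two should be close
# if mgcv uses the same definition of the GCV score.
print(c(by_hand = V_g, mgcv = unname(m_q$gcv.ubre)))
```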

In mgcv for R, these criteria are used for smoothness selection by the default option method = "GCV.Cp". This option chooses automatically between UBRE and GCV, depending on whether the family implies a known, fixed value of $\phi$ or whether $\phi$ is estimated from the data.

Alternative approaches are available:

  • method = "GACV.Cp" uses a related measure, generalised approximate cross validation (GACV).
  • method = "ML" and method = "REML" treat smoothing as a mixed effects problem: the smooths are converted into fixed and random effect terms, and the model is fitted and its smoothness selected by maximising the likelihood or the restricted likelihood.
  • Others; see ?gam and its argument method.
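To see how the choice of method affects the fit, one option is to refit the same model under each criterion and compare the effective degrees of freedom. This is only a sketch on simulated data; the object names are illustrative.

```r
library(mgcv)

# Simulated binary response (illustrative only)
set.seed(42)
n <- 500
dat <- data.frame(x = runif(n))
dat$y <- rbinom(n, 1, plogis(2 * sin(2 * pi * dat$x)))

methods <- c("GCV.Cp", "GACV.Cp", "ML", "REML")

# Fit the same binary GAM once per smoothness selection method
fits <- lapply(methods, function(meth) {
  gam(y ~ s(x), family = binomial, data = dat, method = meth)
})

# Total effective degrees of freedom selected by each method
edf_by_method <- sapply(fits, function(m) sum(m$edf))
names(edf_by_method) <- methods
print(round(edf_by_method, 2))
```

Larger effective degrees of freedom indicate a wigglier fit, so this gives a quick view of which criterion smooths more or less on a given data set.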

Wood, S. N. (2017). Generalized Additive Models: An Introduction with R (2nd ed.). Chapman and Hall/CRC.
