Data Visualization – Understanding mgcViz::pterm() in GAM Parametric Coefficients

Tags: data-visualization, generalized-additive-model, ggplot2, mgcv, parametric

I built a GAM with two categorical variables and two smooth terms, following this structure:

model <- ik ~ population_id_cat + s_status + s(n_locs, bs = "re") + s(animal_id, bs = "re")

The model summary for the parametric coefficients is:

Parametric coefficients:
                                Estimate Std. Error t value Pr(>|t|)    
(Intercept)                      1.90509    0.07798  24.432  < 2e-16 ***
population_id_catB              -1.08790    0.13199  -8.242 2.27e-16 ***
population_id_catC               0.06718    0.08880   0.757 0.449368    
population_id_catD              -0.27599    0.07689  -3.589 0.000336 ***
population_id_catE              -0.33914    0.13555  -2.502 0.012391 *  
population_id_catF              -0.71104    0.07586  -9.373  < 2e-16 ***
population_id_catG              -0.39243    0.07276  -5.393 7.32e-08 ***
population_id_catH               0.31115    0.18431   1.688 0.091449 .  
population_id_catI               0.06530    0.10203   0.640 0.522174    
s_status_2a_m                   -0.16331    0.05263  -3.103 0.001930 ** 
s_status_2fam                   -0.36656    0.05200  -7.050 2.10e-12 ***

When I plot these coefficients with the mgcViz::pterm() function, the intervals do not seem to match the Std. Error (although their significance does match the p-values provided).

For example, the plot for the parametric coefficients mentioned above:

gviz_model <- getViz(model)
gviz_model_1 <- pterm(gviz_model, 1)
plot(gviz_model_1) + l_ciBar(colour = "blue") + l_fitPoints(colour = "red") + l_rug(alpha = 0.3)

[plot: pterm() output showing the parametric coefficients with blue CI bars and red fitted points]

Question 1:

What values are actually plotted here? They seem to match the "Estimate" column from the summary, but the error bars do not match the "Std. Error" (e.g., look at population H: with an interval of ±1 Std. Error, the whole bar would lie above 0 and the effect would appear significant, yet its p-value is 0.09).

If I instead plot the Estimate ± Std. Error in ggplot (which is my goal, because I want to plot these parametric values for two models in the same plot), I get this:

gviz_model_pop <- termplot(gviz_model, se = TRUE, plot = FALSE)$population_id_cat
ggplot(gviz_model_pop, aes(x = x, y = y)) +
  geom_errorbar(aes(ymin = y - se, ymax = y + se), width = 0.15,
                linewidth = 1, position = position_dodge(0.2)) +
  geom_point(position = position_dodge(0.2), size = 3.5, shape = 16)

[plot: ggplot of Estimate ± 1 Std. Error per category, showing narrower intervals than the pterm() plot]

Similar trends, but the intervals are definitely not the same (and this changes the apparent significance of each category).

Question 2:

If I want to plot these parametric coefficients for two models in the same plot, how can I do this using the output from pterm(), instead of extracting the estimates/Std. Errors as I did previously? Preferably with a ggplot output.

Question 3:

If I want to plot these parametric coefficients including the intercept (to make the plot more interpretable), as mentioned here under "Transformed standard errors (2)", how do I do it? Do I need separate calculations, or is there a function for the parametric coefficients too? Preferably with a ggplot output.

Thank you very much in advance for any help!

Best Answer

Q1

The blue bars are confidence intervals, (most likely) formed as Estimate ± 2 × Std. Error. That is why they are wider than the ±1 Std. Error bars you drew with ggplot.
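You can check this against the summary table directly; a quick sketch, assuming `model` is the fitted GAM from the question:

```r
# Rebuild the intervals that (likely) match the blue bars:
# Estimate +/- 2 * Std. Error from the parametric coefficient table
pt <- summary(model)$p.table
ci <- data.frame(
  term     = rownames(pt),
  estimate = pt[, "Estimate"],
  lower    = pt[, "Estimate"] - 2 * pt[, "Std. Error"],
  upper    = pt[, "Estimate"] + 2 * pt[, "Std. Error"]
)
ci  # compare these limits with the bars drawn by l_ciBar()
```

For population H this gives roughly 0.311 ± 0.369, an interval that crosses zero, consistent with its non-significant p-value and with the pterm() plot.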

Q2

I think this would be easier with parametric_effects() from my {gratia} package; pterm() seems to wrap the model object with some extra information, and that information doesn't appear to be what is computed for the plot.

Using the example from ?pterm, we have

library("mgcv")   # for gamSim() and gam()
library("gratia")

set.seed(3)
dat <- gamSim(1, n = 1500, dist = "normal", scale = 20)
dat$fac <- as.factor(sample(c("A1", "A2", "A3"), nrow(dat), replace = TRUE))
dat$logi <- as.logical(sample(c(TRUE, FALSE), nrow(dat), replace = TRUE))
bs <- "cr"; k <- 12
b <- gam(y ~ x0 + x1 + I(x1^2) + s(x2, bs = bs, k = k) + fac + x3:fac +
           I(x1 * x2) + logi,
         data = dat)

p <- parametric_effects(b)

Which gives

> p <- parametric_effects(b)                                                  
Interaction terms are not currently supported.
> p                                                                           
# A tibble: 6,005 × 6
   term  type       value level partial    se
   <chr> <chr>   <I<dbl>> <fct>   <dbl> <dbl>
 1 x0    numeric    0.168 NA     -0.310 0.309
 2 x0    numeric    0.808 NA     -1.49  1.48 
 3 x0    numeric    0.385 NA     -0.711 0.708
 4 x0    numeric    0.328 NA     -0.605 0.602
 5 x0    numeric    0.602 NA     -1.11  1.11 
 6 x0    numeric    0.604 NA     -1.12  1.11 
 7 x0    numeric    0.125 NA     -0.230 0.229
 8 x0    numeric    0.295 NA     -0.544 0.541
 9 x0    numeric    0.578 NA     -1.07  1.06 
10 x0    numeric    0.631 NA     -1.17  1.16 
# ℹ 5,995 more rows
# ℹ Use `print(n = ...)` to see more rows

The confidence interval is easy to add:

p <- p |>
  dplyr::mutate(lower_ci = partial - (1.96 * se), upper_ci = partial + (1.96 * se))

I would also append a model column to indicate which model this is:

p <- p |>
  dplyr::mutate(model = "Model 1")

If you repeat that for the other model, to create p2 say, you should be able to bind the two objects together with

p_both <- dplyr::bind_rows(p, p2)

You can also select which terms you want to extract the info for using the terms argument to parametric_effects().
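A sketch of the plotting step, assuming `p_both` was built as above with the `lower_ci`, `upper_ci`, and `model` columns added; for the example model the factor term is `fac` (for your models you would filter on `population_id_cat` instead):

```r
library("ggplot2")

# Factor terms appear once per observation in parametric_effects() output,
# so collapse to the distinct levels before plotting
p_both |>
  dplyr::filter(term == "fac") |>
  dplyr::distinct(model, term, level, partial, lower_ci, upper_ci) |>
  ggplot(aes(x = level, y = partial, colour = model)) +
  geom_pointrange(aes(ymin = lower_ci, ymax = upper_ci),
                  position = position_dodge(width = 0.3))
```

position_dodge() keeps the two models' estimates from overplotting at each factor level.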

Q3

The best way to do this is to predict from the model, including only the constant term and the parametric term of your choice. In the example from ?pterm that I used above, there is a factor variable fac, which we'll use to illustrate the point.

# get a data slice where only `fac` varies,
# holding all other variables at representative values
ds <- data_slice(b, fac = evenly(fac)) |>
  dplyr::mutate(logi = as.logical(logi)) # <-- work around bug

# use fitted_values() to predict from the model using only the effects
# of `fac` and the constant term
fv <- fitted_values(b, data = ds, terms = c("(Intercept)", "fac"))
fv

There's currently a bug in data_slice(): it seems to convert logical variables to numeric. We work around that by coercing them back to logical. We only need this to keep predict.gam() happy; no variable other than fac is used in the predictions we compute in fv.

Having done the above fv is:

> fv                                                                          
# A tibble: 3 × 10
  fac      x0    x1    x2 logi     x3 fitted    se  lower upper
  <fct> <dbl> <dbl> <dbl> <lgl> <dbl>  <dbl> <dbl>  <dbl> <dbl>
1 A1    0.511 0.521 0.493 FALSE 0.476   5.75  2.55  0.753 10.8 
2 A2    0.511 0.521 0.493 FALSE 0.476   4.60  2.54 -0.367  9.57
3 A3    0.511 0.521 0.493 FALSE 0.476   2.94  2.63 -2.21   8.09

Here, as fac has only three levels, we get the three values needed for the plot. The fitted column now has the intercept added to each of the parametric effects, hence A1, being the reference level, has the same value as the intercept:

> coef(b)[1]                                                                  
(Intercept) 
   5.753834

while the other values in fitted are (Intercept) plus the respective coefficient for the other levels. The lower and upper columns contain the confidence interval as it would be plotted in the output from mgcViz. Note that fitted_values() returns, by default, predicted values on the response scale. If you have a non-Gaussian family, you can keep the values on the link scale by passing scale = "link" in the call to fitted_values().
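These fitted values and intervals drop straight into ggplot; a minimal sketch using the fv object from above:

```r
library("ggplot2")

# Intercept-included parametric effects with their confidence intervals
ggplot(fv, aes(x = fac, y = fitted)) +
  geom_pointrange(aes(ymin = lower, ymax = upper)) +
  labs(x = "fac", y = "Effect of fac (including intercept)")
```

To show two models together, add a model column to each fitted_values() result, dplyr::bind_rows() them, and map colour = model as in Q2.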