I saw a comment here by @gavin-simpson on "y-axis values in plot(gam)". I don't understand why trans = exp is used instead of trans = plogis. How do you decide which one to use?
The code I am using
library(mgcv)
b <- gam(outcome ~ s(week, k = 4, fx = TRUE, by = food) + food, data = df1, family = betar(link="logit"), method = "REML")
summary(b)
My dependent variable is a proportion between 0 and 1. Week is a time measure, and food is a categorical variable.
When I use trans = plogis, I get the plot below:
plot(b, pages = 1, trans = plogis, shift = coef(b)[1])
How should I interpret this plot with trans = plogis?
When I use trans = exp, I get the plot below:
plot(b, pages = 1, trans = exp, shift = coef(b)[1])
My question is: why am I getting values larger than 1 when I use trans = exp?
I am new to GAMs and still learning; any guidance is appreciated.
Best Answer
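In the linked question they were talking about negative binomial models, which by default use $\log$ as the link function, because they model counts, just like Poisson models. Beta models deal with quantities such as proportions and probabilities, which need a link function that respects the limits at 0 and 1, the default being $\text{logit}$. The function passed to trans should be the inverse of the model's link: exp for a $\log$ link, plogis for a $\text{logit}$ link. Your model uses the $\text{logit}$ link, so plogis is the appropriate choice; exp is unbounded above, which is why it gives you values larger than 1.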
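The difference between the two back-transformations can be seen in a minimal base R sketch. The values in eta are made up for illustration and are not taken from the fitted model b.

```r
# Hypothetical values on the link (linear predictor) scale
eta <- c(-3, -1, 0, 1, 3)

# Inverse of the log link: always positive, but unbounded above
mu_exp <- exp(eta)

# Inverse of the logit link: always strictly between 0 and 1
mu_plogis <- plogis(eta)

# Side-by-side comparison of the two back-transformations
print(round(cbind(eta, exp = mu_exp, plogis = mu_plogis), 3))
```

Any positive value on the link scale is mapped above 1 by exp, whereas plogis keeps every value inside the unit interval.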
The default link function for your family of models in R can be found with help, e.g. help(nb) or help(betar). Other link functions can be specified, but this is advanced. The inverse link function is generally assumed to be easy to find.
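If looking up the inverse by hand feels error-prone, one option is to read it off a link object. The sketch below uses make.link from base R's stats package; the eta values are hypothetical, and the closing comment about family(b) assumes the fitted model stores its family object, as glm and gam fits normally do.

```r
# Link objects from base R's stats package
logit_link <- make.link("logit")
log_link   <- make.link("log")

# Hypothetical values on the link scale
eta <- c(-2, 0, 2)

# The inverse logit stored in the link object agrees with plogis()
logit_matches <- isTRUE(all.equal(logit_link$linkinv(eta), plogis(eta)))

# The inverse log stored in the link object agrees with exp()
log_matches <- isTRUE(all.equal(log_link$linkinv(eta), exp(eta)))

print(c(logit = logit_matches, log = log_matches))

# For a fitted model such as b, family(b)$linkinv should return the
# matching inverse link (assumption: the fit stores its family object).
```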
That plogis is R's name for $\text{logit}^{-1}$ is a little weird, but since it rises strictly monotonically from 0 to 1 it works as a CDF, and so R implemented it as one (the CDF of the logistic distribution). Probit models pull the same trick in the other direction: https://en.wikipedia.org/wiki/Probit_model
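The naming follows R's usual p/q convention for distributions, as this short base R sketch suggests: qlogis (the quantile function) is the logit and plogis (the CDF) is its inverse, while for probit the pair is qnorm and pnorm. The proportions in p are arbitrary illustration values.

```r
# Arbitrary proportions for illustration
p <- c(0.1, 0.5, 0.9)

# qlogis() is the logit, i.e. log(p / (1 - p))
eta_logit <- qlogis(p)
logit_ok <- isTRUE(all.equal(eta_logit, log(p / (1 - p))))

# plogis() undoes qlogis(), so it is the inverse logit
inv_logit_ok <- isTRUE(all.equal(plogis(eta_logit), p))

# Probit: qnorm() is the link, pnorm() (the normal CDF) is its inverse
eta_probit <- qnorm(p)
inv_probit_ok <- isTRUE(all.equal(pnorm(eta_probit), p))

print(c(logit_ok, inv_logit_ok, inv_probit_ok))
```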