This doesn't exactly answer your question, but it might still solve your problem of needing to calculate risk ratios. The epiR package allows you to calculate risk ratios.
I could not get your example to work (see my comment to your question), so here is an example from the package's documentation:
library(epiR) # Used for Risk ratio
library(MASS) # Used for data
dat1 <- birthwt; head(dat1)
## Generate a table of cell frequencies. First set the levels of the outcome
## and the exposure so the frequencies in the 2 by 2 table come out in the
## conventional format:
dat1$low <- factor(dat1$low, levels = c(1,0))
dat1$smoke <- factor(dat1$smoke, levels = c(1,0))
dat1$race <- factor(dat1$race, levels = c(1,2,3))
## Generate the 2 by 2 table. Exposure (rows) = smoke. Outcome (columns) = low.
tab1 <- table(dat1$smoke, dat1$low, dnn = c("Smoke", "Low BW"))
print(tab1)
## Compute the incidence risk ratio and other measures of association:
epi.2by2(dat = tab1, method = "cohort.count",
conf.level = 0.95, units = 100, outcome = "as.columns")
In these models, the linear predictor is a latent variable, with estimated thresholds $t_i$ that mark the transitions between levels of the ordered categorical response. The plots you show in the question are the smooth contributions of the four variables to the linear predictor, thresholds along which demarcate the categories.
The figure below illustrates this for a linear predictor comprised of a smooth function of a single continuous variable for a four category response.
The "effect" of body size on the linear predictor is smooth as shown by the solid black line and the grey confidence interval. By definition in the ocat
family, the first threshold, $t_1$ is always at -1
, which in the figure is the boundary between least concern and vulnerable. Two additional thresholds are estimated for the boundaries between the further categories.
The summary()
method will print out the thresholds (-1
plus the other estimated ones). For the example you quoted this is:
> summary(b)
Family: Ordered Categorical(-1,0.07,5.15)
Link function: identity
Formula:
y ~ s(x0) + s(x1) + s(x2) + s(x3)
Parametric coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.1221 0.1319 0.926 0.354
Approximate significance of smooth terms:
edf Ref.df Chi.sq p-value
s(x0) 3.317 4.116 21.623 0.000263 ***
s(x1) 3.115 3.871 188.368 < 2e-16 ***
s(x2) 7.814 8.616 402.300 < 2e-16 ***
s(x3) 1.593 1.970 0.936 0.640434
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Deviance explained = 57.7%
-REML = 283.82 Scale est. = 1 n = 400
or these can be extracted via
> b$family$getTheta(TRUE) ## the estimated cut points
[1] -1.00000000 0.07295739 5.14663505
Looking at your lower-left of the four smoothers from the example, we would interpret this as showing that for $x_2$ as $x_2$ increases from low to medium values we would, conditional upon the values of the other covariates tend to see an increase in the probability that an observation is from one of the categories above the baseline one. But this effect is diminished at higher values of $x_2$. For $x_1$, we see a roughly linear effect of increased probability of belonging to higher order categories as the values of $x_1$ increases, with the effect being more rapid for $x_1 \geq \sim 0.5$.
I have a more complete worked example in some course materials that I prepared with David Miller and Eric Pedersen, which you can find here. You can find the course website/slides/data on github here with the raw files here.
The figure above was prepared by Eric for those workshop materials.
Best Answer
No, you don't need to do a further transformation to the response scale as you have already specified
type = "response"
in your call topredict()
.Note that you can't use the standard errors returned by adding
se.fit = TRUE
to yourpredict()
call if you continue to usetype = "response"
because there's no easy way to do it once everything is transformed back to the response scale. If you do want to form confidence/credible intervals, usetype = "link", se.fit = TRUE
and then create the confidence/credible interval and then manually useexp()
to back transform the fitted/predicted values and the interval to the response scale.Or you could use my {gratia} package and its
fitted_values()
function which takes care of this for you and withscale = "response"
would give you an interval on the response scale that has been created properly (following the brief steps I outlined above).