Q: "how do I interpret the exp(coef) values in the interaction pairs"
A: You don't. You use predict(). And you don't separately examine the p-values, either. The separate p-values are pretty much meaningless. You use anova()-type analyses to compare nested models, ideally with some method to penalize the process so you are not susceptible to multiple comparisons artifacts.
(Note: it is possible to compare some of the coefficients. The single-level "main-effects" fishsize coefficients in an interaction model would be the expected log(hazard ratios) for the other fishsize levels compared against the baseline level, but only at the reference value of virus. Similarly for the "main-effects" virus coefficients.)
What's plotted starts with a variance-weighted transformation of the Schoenfeld residuals for a covariate, into what are called "scaled Schoenfeld residuals." Those are then added to the corresponding time-invariant coefficient estimate from the Cox model under the proportional hazards (PH) assumption and smoothed. The result is a plot of an estimate of the regression coefficient for the covariate over time. If the plot is reasonably flat, the PH assumption holds. Take this one step at a time.
Risk-weighted covariate averages and covariances
You start by determining, for each event time, the risk-weighted averages of covariate values and the corresponding risk-weighted covariance among covariate values over all individuals at risk at that time. That's essentially a part of the model-fitting process, anyway. The risks used for the weighting are simply the corresponding hazard ratios for the individuals at that time, the exponentiated linear-predictor values from the model.
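A minimal numeric sketch of that weighting step, using numpy with made-up covariate values (this is an illustration of the formulas, not the survival package's internal code):

```python
import numpy as np

# Hypothetical at-risk set at one event time: 5 subjects, 2 covariates.
X = np.array([[1.2, 0.0],
              [0.5, 1.0],
              [2.0, 0.0],
              [0.8, 1.0],
              [1.5, 0.0]])
beta_hat = np.array([0.4, -0.3])   # time-fixed Cox coefficient estimates (made up)

# Risk weights: exponentiated linear predictors (hazard ratios) of those at risk,
# normalized to sum to 1.
w = np.exp(X @ beta_hat)
w = w / w.sum()

# Risk-weighted covariate averages and risk-weighted covariance matrix.
xbar = w @ X
D = X - xbar
V = (D * w[:, None]).T @ D
```

The covariance matrix `V` is what later gets inverted to scale the Schoenfeld residuals at this event time.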
Schoenfeld residuals
The Schoenfeld residuals are calculated for all covariates for each individual experiencing an event at a given time. Those are the differences between that individual's covariate values at the event time and the corresponding risk-weighted average of covariate values among all those then at risk. The word "residual" thus makes sense, as it's the difference between an observed covariate value and what you might have expected based on all those at risk at that time.
Scaled Schoenfeld residuals
The Schoenfeld residuals are then scaled inversely with respect to their (co)variances. The scaled values at an event time for an individual come from pre-multiplying the vector of original Schoenfeld residuals by the inverse of the corresponding risk-weighted covariate covariance matrix at that time. You can think of this as down-weighting Schoenfeld residuals whose values are uncertain because of high variance.
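A small sketch of one residual and its scaling, with made-up numbers (again an illustration of the idea, not the survival package's implementation):

```python
import numpy as np

# Subject with the event at this time, and the risk-weighted average and
# covariance over the at-risk set (all values hypothetical).
x_event = np.array([1.8, 1.0])
xbar    = np.array([1.25, 0.35])
V = np.array([[0.30, 0.05],
              [0.05, 0.20]])

s = x_event - xbar                 # ordinary Schoenfeld residual vector
s_scaled = np.linalg.solve(V, s)   # scaled residual: V^{-1} s
```

Directions in covariate space with large risk-weighted variance get shrunk by the `V^{-1}` multiplication, which is the down-weighting of uncertain residuals described above.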
The "plot of scaled Schoenfeld residuals"
Although the plot you show is generally called a "plot of scaled Schoenfeld residuals," that's not quite right.
The importance of the scaled Schoenfeld residuals comes from their associations with the time-dependence of a Cox regression coefficient. If $s_{k,j}^*$ is a scaled Schoenfeld residual for covariate $j$ at time $t_k$ and the estimated time-fixed Cox regression coefficient under PH is $\hat \beta_j$, then the expected value of $s_{k,j}^*$ is approximately the deviation of the actual coefficient value at time $t_k$, $\beta_j(t_k)$, from the PH-based estimate:
$$E(s_{k,j}^*) + \hat \beta_j \approx \beta_j(t_k).$$
That was shown by Grambsch and Therneau in 1994. The y-axis values of the plot for covariate $j$ are the sums of the scaled Schoenfeld residuals with the corresponding PH estimate $\hat \beta_j$.
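In code, the plotted y-values for one covariate are just the scaled residuals at the event times shifted by the time-fixed estimate (made-up numbers, for illustration):

```python
import numpy as np

# Scaled Schoenfeld residuals for covariate j at successive event times
# and the time-fixed PH estimate (all values hypothetical).
s_scaled_j = np.array([0.15, -0.08, 0.30, -0.12])
beta_hat_j = 0.45

# What gets smoothed and plotted against time: an estimate of beta_j(t_k).
y = s_scaled_j + beta_hat_j
```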
Simple answer to the question
The smoothed plot is thus an estimate of the time dependence of the coefficient for the covariate $j$, $\beta_j(t_k)$. In your case, the plot indicates that your biomarker is most strongly associated with outcome at early times, dropping off to almost no association beyond a time value of 50-60.
The above is pretty much based on Therneau and Grambsch. Section 6.2 presents the plotting of scaled Schoenfeld residuals, with ordinary Schoenfeld residuals described in Section 4.6 and the formulas for risk-weighted covariate means and covariances in Section 3.1 (equations 3.5 and 3.7, respectively).
Best Answer
First, it's important to distinguish between the Schoenfeld residuals themselves and the score tests for trends of residuals over time that are used (starting with version 3 of the survival package) to evaluate deviations from proportional hazards (PH). Second, with respect to those score tests, it might be simpler to think about the tests on individual coefficients or on groups of coefficients (multiple-level categorical predictors, spline coefficients, etc.) as special cases of the global test that's effectively done "under the hood" by cox.zph whether you ask for it or not. Third, the multiple residuals associated with a single predictor in the model can be combined with their associated regression-coefficient estimates to obtain the corresponding net residual in the linear predictor of the model.
Schoenfeld residuals
This page explains how the Schoenfeld residuals are calculated.
If there are multiple coefficients associated with a predictor in a model, as in the situations you describe, there are multiple "covariates" in the above sense associated with that predictor, each with its own estimated regression coefficient.
The residuals are scaled inversely with respect to their covariances to get scaled Schoenfeld residuals $s^*_{k,j}$ for covariate $j$ at event time $t_k$. Grambsch and Therneau showed that the expected value of that residual is the difference between the time-fixed Cox-model coefficient estimate $\hat\beta_j$ and a potentially time-varying coefficient value at that event time:
$$E(s_{k,j}^*) + \hat \beta_j \approx \beta_j(t_k).$$
If PH holds, $\beta_j$ is constant over time. In that case, there should be no trend of those residuals over time.
Score tests for trends
Section 6.2 of the Therneau and Grambsch text explains how to test whether there is evidence of time trends in coefficients that would be inconsistent with PH. First, define some transformation of time $g(t)$; that's the transform argument to cox.zph(). Then, for covariate $j$, evaluate the following regression of the residual-based $\beta_j(t)$ against that function of time:
$$\beta_j(t) = \beta_j + \theta_j(g(t)-\bar g_j)$$
where $\bar g_j$ is the mean of the transformed event times. Then, for the set of $p$ covariates, test the joint null hypothesis that all of the slopes $\theta_j = 0$.
As Therneau and Grambsch explain, this isn't done with a regression model per se but rather with a set of calculations that leads to a multi-parameter score test of that joint hypothesis. The test statistic, evaluated against $\chi^2_p$, is based on the $p \times 1$ score vector $U$ and the $p \times p$ Fisher information matrix $I$ evaluated at the above null hypothesis: the quadratic form $U^T I^{-1} U$. That's the global test.
Once you have $U$ and $I$ you can perform tests on subsets of covariates or on individual covariates, by restricting the calculation of the test statistic to the corresponding elements of $U$ and $I$ and evaluating against $\chi^2$ with degrees of freedom equal to the number of coefficients involved. In that sense, the test reported for a multi-coefficient predictor is a special case of the global test based on the subset of coefficients associated with it, and that for an individual coefficient for such a predictor is a special case of the test on the subset.
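A numeric sketch of the quadratic form and a subset test, assuming $U$ and $I$ have already been computed elsewhere (the values below are made up for illustration and are not output from cox.zph):

```python
import numpy as np

# Hypothetical score vector and Fisher information for p = 3 covariates,
# evaluated at the null hypothesis that all theta_j = 0.
U = np.array([0.8, -1.1, 0.3])
I = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.2, 1.0]])

# Global test: U^T I^{-1} U, referred to chi-square on p = 3 df.
global_stat = U @ np.linalg.solve(I, U)

# Subset test for covariates 0 and 1 (say, two dummy coefficients of one
# categorical predictor): restrict U and I to those entries, chi-square on 2 df.
idx = [0, 1]
U_sub = U[idx]
I_sub = I[np.ix_(idx, idx)]
subset_stat = U_sub @ np.linalg.solve(I_sub, U_sub)
```

The single-coefficient test reported by cox.zph for one covariate is the same construction with `idx` restricted to that one entry, on 1 df.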
Combined scaled Schoenfeld residuals
The (unscaled) Schoenfeld residuals are first calculated for each individual event time and covariate. When terms=TRUE, the set of residuals for each event time for a multi-coefficient predictor is then combined as the inner product of the corresponding residuals and Cox model regression coefficients. That puts together all the residuals associated with that predictor, for that individual at that event time, into their net effect on the overall linear predictor of the Cox model. I've annotated below the critical code, which you can see by typing cox.zph at the command prompt.
That combination is done before additional calculations, including the scaling of the residuals, so you can't readily reproduce it by doing a similar calculation on the reported scaled residuals.