Generally speaking, you should treat age as a quantitative variable, assuming you have the actual ages and not age brackets. There are several reasons for this. Perhaps most importantly, if you use age as a categorical variable, you typically would need $c-1$ variables to represent the age categories, $c$, in a regression model, and would lose degrees of freedom for each of these categories. This results in less powerful tests. On the other hand, using a single quantitative/numeric variable age
requires only a single variable and a single degree of freedom.
Age
as a quantitative variable contains more information than as a categorical variable. If you were to represent age
as a categorical variable, then you are doing away with the natural ordering of the ages you'd have by leaving it as a quantitative variable. In other words, a model with categorical ages is unable to tell that 70 years old is closer to 80 years old than 5 years old (because 70 comes 10 before 80, but if you modeled age
as a category, there is no information that indicates to your model that category A -- which might represent your first age category -- comes before category C, which might represent another age category). You will get different coefficient estimates for a quantitative age model than a qualitative age model because the models make different assumptions.
There are several other important reasons why it's generally not a good idea to treat your quantitative measures as categories. Frank Harrell has a good list of additional reasons here.
There is no need to treat age as a category to estimate risks associated with a subject's age. You can simply estimate the risk at any given age by multiplying the estimated coefficient for age
by the subject's age (in years) and exponentiating.
Update to your "EDIT 1"
What I don't understand, ..., is why would
I add a quadratic or cubic term to my model? What will that achieve?
First, let me say that I'm glad to see you've decided to use age as a quantitative variable. I think all of us applaud you on that decision.
The reason we've suggested the possibility of trying quadratic or cubic terms to your model is that age may not have a strictly linear relationship to your outcome. Without loss of generality to your exact problem and model, I think it's easier and more instructive to think about this in terms of a simple linear regression model.
Let's assume you had a simple linear regression model in which you regressed weight
on age
. If you included age
as a linear term (e.g. $\beta_{age} \times X_{age}$) in your model, then you are assuming that weight increases steadily as as person ages, no matter how old a person gets (see Linear Model graph below). This may not be the appropriate functional relationship. Instead, in reality, what typically happens is that a person gains weight as they age, but as they enter old age they begin to lose weight. This would not be well modeled as a linear function, because of this drop-off in weight in old-age. Instead, if one included a quadratic term (e.g. $\beta_{age} \times X_{age}^2$) in addition to the linear term, this could better capture the non-linearity and quadratic nature of the relationship between weight and age. Using an $X_{age}^2$ term allows the regression model to predict an increasing weight as one ages up to a point, and then the model will start to predict a decrease in weight as one ages (see Quadratic Model graph below). If you simply included a linear term, your model would not be able to capture this drop-off in weight in old age. Instead, your model will simply continue predicting increases in weight well into old age. The same concept applies to your model.
Besides including quadratic or linear terms in your model, you may also want to explore the use of splines or generalized additive models (GAMs) to model these types of non-linear relationships.
Linear Model:
Quadratic Model:
PS. Keep in mind I just made up these graphs. They don't really represent real weights and ages in real life. For example, in real life someone doesn't only weigh 8 pounds upon reaching 100, but I think you get the general idea.
Best Answer
The hazard ratios are always relative to the reference group, but you need to be careful what the definition of "reference group" is in your model.
Your second model is almost definitely the better way to interpret the data, but if the treatments are not always given in the same sequence, you may want to include an indicator like "prior exposure to treatment 1" or "prior duration of exposure to treatment 1" if there could be carry-over.
In general, there is no reason that predictors in a Cox model must be binary; there are many examples of continuous measures (e.g. drug exposure).