Solved – Interpretation Interaction in Cox Regression

cox-model, interaction, regression

I am estimating a discrete choice model using Cox regression in SPSS.

I would like to include an interaction between two continuous predictor variables.

The main effect of one of the interacting variables is not displayed in the output (degrees of freedom reduced because of constant or linearly dependent covariates). Can I still interpret the interaction term? (The interaction is significant.)

I assume the variable is omitted because it is constant per stratum.

The overall question is how consumers choose between different options within a choice set of 3 items.

One sub-question is whether price has an effect on the choice probability of an item. I also predicted that this effect is moderated by the price range of the given choice set (i.e. the difference between the lowest and the highest price within the choice set). Therefore, within one choice situation for one participant (one stratum), the variable “price_range” is constant. Since participants received different choice sets, this variable varies across strata.
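A short sketch of why a stratum-constant variable drops out, assuming the stratified Cox model is acting as a conditional logit with exactly one chosen item per stratum. The symbols $\beta_1, \beta_2, \beta_3$ and the indices $i$, $j$, $s$ are notation introduced here for illustration, not SPSS output. The probability that item $i$ is chosen in stratum $s$ would be:

```latex
P(i \mid s)
  = \frac{\exp\!\left(\beta_1\,\mathrm{price}_{is} + \beta_2\,\mathrm{range}_{s}
          + \beta_3\,\mathrm{price}_{is}\,\mathrm{range}_{s}\right)}
         {\sum_{j \in s} \exp\!\left(\beta_1\,\mathrm{price}_{js} + \beta_2\,\mathrm{range}_{s}
          + \beta_3\,\mathrm{price}_{js}\,\mathrm{range}_{s}\right)}
  = \frac{\exp\!\left((\beta_1 + \beta_3\,\mathrm{range}_{s})\,\mathrm{price}_{is}\right)}
         {\sum_{j \in s} \exp\!\left((\beta_1 + \beta_3\,\mathrm{range}_{s})\,\mathrm{price}_{js}\right)}
```

The factor $\exp(\beta_2\,\mathrm{range}_s)$ appears in the numerator and in every term of the denominator, so it cancels. Under this assumption $\beta_2$ cannot be estimated from within-stratum comparisons, while $\beta_3$ still can, because the product of price and range differs between items in the same stratum.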

Thank you


The setup of my research was that participants were confronted with different kinds of choice sets. Each choice set consisted of 3 bottles. Participants were asked to pick one of the 3 bottles shown to them. In total I had 12 different choice sets that varied in the size and price of the bottles. Each participant was randomly presented with 2 of these choice sets.
Now I want to estimate the influence of the attributes of the choice options (size and price) on their choice probability. For that reason I chose Cox regression, because this is (as far as I understand) the only way to do this kind of conditional logit in SPSS.

Below you find an extract showing how the data is set up. I set it up in long format to be able to run the Cox regression. The example shows 1 respondent making choices in 2 independent choice sets. So the stratum variable basically only groups each choice set presented to a participant. As you can see, the range is constant within one choice set. I am not interested in the effect of the range on choice probability, only in the interaction effect of price_range and price.

[Image: extract of the long-format data set]
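For readers without access to the image, here is a minimal Python sketch of a long-format layout like the one described, one row per bottle shown. The column names follow the post, but every value is invented for illustration, and the helper functions `constant_within_strata` and `choice_probabilities` are hypothetical names. The sketch checks that price_range is constant within each stratum and that, in a conditional logit, a main effect for such a variable leaves the choice probabilities unchanged.

```python
import math
from collections import defaultdict

# Hypothetical long-format data: one row per bottle shown in a choice set.
# Column names follow the post; all values are invented for illustration.
rows = [
    {"stratum": 1, "price": 2.0, "price_range": 2.0, "chosen": 1},
    {"stratum": 1, "price": 3.0, "price_range": 2.0, "chosen": 0},
    {"stratum": 1, "price": 4.0, "price_range": 2.0, "chosen": 0},
    {"stratum": 2, "price": 1.5, "price_range": 1.0, "chosen": 0},
    {"stratum": 2, "price": 2.0, "price_range": 1.0, "chosen": 1},
    {"stratum": 2, "price": 2.5, "price_range": 1.0, "chosen": 0},
]


def constant_within_strata(rows, column):
    """Return True if `column` takes a single value inside every stratum."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["stratum"]].add(row[column])
    return all(len(values) == 1 for values in seen.values())


def choice_probabilities(rows, b_price, b_range, b_inter):
    """Conditional logit choice probabilities, normalised within each stratum."""
    weights = [
        math.exp(
            b_price * r["price"]
            + b_range * r["price_range"]
            + b_inter * r["price"] * r["price_range"]
        )
        for r in rows
    ]
    totals = defaultdict(float)
    for r, w in zip(rows, weights):
        totals[r["stratum"]] += w
    return [w / totals[r["stratum"]] for r, w in zip(rows, weights)]


# price_range is constant per stratum, price is not.
print(constant_within_strata(rows, "price_range"))
print(constant_within_strata(rows, "price"))

# Hypothetical coefficients: only the main effect of price_range differs.
without_main = choice_probabilities(rows, -0.5, 0.0, 0.1)
with_main = choice_probabilities(rows, -0.5, 3.0, 0.1)
print(all(math.isclose(a, b) for a, b in zip(without_main, with_main)))
```

The last comparison changes only the main-effect coefficient of price_range. If the probabilities agree up to rounding, that coefficient cannot be estimated from within-stratum choices, which is consistent with SPSS reporting a constant or linearly dependent covariate.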

Best Answer

This does not seem like a Cox regression problem. The Cox model is used to examine influences of variables on the time it takes for an event to happen. The time variable in your data seems to be 1 for the choice that was made and 2 for the choices that weren't made in each presentation/trial. It's not clear to me how Cox analysis with such a "time" variable would accomplish your goal, although if you have some reference to how it does so I would be glad to learn from it.

What you have is a set of trials involving a forced choice of 1 among 3 objects. Typically this would be analyzed as a multinomial logistic regression, and SPSS does have tools for that type of analysis. This examines how the probability of making a particular choice depends on predictor variables (in your case, price, price range, their interaction, and size) at each trial.

Complicating matters in this design is that over the entire study there were multiple SKUs involved but only 3 were available in each trial. So the probability of choosing an SKU that wasn't presented is 0. Evidently the size of each SKU was fixed but the price was varied among trials. This type of design gets a bit beyond my personal expertise, but I will propose one way to proceed.

To analyze this as a multinomial regression, which seems most appropriate, you will have to include an additional predictor variable that indicates whether or not each SKU was available in a particular trial. That way, at least formally, all the SKUs are included in the model for each trial. Then you proceed as follows, with one data line for each trial:

The output variable for each presentation is the SKU that was chosen.

The price, price_range, some type of interaction term between them, and the size are included as predictor variables. It's not clear that including the Respondent ID will help much, as each Respondent only saw 2 of the 12 different types of presentation, but include that if you think it is important (it may be difficult to interpret, however).

A set of variables indicating whether a particular SKU was available for choice at that trial is added as predictors. The "stratum" per se is then no longer needed as a predictor.
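The one-line-per-trial layout described above can be sketched in Python as follows. This is only an illustration of the data reshaping, not a fitted model. The SKU labels, prices, and choices are invented, and `to_trial_rows` and the `avail_` prefix are hypothetical names. The price range is computed as the highest minus the lowest price shown in the trial, as defined in the question.

```python
# Hypothetical SKU labels, prices, and choices; none come from the original data.
ALL_SKUS = ["A", "B", "C", "D"]

long_rows = [
    {"trial": 1, "sku": "A", "price": 2.0, "chosen": 0},
    {"trial": 1, "sku": "B", "price": 3.0, "chosen": 1},
    {"trial": 1, "sku": "C", "price": 4.0, "chosen": 0},
    {"trial": 2, "sku": "B", "price": 1.5, "chosen": 0},
    {"trial": 2, "sku": "C", "price": 2.0, "chosen": 0},
    {"trial": 2, "sku": "D", "price": 2.5, "chosen": 1},
]


def to_trial_rows(long_rows, all_skus):
    """Collapse long-format rows into one record per trial."""
    grouped = {}
    for row in long_rows:
        grouped.setdefault(row["trial"], []).append(row)

    trial_rows = []
    for trial in sorted(grouped):
        items = grouped[trial]
        shown = {item["sku"] for item in items}
        prices = [item["price"] for item in items]
        record = {
            "trial": trial,
            # Outcome variable: the SKU that was picked in this trial.
            "chosen_sku": next(item["sku"] for item in items if item["chosen"]),
            # Trial-level predictor: highest minus lowest price shown.
            "price_range": max(prices) - min(prices),
        }
        # Availability indicators: 1 if the SKU was shown, 0 otherwise.
        for sku in all_skus:
            record[f"avail_{sku}"] = int(sku in shown)
        trial_rows.append(record)
    return trial_rows


trial_rows = to_trial_rows(long_rows, ALL_SKUS)
for record in trial_rows:
    print(record)
```

How the item-level price should enter such a one-row-per-trial table is left open here, because the steps above do not pin it down.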

You should not just ignore the price_range variable as a possible predictor. It's hard to interpret interaction terms without also knowing the main effects.

There are a few dangers here whose importance may be affected by the details of your design. One is that although you have prices in numbers you only have a limited set of prices and price ranges, so it might be difficult to interpret your data directly in terms of change of odds per change in price. This may be a particular issue with your interaction term. A second is that the particular combinations of sizes, prices, and price ranges you used might have some internal relations that then pose problems like those that arose when your price_range and strata ended up being just two ways of presenting the same variable.
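To make the point about odds per change in price concrete, here is a small Python sketch. It assumes a logit-type model whose linear predictor contains price and the product of price and price_range. The coefficient values and the function name `price_odds_ratio` are hypothetical, chosen only to show how the interaction makes the price effect depend on the range.

```python
import math


def price_odds_ratio(b_price, b_interaction, price_range, delta=1.0):
    """Odds ratio for a `delta` increase in price at a given price_range.

    With an interaction term, the log-odds slope of price is
    b_price + b_interaction * price_range, so it changes with price_range.
    """
    return math.exp((b_price + b_interaction * price_range) * delta)


# Hypothetical coefficients, not estimates from the study.
B_PRICE = -0.8
B_INTERACTION = 0.2

for price_range in (1.0, 2.0, 4.0):
    ratio = price_odds_ratio(B_PRICE, B_INTERACTION, price_range)
    print(price_range, round(ratio, 3))
```

With only a handful of distinct prices and price ranges in the design, such a slope is estimated from few support points, so any reading of it between or beyond the observed values should be treated with caution.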

If this answer doesn't help, you might want to pose a new question based more directly on your experimental design, such as "multinomial regression with different choices among trials." If you do pose a new question, present, if possible, the choices available in each of your 12 "strata".