Solved – Comparing odds ratios

odds-ratio, statistical-significance

I have data for two different sets of patients, and I divided each set into two subgroups (males and females, let's say). For each patient set, I compute the odds ratio of receiving a treatment for males versus females. Is there a way to compare the odds ratios of the two patient sets and say that one is significantly higher than the other, even though the odds ratios correspond to two entirely different sets of people?

I am new to this, so any recommendation of resources that address this problem would be appreciated. Thanks.

Best Answer

It is possible to compute two types of odds ratios in your scenario:

  1. Unadjusted odds ratios of receiving a treatment;
  2. Adjusted odds ratios of receiving a treatment.

For the first type, you could fit a binary logistic regression to your data, which includes the following variables:

Outcome variable: Binary variable, indicating whether the patient received the treatment (1) or did not receive the treatment (0);

Predictor variable: Binary variable, indicating whether the patient is Male (1) or Female (0).

The exponentiated regression coefficient of the predictor variable sex in this model will represent the odds ratio of receiving a treatment for males versus females. If this exponentiated regression coefficient is estimated from the data to be 1.2, say, then you would conclude that the odds of receiving a treatment are 20% higher for males compared to females, since (1.2-1)x100% = 20%. If it comes out to be 0.8, say, then you would conclude that the odds of receiving a treatment are 20% lower for males compared to females, since (0.8 - 1)x100% = -20%.
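As a minimal sketch of the arithmetic above: when sex is the only predictor, the exponentiated coefficient from the logistic regression equals the odds ratio computed directly from the 2x2 table of sex by treatment, so it can be checked by hand. The function names and the counts below are hypothetical and chosen only for illustration.

```python
def odds_ratio(treated_m, untreated_m, treated_f, untreated_f):
    """Odds ratio of receiving treatment for males versus females (2x2 table)."""
    odds_males = treated_m / untreated_m
    odds_females = treated_f / untreated_f
    return odds_males / odds_females


def percent_change_in_odds(or_value):
    """Convert an odds ratio into a percent change in odds: (OR - 1) x 100%."""
    return (or_value - 1.0) * 100.0


# Hypothetical counts: 60 of 100 males treated, 50 of 100 females treated.
or_males_vs_females = odds_ratio(60, 40, 50, 50)
print(or_males_vs_females, percent_change_in_odds(or_males_vs_females))

# The two cases discussed in the text: roughly +20% and -20%.
print(percent_change_in_odds(1.2), percent_change_in_odds(0.8))
```

All four cell counts must be non-zero for this calculation to work.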

For the second type, you will need to expand your binary logistic regression model to include one or more predictor variables (e.g., age, cholesterol level) in addition to the sex variable. The exponentiated regression coefficient of the predictor variable sex in this model will represent the odds ratio of receiving a treatment for males versus females, adjusted for age and cholesterol level. If this exponentiated regression coefficient is estimated from the data to be 1.2, say, then you would conclude that, among patients having the same age and the same cholesterol level, the odds of receiving a treatment are 20% higher for males compared to females, since (1.2-1)x100% = 20%. If it comes out to be 0.8, say, then you would conclude that, among patients having the same cholesterol level and same age, the odds of receiving a treatment are 20% lower for males compared to females, since (0.8 - 1)x100% = -20%.

Adjusting for additional characteristics in the binary logistic regression model allows you to compare male and female patients who are alike with respect to those characteristics, thus reducing some of the "noise" that may affect your males versus females comparison of the odds of receiving a treatment.

Edit:

My explanation above refers to a single set of patients. If you are interested in computing the odds ratio of receiving a treatment separately for each set of patients, you will need to expand your model (as already suggested in other answers here) so that it includes a Group (or Patient Set) variable in addition to a Sex variable. You can define this variable to be 0 for your first set of patients and 1 for your second set. The model will be applied to your combined data for patients in both sets, so you don't really need your Step 1. Your Step 2 model for the combined data is in effect fitting four different models simultaneously. You can represent all 4 models at once via the following equation:

log odds of receiving a treatment = beta0 + beta1*Sex + beta2*PatientSet + beta3*Sex*PatientSet

where the odds of receiving a treatment are p/(1-p), the log odds are log(p/(1-p)), and p is the conditional probability of receiving a treatment given the patient's sex and patient set.
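To keep the three scales apart, here is a small sketch of the conversions between probability, odds, and log odds. The helper names are hypothetical and not part of any library.

```python
import math


def odds(p):
    """Odds corresponding to a probability p: p / (1 - p)."""
    return p / (1.0 - p)


def log_odds(p):
    """Log odds (logit) of a probability p: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def prob_from_log_odds(x):
    """Invert the logit: recover the probability from a log odds value."""
    return 1.0 / (1.0 + math.exp(-x))


p = 0.75
print(odds(p))                          # odds of 3 to 1
print(log_odds(p))                      # the scale the model is linear on
print(prob_from_log_odds(log_odds(p)))  # round trip back to about 0.75
```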

The above model can be simplified as follows for each combination of sex and patient set:

Females in Patient Set #1:

log odds of receiving a treatment = beta0  or, equivalently, 

    odds of receiving a treatment = exp(beta0)

Males in Patient Set #1:

log odds of receiving a treatment = beta0 + beta1 or, equivalently,  

    odds of receiving a treatment = exp(beta0 + beta1)

Females in Patient Set #2:

 log odds of receiving a treatment = beta0 + beta2 or, equivalently, 

     odds of receiving a treatment = exp(beta0 + beta2)

Males in Patient Set #2:

  log odds of receiving a treatment = beta0 + beta1 + beta2 + beta3 or, equivalently,

      odds of receiving a treatment = exp(beta0 + beta1 + beta2 + beta3)
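The four cases above can be reproduced numerically. This is only an illustrative sketch: the coefficient values are made up, and the coding (Sex = 1 for males, PatientSet = 1 for the second set) follows the definitions given earlier in this answer.

```python
import math


def odds_of_treatment(sex, patient_set, beta0, beta1, beta2, beta3):
    """Odds of treatment implied by the interaction model.

    sex: 1 for male, 0 for female; patient_set: 0 for set #1, 1 for set #2.
    """
    linear_predictor = (beta0 + beta1 * sex + beta2 * patient_set
                        + beta3 * sex * patient_set)
    return math.exp(linear_predictor)


# Hypothetical coefficients, chosen only for illustration.
betas = dict(beta0=-0.5, beta1=0.2, beta2=0.4, beta3=0.3)

for patient_set in (0, 1):
    for sex in (0, 1):
        label = ("Males" if sex else "Females") + f" in Patient Set #{patient_set + 1}"
        print(label, round(odds_of_treatment(sex, patient_set, **betas), 4))
```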

If you follow the math, you can determine that, among patients in your first set, the odds of receiving the treatment for males are obtained by multiplying the odds of receiving the treatment for females by the factor below, which is the odds ratio for males versus females in that set:

f1 = exp(beta0 + beta1)/exp(beta0) = exp(beta1)

Similarly, among patients in your second set, the odds of receiving the treatment for males are obtained by multiplying the odds of receiving the treatment for females by the factor below, which is the odds ratio for males versus females in that set:

f2 = exp(beta0 + beta1 + beta2 + beta3)/exp(beta0 + beta2) = exp(beta1 + beta3)

Dividing the two multiplicative factors will yield the quantity you are interested in:

f2/f1 = exp(beta1 + beta3)/exp(beta1) = exp(beta3)

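In other words, exp(beta3), the exponentiated coefficient of the interaction term in the binary logistic regression model for the combined data, is the ratio of the two odds ratios: the odds ratio for males versus females in the second patient set divided by the same odds ratio in the first patient set. Testing whether beta3 differs from 0 (equivalently, whether exp(beta3) differs from 1) therefore tests whether the two odds ratios differ significantly, even though they come from two different sets of people.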
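When sex and patient set are the only predictors, this comparison can be sketched by hand from the two 2x2 tables, without fitting a model in software. The sketch below assumes a large-sample Wald test using Woolf's variance for each log odds ratio, which I believe matches the usual test of the interaction coefficient in this saturated model. The function names and counts are hypothetical, and every cell count must be non-zero.

```python
import math


def log_odds_ratio(treated_m, untreated_m, treated_f, untreated_f):
    """Log odds ratio (males vs. females) and its Woolf variance estimate."""
    log_or = math.log((treated_m * untreated_f) / (untreated_m * treated_f))
    variance = 1 / treated_m + 1 / untreated_m + 1 / treated_f + 1 / untreated_f
    return log_or, variance


def compare_odds_ratios(table_set1, table_set2):
    """Wald test of whether the odds ratios of two independent sets differ."""
    log_or1, var1 = log_odds_ratio(*table_set1)
    log_or2, var2 = log_odds_ratio(*table_set2)
    beta3 = log_or2 - log_or1           # estimate of the interaction term
    se = math.sqrt(var1 + var2)         # the two sets are independent
    z = beta3 / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    ci95 = (math.exp(beta3 - 1.96 * se), math.exp(beta3 + 1.96 * se))
    return {"ratio_of_odds_ratios": math.exp(beta3), "z": z,
            "p_value": p_value, "ci95": ci95}


# Hypothetical counts: (treated males, untreated males,
#                       treated females, untreated females)
set1 = (60, 40, 50, 50)
set2 = (80, 20, 50, 50)
result = compare_odds_ratios(set1, set2)
print(result)
```

A small p-value, or a 95% interval for exp(beta3) that excludes 1, would suggest that the male-versus-female odds ratio differs between the two patient sets. With additional covariates such as age, this shortcut no longer applies and the interaction model would need to be fitted in statistical software.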
