Regression Analysis – Can the McNemar Test be Replaced with Regression?

mcnemar-test, regression

Would it be possible to replace the McNemar test with a regression model? If so, what would the regression formula be, and what type of regression should be used (logistic, or some form of survival model)?

Consider a made-up example with a single matched (paired) study population. In the example, the proportion of males is 50% before the intervention and 40% after it, a drop of 10 percentage points.


Would it be correct to use logistic regression with a logit link for the analysis? The Y variable would be "Sex-Changed", and an intercept-only model would be fitted:

Sex-Changed ~ 1

Best Answer

The following example uses logistic regression to carry out a test similar to McNemar's test on a 2 x 2 table.

Essentially, the dependent variable for the binomial logistic model takes one value if the observation changed in one direction and another value if it changed in the other direction. Concordant observations are ignored, as in McNemar's test.

At the time of writing, I don't know how to extend this model to tables larger than 2 x 2.
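As a rough worked check of why the two approaches test the same thing, write b and c for the two discordant counts and p for the probability that a discordant pair changes in the "success" direction. These symbols are notation introduced here, not part of the original answer. With b = 5 (Yes to No) and c = 17 (No to Yes), as in the table used in this answer:

```latex
% Intercept-only logistic model fitted to the discordant pairs only
\operatorname{logit}(p) = \beta_0,
\qquad
H_0:\ \beta_0 = 0 \iff p = \tfrac{1}{2}

% McNemar's statistic with continuity correction tests the same H_0
\chi^2 = \frac{(|b - c| - 1)^2}{b + c}
       = \frac{(|5 - 17| - 1)^2}{5 + 17}
       = \frac{121}{22}
       = 5.5
```

Both procedures ask whether changes in the two directions are equally likely. They use different test statistics (a Wald z for the regression, a chi-squared or exact binomial statistic for McNemar's test), so the p-values are close but not identical.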

Matrix = as.matrix(read.table(header=TRUE, row.names=1, text="
Before  Yes   No
Yes      9     5
No      17    15
"))

Matrix

mcnemar.test(Matrix)

   ### McNemar's Chi-squared test with continuity correction
   ### McNemar's chi-squared = 5.5, df = 1, p-value = 0.01902

binom.test(17, (17+5))

   ### Exact binomial test
   ### number of successes = 17, number of trials = 22, p-value = 0.0169


### Code to convert matrix to long format adapted from
###  https://rcompanion.org/handbook/H_01.html

Counts = as.data.frame(as.table(Matrix))

colnames(Counts) = c("Before", "After", "Freq")

Long = Counts[rep(row.names(Counts), Counts$Freq), c("Before", "After")]

rownames(Long) = seq_len(nrow(Long))

### Create a new variable with values -1 for change in one direction,
###  0 for no change, and 1 for change in the other direction.

Long$ChangeSign = as.numeric(Long$Before) - as.numeric(Long$After)

Long$ChangeSign = factor(Long$ChangeSign)

Long

### Create a new data frame with only those observations where
###   the result changed.

Long2 = Long[Long$ChangeSign!=0,]

Long2

   ###    Before After ChangeSign
   ### <snip>
   ### 23     No   Yes          1
   ### 24     No   Yes          1
   ### 25     No   Yes          1
   ### 26     No   Yes          1
   ### 27    Yes    No         -1
   ### 28    Yes    No         -1
   ### 29    Yes    No         -1
   ### 30    Yes    No         -1
   ### <snip>

model = glm(ChangeSign ~ 1, data = Long2, family = binomial)

summary(model)

### Coefficients:
###              Estimate Std. Error z value Pr(>|z|)  
### (Intercept)   -1.2238     0.5087  -2.405   0.0162 *
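
The intercept and its standard error can also be reproduced directly from the two discordant counts. The following is a minimal sketch, not part of the original answer. The variable names (`n_no_to_yes`, `n_yes_to_no`, `fit`, and so on) are hypothetical, and the model is fitted on aggregated counts through the two-column `cbind(successes, failures)` response form rather than on the long-format data.

```r
# Discordant counts taken from the 2 x 2 table in the answer.
n_no_to_yes <- 17   # Before = No,  After = Yes (treated as "success" here)
n_yes_to_no <- 5    # Before = Yes, After = No  (treated as "failure" here)

# Intercept-only binomial model on the aggregated discordant counts.
fit <- glm(cbind(n_no_to_yes, n_yes_to_no) ~ 1, family = binomial)

# Hand calculation of the same quantities.
log_odds <- log(n_no_to_yes / n_yes_to_no)            # about 1.2238
std_err  <- sqrt(1 / n_no_to_yes + 1 / n_yes_to_no)   # about 0.5087
wald_z   <- log_odds / std_err
wald_p   <- 2 * pnorm(-abs(wald_z))                   # two-sided Wald p-value

# Back-transform the intercept to the proportion of discordant pairs
# that changed from No to Yes (17 out of 22).
prop <- plogis(coef(fit)[["(Intercept)"]])

summary(fit)$coefficients
c(log_odds = log_odds, std_err = std_err, wald_z = wald_z, wald_p = wald_p)
```

The sign of the intercept only reflects which direction of change is treated as the success outcome. Swapping the two counts flips the sign but leaves the magnitude, the standard error, and the p-value unchanged.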