Solved – Why is power analysis with logistic regression so liberal compared to chi squared

logistic, r, statistical-power

I'm conducting a power analysis to derive the required sample size for a study that compares exposed and non-exposed subjects, with 30-day mortality as the outcome. I'll compare crude mortality rates with a chi-square test, but I'll also use logistic regression to adjust for probable confounders.

When I run a power analysis (power 0.8, significance level 0.05, effect size 0.15, and an estimated 10 confounders), I get that I'd need only n=117, which seems quite small.
Comparing with chi-square, it suggests that I'd need 350.

I'm using R and pwr:

library(pwr)
pwr.f2.test(u=10, v=NULL, f2=0.15, sig.level=0.05, power=0.8)     # solves for v, the denominator df
pwr.chisq.test(w=0.15, N=NULL, df=1 , sig.level=0.05, power=0.8 ) # solves for N, the total sample size

Is this expected, or am I misusing these functions?

Best Answer

When exposure is the only predictor, the two tests (logistic regression and chi-square) are essentially equivalent, so a power analysis should give essentially the same answer for both.

You are assuming that a value of 0.15 for f2 and a value of 0.15 for w represent the same effect size, but they don't. By Cohen's conventions, a small value of w is 0.1, while a small value of f2 is 0.02.

cohen.ES(test=c("chisq"), size=c("small"))
cohen.ES(test=c("f2"),    size=c("small"))
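To make the mismatch concrete, here is a rough sketch that puts w on the f2 scale and then solves for the sample size both ways. It rests on assumptions that are not in the question: a 2x2 table (so w equals the phi coefficient, which is a correlation), a single predictor (u = 1) instead of 10, and a linear model as a stand-in for logistic regression, which is what f2 describes anyway. The helper names `power_chisq` and `power_f2` are made up for this illustration; they use base R's noncentral distributions rather than pwr.

```r
# Assumption: 2x2 table, so w is the phi coefficient and a linear model
# of outcome on exposure has R^2 = phi^2.
w  <- 0.15
r2 <- w^2
f2 <- r2 / (1 - r2)   # Cohen's f2 = R^2 / (1 - R^2)

alpha  <- 0.05
target <- 0.80

# Power of the 1-df chi-square test at total sample size n
# (noncentrality parameter n * w^2)
power_chisq <- function(n) {
  crit <- qchisq(1 - alpha, df = 1)
  pchisq(crit, df = 1, ncp = n * w^2, lower.tail = FALSE)
}

# Power of the F test with u predictors at total sample size n
# (denominator df v = n - u - 1, noncentrality f2 * (u + v + 1))
power_f2 <- function(n, u = 1) {
  v <- n - u - 1
  crit <- qf(1 - alpha, u, v)
  pf(crit, u, v, ncp = f2 * (u + v + 1), lower.tail = FALSE)
}

# Solve for the sample size that reaches the target power
n_chisq <- uniroot(function(n) power_chisq(n) - target, c(10, 5000))$root
n_f2    <- uniroot(function(n) power_f2(n) - target, c(10, 5000))$root

round(c(f2 = f2, n_chisq = n_chisq, n_f2 = n_f2), 3)
```

Under these assumptions, w = 0.15 corresponds to an f2 of only about 0.023, far below the 0.15 used in the question, and the two required sample sizes should land close to each other once the effect sizes are on a comparable footing.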

Edit: Elaborated on the similarity of the two approaches.

If you give the same data to logistic regression and a chi-square test (strictly: without Yates' correction), you get the same result. Here's an example:

> set.seed(1234)
> x <- rbinom(100, 1, 0.2) 
> y <- rbinom(100, 1, 0.2) 
> chisq.test(table(x, y), correct=FALSE)

    Pearson's Chi-squared test

data:  table(x, y)
X-squared = 0.155, df = 1, p-value = 0.694

Warning message:
In chisq.test(table(x, y), correct = FALSE) :
  Chi-squared approximation may be incorrect
> summary(glm(y ~ x, family="binomial"))

Call:
glm(formula = y ~ x, family = "binomial")

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-0.753  -0.753  -0.753  -0.668   1.794  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -1.114      0.251   -4.43  9.4e-06 ***
x             -0.272      0.693   -0.39     0.69    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 110.22  on 99  degrees of freedom
Residual deviance: 110.06  on 98  degrees of freedom
AIC: 114.1

Number of Fisher Scoring iterations: 4

The p-values are essentially the same (0.694 from the chi-square test and 0.69 for x in the logistic regression), so the power should be the same. I can't remember the formulas for the two different versions of the effect size. Effect size measures are a little weird because in the old days you wanted to minimize the number of tables that you put into books, so we have, for example, $f^2$ instead of $R^2$, even though there's a direct relationship between them ($f^2 = R^2 / (1 - R^2)$) and $R^2$ is what everyone understands.
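The agreement between the two tests can be checked a little more directly. `summary()` on a glm reports a Wald test, which is why its p-value only matches the chi-square p-value approximately. The uncorrected Pearson statistic should instead line up with the score (Rao) test of the same coefficient, which `anova()` can report for a glm via `test = "Rao"`. A minimal sketch on simulated data follows; the seed, sample size, and probabilities are arbitrary choices for illustration, not values from the example above.

```r
set.seed(42)   # arbitrary seed, only for reproducibility
n <- 200       # arbitrary sample size, large enough to avoid sparse cells
x <- rbinom(n, 1, 0.3)
y <- rbinom(n, 1, 0.3)

# Pearson chi-square test on the 2x2 table, without Yates' correction
pearson <- chisq.test(table(x, y), correct = FALSE)

# Logistic regression of y on x, with a score (Rao) test for the x term
fit   <- glm(y ~ x, family = binomial)
score <- anova(fit, test = "Rao")

x2_pearson <- unname(pearson$statistic)
x2_score   <- score["x", "Rao"]

# Test statistics and p-values side by side
c(pearson = x2_pearson, score = x2_score)
c(p_pearson = pearson$p.value, p_score = score["x", "Pr(>Chi)"])
```

If the two statistics agree to several decimals, as they should, then any remaining gap between the chi-square p-value and the one printed by `summary()` comes from the Wald approximation rather than from a real difference between the two analyses.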
