I'm trying to estimate power for a logistic regression with a continuous exposure in a cohort study (i.e., the ratio of the sampling probabilities is 1). I have the population cumulative incidence (a probability), the population mean and variability of the exposure, and an expected odds ratio. I also have a total sample size.

I'm using R, and it seems that `Hmisc::bpower` is only for logistic regression with a binary exposure; I can't seem to find any packages that estimate binomial power with a continuous exposure.

I've attempted the following simulation, but it's quite slow given my total sample size, and I'm not sure if it's right:

```
nsim <- 1000                   # number of simulated data sets
n <- 40000                     # total sample size
# intercept is the log-odds of D=1 at xtest = 0; exp(intercept) is an odds,
# which is close to P(D=1 | xtest = 0) only because the outcome is rare
intercept <- log(0.008662265)
beta <- log(1.4)               # exp(beta) = OR for a one-unit change in xtest
p <- numeric(nsim)             # preallocated storage for the p-values
betahat <- numeric(nsim)       # preallocated storage for the slope estimates
for (i in seq_len(nsim)) {
  xtest <- rnorm(n, 1.2, 0.31)            # exposure: length n, mean 1.2, sd 0.31
  linpred <- intercept + xtest * beta     # linear predictor
  prob <- plogis(linpred)                 # inverse logit: exp(linpred) / (1 + exp(linpred))
  ytest <- as.integer(runif(n) < prob)    # outcome is 1 when a Uniform(0,1) draw is below prob
  coefs <- coef(summary(glm(ytest ~ xtest, family = "binomial")))  # fit the logistic regression
  p[i] <- coefs[2, 4]                     # store the p-value for xtest
  betahat[i] <- coefs[2, 1]               # store the unexponentiated betahat
}
mean(p < .05)
#power
exp(mean(betahat))
#sanity check, should equal 1.4--it does
```

Is there anything wrong with this approach?

One concern of mine is that the cumulative incidence (i.e., the probability of the event over the given time period) comes from a population that did not have zero exposure. In fact, it's reasonable to assume that the value I'm using for the intercept actually comes from a population with exposure variability similar to mine. In that case, how would I estimate the unexposed probability, given an odds ratio (and other information that I would find in, say, a published paper), to use in my power calculation?

## Best Answer

I'm skirting around the question about your simulation setup and addressing the wider question of determining sample size in this scenario.

You could turn the question on its head (as I understand it, you have a binary outcome and one continuous predictor) and instead determine the power to detect a difference in the mean of the continuous predictor between the two groups defined by your binary variable. This is very easy to calculate. From a hypothesis-testing perspective, the two analyses test the same null hypothesis of no association and typically give very similar, though not exactly identical, p-values, so power for the two scenarios should be approximately the same.
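As a rough sketch of that calculation, the following uses a normal approximation to the two-group comparison with unequal group sizes. It assumes the exposure is roughly normal with the same SD in cases and non-cases, in which case the difference in mean exposure equals the log odds ratio times the exposure variance. The helper name `power_mean_diff` and the overall event proportion `p_event = 0.013` are illustrative assumptions, not values from the question.

```r
# Approximate power for a two-sided z-test comparing mean exposure between
# cases and non-cases (normal approximation, unequal group sizes).
# Assumptions (hypothetical helper, not from any package):
#   - exposure is roughly normal with the same SD in both groups
#   - under that model, mean difference = log(OR) * sd_x^2
power_mean_diff <- function(n, p_event, or_per_unit, sd_x, alpha = 0.05) {
  n_cases    <- n * p_event                   # expected number of events
  n_controls <- n * (1 - p_event)             # expected number of non-events
  delta      <- log(or_per_unit) * sd_x^2     # difference in mean exposure
  se         <- sd_x * sqrt(1 / n_cases + 1 / n_controls)
  z_crit     <- qnorm(1 - alpha / 2)
  ncp        <- abs(delta) / se
  # two-sided power: probability of landing in either rejection tail
  pnorm(ncp - z_crit) + pnorm(-ncp - z_crit)
}

# n, OR and SD are taken from the question; p_event is an assumed
# overall event proportion used only for illustration.
power_est <- power_mean_diff(n = 40000, p_event = 0.013,
                             or_per_unit = 1.4, sd_x = 0.31)
power_est
```

Comparing this figure with the simulated power is a quick consistency check on both.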

See also some of the discussion under "Choosing between logistic regression and Mann Whitney/t-tests" for details on this.

A more detailed treatment is given in the article on sample size considerations for logistic regression by Hsieh et al. (PubMed: http://www.ncbi.nlm.nih.gov/pubmed/9699234; full paper available at http://personal.health.usf.edu/ywu/logistic.pdf).
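On the intercept concern raised in the question, one possible approach is to solve numerically for the intercept that makes the average predicted probability over the exposure distribution equal to the published cumulative incidence. The sketch below assumes the exposure is normal with the mean and SD from the question and that the published incidence is a marginal probability for such a population; `calibrate_intercept` is a hypothetical helper name.

```r
# Find the intercept b0 such that the marginal risk, averaged over the
# exposure distribution, equals a published cumulative incidence.
# Assumptions: exposure ~ Normal(mean_x, sd_x); names are illustrative.
calibrate_intercept <- function(p_marginal, beta, mean_x, sd_x) {
  marginal_risk <- function(b0) {
    # E[ plogis(b0 + beta * X) ] by numerical integration over X
    integrate(function(x) plogis(b0 + beta * x) * dnorm(x, mean_x, sd_x),
              lower = mean_x - 8 * sd_x, upper = mean_x + 8 * sd_x)$value
  }
  # root of (marginal risk - target) on a wide log-odds interval
  uniroot(function(b0) marginal_risk(b0) - p_marginal,
          interval = c(-20, 5), tol = 1e-10)$root
}

b0 <- calibrate_intercept(p_marginal = 0.008662265, beta = log(1.4),
                          mean_x = 1.2, sd_x = 0.31)
b0          # log-odds of the event at exposure 0
plogis(b0)  # implied probability of the event at exposure 0
```

The resulting `b0` could then replace `log(0.008662265)` as the intercept in the simulation, so that the simulated population reproduces the published incidence at the observed exposure distribution rather than at zero exposure.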