Regression is equivalent to ANCOVA; you are right about that. So if there is a point to using one, there is a point to using the other. Only the typical format of the output differs.
For a dependent variable that is ordered, a good starting place is ordinal logistic regression.
Whether to use regression with the initial time point as a covariate or to use something like a multilevel model is another question; it has been discussed here in the past.
Time itself and time-dependent variables are not commonly used as predictors in logistic regression. Binary logistic regression has one outcome variable (y/n, 0/1), with all the baseline variable values as predictors along with treatment (0=placebo, 1=treated). Cox PH regression can use the same baseline and treatment variables, but it has two outcome variables: time-to-event (e.g., days) and an event indicator, failure (1=failed, 0=censored).
If multiple records existed in this study (maybe for follow-up clinic visits) for each subject, then Cox PH can be employed using time-dependent variables (such as multiple LDL or SBP values taken each visit), but logistic doesn't allow for such.
In short, logistic is more for prevalence modeling when the outcome is y/n, and there is no time involved. That is, was there e.g. recurrence (y/n) over the entire follow-up period? On the other hand, Cox PH regression is for time-to-event modeling and requires the time-to-event for each patient, and the failure status at the time of the event (e.g. time=200 days, failure=1), withdrawal from the study (e.g. time=50 days, failure=0 since you know they didn't fail when they withdrew), or last known clinic visit (e.g. time=200 days, failure=0) for subjects who never failed.
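To make the difference in outcome coding concrete, here is a minimal sketch in plain Python. The `FollowUp` record and its field names are hypothetical, invented for illustration; the three subjects mirror the examples above (failure at day 200, withdrawal at day 50, last visit at day 200 without failure).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FollowUp:
    """Hypothetical per-subject follow-up summary (names are assumptions)."""
    subject_id: int
    event_day: Optional[int]   # day the event (e.g. recurrence) occurred, None if never observed
    last_contact_day: int      # day of withdrawal or last known clinic visit


def logistic_outcome(rec: FollowUp) -> int:
    """Binary outcome: did the event occur at any point during follow-up?"""
    return 1 if rec.event_day is not None else 0


def cox_outcome(rec: FollowUp) -> Tuple[int, int]:
    """(time, failure) pair: event time with failure=1, else censoring time with failure=0."""
    if rec.event_day is not None:
        return rec.event_day, 1
    return rec.last_contact_day, 0


records = [
    FollowUp(1, event_day=200, last_contact_day=200),   # failed at day 200
    FollowUp(2, event_day=None, last_contact_day=50),   # withdrew at day 50
    FollowUp(3, event_day=None, last_contact_day=200),  # never failed, last visit day 200
]

logistic_y = [logistic_outcome(r) for r in records]
cox_y = [cox_outcome(r) for r in records]
```

Note that subjects 2 and 3 are indistinguishable to the logistic model (both are 0), while the Cox coding keeps the information that one was followed for 50 days and the other for 200.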
If you want to use incidence rates of disease (#new cases/person-years) for sub-populations in a study partitioned by categorical values, i.e., the "density method," then Poisson regression would be used for incidence modeling.
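As a rough illustration of the incidence-rate calculation that such a model is built on, the sketch below computes new cases per person-year within each stratum. The data and the stratum labels ("smoker", "non-smoker") are made up for the example and do not come from any study discussed here.

```python
from collections import defaultdict

# Hypothetical rows: (stratum label, years of follow-up, new case 0/1)
subjects = [
    ("smoker", 2.0, 1),
    ("smoker", 3.0, 0),
    ("smoker", 5.0, 1),
    ("non-smoker", 4.0, 0),
    ("non-smoker", 6.0, 1),
    ("non-smoker", 10.0, 0),
]


def incidence_rates(rows):
    """Return {stratum: new cases / person-years} for each stratum."""
    cases = defaultdict(int)
    person_years = defaultdict(float)
    for stratum, years, is_case in rows:
        cases[stratum] += is_case
        person_years[stratum] += years
    return {s: cases[s] / person_years[s] for s in cases}


rates = incidence_rates(subjects)

# Crude rate ratio between the two strata
rate_ratio = rates["smoker"] / rates["non-smoker"]
```

With a single categorical predictor and log(person-years) as the offset, a Poisson regression estimates the log of this crude rate ratio; the regression becomes useful once you want to adjust for further covariates.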
In clinical trials, however, it's commonly assumed that withdrawals are failures, so you assign failure=1 and set time to the number of days from consent until they withdrew.
For longitudinal modeling with logistic regression, it's possible that generalized linear models (GLM) or generalized estimating equations (GEE) were used, in which a logistic "link" was employed with clustering on each subject ID. (There is no Cox PH link function for GLM or GEE.) GLM/GEE can accommodate a number of link functions, such as linear (Gaussian), logistic, and Poisson, and can simultaneously use in one model:
- outcome variable (linear link): repeated measurement outcome (LDL at each clinic visit)
- baseline predictors: female(0,1), DM(0,1), history of stroke(0,1), history of CKD(0,1)
- time-dependent predictors: SBP, glucose, etc. during each clinic visit
- treatment predictor: treatment(0-placebo, 1-drug)
- time predictor: time (#days up to each clinic visit)
- time-treatment interaction: time(e.g. days) $\times$ treatment(0,1)
This is called longitudinal modeling, or panel data modeling, which is much more complex than what's taught in grad-level foundations or intermediate biostat courses. So their analysis is either what I described at first, or something much more complex than is suitable from a beginner's perspective. One last point about GLM/GEE: when time and treatment are both in the model, the effect of treatment on the outcome has to be determined using the interaction between time and treatment, i.e., timetrt = time $\times$ treatment, which is a new variable that has to be generated by multiplying time by treatment (0,1). If LDL is the outcome, with repeated measurement values at each clinic visit, the regression coefficient for the interaction term timetrt, $\beta_{timetrt}$, and its p-value will reveal whether or not the slopes of the within-subject LDL values (i.e., the outcome) differed between placebo and treated. In other words, after adjusting for baseline covariates, time-dependent covariates, a main effect of treatment, and a main effect of time, did the treatment result in significantly different slopes for LDL change over time?
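The interaction variable and what its coefficient measures can be sketched in plain Python. Everything here is an assumption for illustration: the visit data are invented, and the slopes are computed by ordinary least squares pooled within each arm, which ignores within-subject clustering and all other covariates. It is not a GEE fit, only a picture of the quantity the interaction coefficient targets.

```python
# Hypothetical long-format rows: (subject_id, treatment, time_days, ldl)
visits = [
    (1, 0, 0, 130.0), (1, 0, 90, 129.0), (1, 0, 180, 128.0),
    (2, 0, 0, 140.0), (2, 0, 90, 139.0), (2, 0, 180, 138.0),
    (3, 1, 0, 135.0), (3, 1, 90, 126.0), (3, 1, 180, 117.0),
    (4, 1, 0, 145.0), (4, 1, 90, 136.0), (4, 1, 180, 127.0),
]

# Generate the interaction variable: timetrt = time * treatment.
# It is 0 for every placebo row and equal to time for every treated row.
long_rows = [
    {"id": sid, "treatment": trt, "time": t, "timetrt": t * trt, "ldl": ldl}
    for sid, trt, t, ldl in visits
]


def ols_slope(xs, ys):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def arm_slope(rows, arm):
    """Pooled LDL-vs-time slope for one treatment arm (0 or 1)."""
    arm_rows = [r for r in rows if r["treatment"] == arm]
    return ols_slope([r["time"] for r in arm_rows], [r["ldl"] for r in arm_rows])


slope_placebo = arm_slope(long_rows, 0)   # LDL change per day on placebo
slope_treated = arm_slope(long_rows, 1)   # LDL change per day on drug
slope_difference = slope_treated - slope_placebo
```

In an ordinary least-squares model containing only time, treatment, and timetrt, the coefficient on timetrt equals `slope_difference`. A GEE fit additionally accounts for the repeated measurements within each subject and for the other covariates, so its estimate and standard error will generally differ from this crude version.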
Best Answer
You can, and indeed should, adjust for confounding variables in a non-experimental study like the one you're describing.
Some relevant questions and answers on this site: How exactly does one “control for other variables”? and Adjusting for Confounding Variables. You may simply be able to stratify your data, depending on which confounders you think are important and whether they're categorical or continuous variables, but in all likelihood you will need a regression-based approach to account for the differences between your groups that are not due to your exposure of interest.
Giving advice on your specific study is, for the most part, beyond the scope of this site, and your question definitely can't be answered in a single post with the amount of information you have provided. My recommendation would be to consult a statistician or experienced researcher at your institution to see if they can provide some guidance.