Solved – Can a continuous variable (age) be used as a predictor if there is an age gap in the data?

lme4-nlme, multiple regression, regression

I am working with a previously collected dataset in which subjects were recruited as either adolescents or adults. The adolescents are aged 11-16 and the adults 20-27.

Since I didn't collect this data, I don't have a choice in how the ages are distributed. However, I don't think it makes much sense to view age as a category here. The subject matter concerns functional MRI data. In my opinion there is a probable difference between an 11 year old and a 16 year old, and between a 20 year old and a 27 year old.

Would it be legitimate to treat age as a continuous predictor given the four-year gap between the adolescents and adults? If not, would it be feasible to model the continuous effect of age on the outcome variable (brain connectivity measures) separately within the adolescent and adult groups? I'm not sure how best to specify that: Age Group:Age as a fixed effect, or perhaps (1|Age Group/Age) as a nested random effect (lme4 syntax)?

Best Answer

Yes, you can fit a single regression across both age groups. However, the result should be interpreted with caution. For example, leaving out the middle of the age range will tend to inflate the apparent correlation between age and whatever is on the y-axis, compared with what it would be if all ages were included. How much this changes the correlation, the significance of the parameters, and the fitted regression could be estimated, for example, by repeated Monte Carlo simulation of suitably constructed synthetic data covering the complete age range.
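As a rough illustration of that kind of simulation, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than something taken from the question's data: the true relationship is taken to be linear, and the slope (0.1), noise SD (1.0), sample size (60), and the names `pearson_and_slope`, `simulate`, and `run` are all made up. It draws integer ages either from the full 11-27 range or from the gapped 11-16 / 20-27 range, then compares the average Pearson correlation and fitted slope over many replicates.

```python
import math
import random


def pearson_and_slope(xs, ys):
    """Return (Pearson r, OLS slope) for a simple regression of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy), sxy / sxx


def simulate(age_pool, n, slope, noise_sd, rng):
    """One synthetic data set: ages drawn from age_pool, linear outcome plus noise."""
    xs = [rng.choice(age_pool) for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return pearson_and_slope(xs, ys)


def run(reps=2000, n=60, slope=0.1, noise_sd=1.0, seed=1):
    """Average r and slope over many replicates, for full and gapped age ranges."""
    rng = random.Random(seed)
    pools = {
        "full": list(range(11, 28)),                          # ages 11..27
        "gapped": list(range(11, 17)) + list(range(20, 28)),  # 11..16 and 20..27
    }
    results = {}
    for name, pool in pools.items():
        rs, bs = [], []
        for _ in range(reps):
            r, b = simulate(pool, n, slope, noise_sd, rng)
            rs.append(r)
            bs.append(b)
        results[name] = (sum(rs) / reps, sum(bs) / reps)
    return results


if __name__ == "__main__":
    for name, (mean_r, mean_slope) in run().items():
        print(f"{name:>6}: mean r = {mean_r:.3f}, mean slope = {mean_slope:.3f}")
```

Under these assumptions the gapped design should show a somewhat larger average correlation than the full-range design, because removing the middle ages increases the spread of age, while the average slope stays close to the true value in both. If the real relationship is not linear across the gap, this sketch says nothing about what happens there; the simulated model would have to be changed to explore that.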
