Solved – Multiple linear regression with low sample size

regression small-sample

I have a dataset with 12 observations for a particular response variable. In this case the response variable is related to genetic diversity parameters. I want to perform a multiple linear regression using 3 predictors. The assumptions of independence, normality of residuals and homogeneity of variances were not violated. However, I am aware that my sample size is most likely too small for this analysis (I am following the rule of thumb of 10 observations per predictor, which would call for about 30 observations).

I thought that building a set of nested models comprising all combinations of one or two variables would minimize this issue. However, I am not confident about the implications of this procedure for my results, because effect sizes and model Akaike weights are no longer comparable, right? Can you please help me with this issue?

Also, do you know of other relevant methodological alternatives to apply under the described conditions (i.e., low sample size and 3 predictors), or any relevant references on this subject?

Thank you very much for your time.

Best Answer

If your scientific question is about all three predictors and you have no possibility of collecting more data, then you just go ahead and fit the full model. It is not ideal, of course, as your estimates will be rather imprecise, but I imagine you already know that. Doing some form of variable selection may be misleading, since it tends to be data-driven: the selected model may have been unduly affected by some minor quirk in your dataset and so may not generalise well to other situations.
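
To give a feel for what "rather imprecise" can look like, here is a minimal sketch that fits the full three-predictor model to 12 simulated observations and reports 95% confidence intervals. The data, the coefficient values and the variable names are all made up for illustration; none of it comes from the question's dataset. The t critical value is hard-coded for the 8 residual degrees of freedom that this design leaves.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n, k = 12, 3                                   # 12 observations, 3 predictors, as in the question
X = rng.normal(size=(n, k))                    # simulated predictors (assumption, not real data)
true_beta = np.array([1.0, 0.5, 0.0, -0.5])    # hypothetical intercept and slopes
design = np.column_stack([np.ones(n), X])      # prepend an intercept column
y = design @ true_beta + rng.normal(scale=1.0, size=n)

# Ordinary least squares fit of the full model
beta_hat, _, _, _ = np.linalg.lstsq(design, y, rcond=None)

# Residual variance and coefficient standard errors
resid = y - design @ beta_hat
df_resid = n - design.shape[1]                 # 12 - 4 = 8 residual degrees of freedom
sigma2 = resid @ resid / df_resid
cov = sigma2 * np.linalg.inv(design.T @ design)
se = np.sqrt(np.diag(cov))

# 97.5% quantile of Student's t with 8 degrees of freedom
T_CRIT = 2.306
ci_lower = beta_hat - T_CRIT * se
ci_upper = beta_hat + T_CRIT * se

names = ["intercept", "x1", "x2", "x3"]
for name, b, lo, hi in zip(names, beta_hat, ci_lower, ci_upper):
    print(f"{name:>9}: {b:6.2f}  95% CI [{lo:6.2f}, {hi:6.2f}]")
```

With only 8 residual degrees of freedom the intervals are typically wide relative to the estimates. Reporting them alongside the coefficients makes that imprecision explicit instead of hiding it behind a selected submodel.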