Solved – Multiple Regression – effect size

effect-size multiple-regression

I want to calculate the effect size of a multiple regression. So far, I have used Cohen's $f^2$.

Somehow I wonder whether the effect size expressed by Cohen's $f^2$ is an appropriate measure. Suppose $R^2$ is high and, moreover, strongly biased upward: $f^2$ will still indicate a large effect, even though it does not take into account how many predictors I included in the model or the quality of the model (no "punishment" for a bad model).

Is there a better ("state of the art") approach to calculating effect size for a multiple regression analysis?

Can I calculate Cohen's $f^2$ for each predictor using adjusted $R^2$? Does this make any sense?

Best Answer

Your question seems to express the issue of bias mentioned in the Wikipedia article on effect sizes. It sounds like you are concerned that your regression model will be overparameterized, which $R^2$ does not take into account: as you note, it only gets higher the more parameters you put into the model. Using adjusted $R^2$ seems a reasonable way to address this problem.
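To make the penalty concrete, adjusted $R^2$ is $1 - (1-R^2)\frac{n-1}{n-p-1}$ for $n$ observations and $p$ predictors. A minimal sketch in Python (the function name and the example numbers are my own):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fit to n observations.

    Penalizes R^2 for model size: at the same plain R^2, a model with
    more predictors gets a lower adjusted R^2.
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Same plain R^2 = 0.80 on n = 30 observations:
print(adjusted_r2(0.80, 30, 2))   # ~0.785
print(adjusted_r2(0.80, 30, 10))  # ~0.695 -- more predictors, bigger penalty
```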

This feature of $R^2$ has been highlighted as a reason not to use $R^2$, ordinary or adjusted, as part of a model-selection process. A possible solution is to do model selection using an information criterion such as AIC or BIC to arrive at an appropriate regression model, and then take that model's $R^2$ and calculate $f^2$ from it.
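As a rough sketch of that selection step: for a Gaussian linear model, AIC and BIC can be computed (up to an additive constant) from the residual sum of squares. The candidate models and their RSS values below are hypothetical, just to show the comparison:

```python
import math

def aic_from_rss(rss, n, k):
    """AIC for a Gaussian linear model, up to an additive constant.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters.
    """
    return n * math.log(rss / n) + 2 * k

def bic_from_rss(rss, n, k):
    """BIC: same fit term, but a log(n) penalty per parameter."""
    return n * math.log(rss / n) + k * math.log(n)

# Hypothetical candidates: (label, RSS, parameter count), n = 50 observations.
# The larger model fits slightly better but pays a bigger penalty.
candidates = [("2 predictors", 120.0, 4), ("5 predictors", 110.0, 7)]
best = min(candidates, key=lambda m: aic_from_rss(m[1], 50, m[2]))
print("AIC-preferred model:", best[0])  # prints "2 predictors"
```

In practice you would fit each candidate model, read off its RSS, and then compute $R^2$ and $f^2$ only for the model the criterion selects.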

Cohen's $f^2$ does not appear to be used that often, but it has recently been recommended for use with mixed-effects (aka hierarchical or multilevel) multiple regression by Selya et al. (2012): "A practical guide to calculating Cohen's $f^2$, a measure of local effect size, from PROC MIXED".

Reporting the effect size for particular parameters, rather than just for the whole model, might be one general solution to the bias issue you raise. Selya et al. use Cohen's $f^2$ to compare the effect sizes of different predictors within their model. They specify $f^2$ as

$$ f^2 = \frac{R_{AB}^2 - R_{A}^2}{1 - R_{AB}^2} $$

where $R_{AB}^2$ is the $R^2$ of the multiple regression model with all of their predictors, and $R_{A}^2$ is the $R^2$ of the model without the focal predictor ($B$) for which they want to calculate a "local" effect size. $R_{AB}^2$ and $R_{A}^2$ can be respecified for different focal predictors, so you can compare several effect sizes from the same model. If you are worried that $R^2$ is inflated, then this approach at least penalizes the effect-size calculations for all individual parameters equally.
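Following Selya et al.'s definition (full-model $R^2$ in the denominator), the local $f^2$ is trivial to compute once you have fit the full and reduced models; the example $R^2$ values here are hypothetical:

```python
def local_f2(r2_full, r2_reduced):
    """Cohen's local f^2 for one predictor (Selya et al., 2012).

    r2_full:    R^2 of the model with all predictors (R^2_AB)
    r2_reduced: R^2 of the model without the focal predictor (R^2_A)
    """
    return (r2_full - r2_reduced) / (1 - r2_full)

# Hypothetical values: full model R^2 = 0.40, without predictor B R^2 = 0.30.
# Cohen's conventions treat 0.02 / 0.15 / 0.35 as small / medium / large.
print(local_f2(0.40, 0.30))  # 0.1 / 0.6 ≈ 0.167, a medium local effect
```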

A final thing that comes to mind is that you could calculate a confidence region around $R^2$, and even around $f^2$. This can be done via bootstrapping; there may be R packages that do this automatically. I'm not sure whether this is actually relevant to the general issue of bias in $R^2$, but it would at least express some degree of uncertainty about how high $R^2$ is.
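A minimal sketch of that bootstrap, for the simple one-predictor case, using only the Python standard library (the data here are simulated; in R, the `boot` package's `boot()` and `boot.ci()` handle the general case):

```python
import random

def r2_simple(xs, ys):
    """R^2 of a simple least-squares regression of y on x
    (the squared sample correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy ** 2) / (sxx * syy)

def bootstrap_r2_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for R^2: resample (x, y) pairs with
    replacement, recompute R^2 each time, take empirical quantiles."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(r2_simple([xs[i] for i in idx], [ys[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy data: y roughly linear in x with Gaussian noise
gen = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2 * x + gen.gauss(0, 1) for x in xs]
lo, hi = bootstrap_r2_ci(xs, ys)
print(f"95% bootstrap CI for R^2: [{lo:.3f}, {hi:.3f}]")
```

The same resampling loop would work for $f^2$: recompute both the full and reduced model $R^2$ on each bootstrap sample and take quantiles of the resulting $f^2$ values.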