I may be confusing two different concepts in this question, but is it possible to calculate a Cohen's $d$ for a regression coefficient? Could its value represent the standardized mean effect of a one-unit change in the predictor upon the response?
Solved – Cohen’s d for regression coefficient
effect-size
Related Solutions
I think probably the best way to do this is with a multi-group SEM approach. Here's how I would go about it (you can find descriptions of this approach in many intro CFA/SEM texts, like Beaujean, 2014; Brown, 2015; and Little, 2013):
- Given that you are interested in comparing means between groups, I would consider identifying and setting the scale for E using the effects-coding approach (see Little, Slegers, & Card, 2006 for a description). This will ensure that E is scaled on the same metric as its original indicators, which might help with interpretation versus other methods of scale-setting. In a nutshell, the effects-coding method requires you to constrain the loadings of E to average 1, and the observed intercepts to average 0.
- Fit a global measurement model of E, and ensure that it fits well by conventional standards (e.g., Hu & Bentler, 1999), since fit will only get worse once you move to multi-group evaluations.
- Using a multi-group approach, test for measurement invariance of E between your gender groups. Specifically, you need to ensure you have evidence of configural (i.e., same general pattern of factors/loadings), weak (i.e., equality of factor loadings) and strong invariance (i.e., equality of observed variable intercepts), in order to make valid inferences about comparisons of latent means (see Vandenberg & Lance, 2000, for a review). Consider both traditional $\chi^2$ difference tests, and evaluating the magnitude of change in $CFI$ as means of conducting invariance-related nested model comparisons (Cheung & Rensvold, 2002).
- Fit a model constraining latent means to equality between gender groups, and compare it against the strong invariance model to test whether the latent means differ significantly. If you are just interested in estimating Cohen's $d$, and not in evaluating whether the mean difference is significantly different from zero, you can skip this step.
- Look back at the output from the strong invariance model, in which latent means were allowed to vary between groups. From this model, you now have an estimate of each group's latent mean and variance, which should be sufficient to calculate $d$ in the traditional manner. If you want to save yourself some calculations, you could use phantom variables (see Little, 2013, for a nice description of this technique) to estimate latent standard deviations for you, which you can then plug into your calculations for $d$.
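The final calculation can be sketched in a few lines. This is only an illustration: the function name `latent_cohens_d` and the numbers passed to it are hypothetical, and weighting the pooled variance by group sample size is an assumption about what "the traditional manner" means here.

```python
import math

def latent_cohens_d(mean_1, var_1, n_1, mean_2, var_2, n_2):
    """Cohen's d from two groups' latent means and variances.

    Assumption: the pooled variance is weighted by (n - 1) per group,
    as in the usual two-sample formula.
    """
    pooled_var = ((n_1 - 1) * var_1 + (n_2 - 1) * var_2) / (n_1 + n_2 - 2)
    return (mean_2 - mean_1) / math.sqrt(pooled_var)

# Hypothetical estimates, as if read off the strong invariance model's output.
d = latent_cohens_d(mean_1=3.10, var_1=0.64, n_1=200,
                    mean_2=3.42, var_2=0.81, n_2=220)
print(round(d, 3))
```

If you estimate latent standard deviations directly (e.g., via phantom variables), you could square them and pass them in as the variances.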
The main perk of this approach, as I see it, is that it allows you to test the measurement invariance assumptions that are implicit in your comparison of group means. The approach you describe above, while generally reasonable, assumes that the construct E "means" the same thing for both genders, without actually testing whether it does or not. Personally (and this is just my opinion), I don't see the point in using an SEM approach to test these sorts of hypotheses unless you are going to take full advantage of the benefits it provides, so I'd strongly advocate for the steps I've described above.
If you're not used to testing measurement invariance, you should check out Beaujean's (2014) book, and the lavaan and semTools packages for R, which make evaluating measurement invariance (and latent mean equivalence) between groups a breeze, though doing so with effects-coding requires a bit more coding work.
References
Beaujean, A. A. (2014). Latent Variable Modeling Using R: A Step-by-Step Guide. New York, NY: Routledge.
Brown, T. A. (2015). Confirmatory Factor Analysis for Applied Research (2nd ed.). New York, NY: Guilford Press.
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating Goodness-of-Fit Indexes for Testing Measurement Invariance. Structural Equation Modeling, 9, 233-255.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Little, T. D. (2013). Longitudinal Structural Equation Modeling. New York, NY: Guilford Press.
Little, T. D., Slegers, D. W., & Card, N. A. (2006). A Non-arbitrary Method of Identifying and Scaling Latent Variables in SEM and MACS Models. Structural Equation Modeling, 13, 59-72.
Vandenberg, R. J., & Lance, C. E. (2000). A Review and Synthesis of the Measurement Invariance Literature: Suggestions, Practices, and Recommendations for Organizational Research. Organizational Research Methods, 3, 4-70.
Cohen's $d$ can be thought of as semi-standardized, that is, standardized on the dependent variable but not on the independent variable. Recall that the relationship between $\beta$ (a standardized regression coefficient) and $b$ (an unstandardized regression coefficient) is: $$ \beta = \frac{b(s_x)}{s_y} .$$ However, Cohen's $d$ is approximately: $$ d = \frac{b}{s_y} $$ with the difference being that the denominator is the pooled within-groups $s$ and not the overall $s$. Because the pooled within-groups $s$ excludes the between-group variation, it is smaller than the overall $s$. Thus, Cohen's $d$ will be bigger than the above, proportional to the size of the effect. In your example, the effect is big enough to make a rather noticeable difference.
If the two groups have an equal sample size, then $s_x$ for a 0/1 group dummy is about 0.5. Thus, we would expect $\beta$ to be about half of $d$, apart from the effect of the difference between the overall $s_y$ and $s_{pooled}$.
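A small numeric check of these relationships, using made-up data for two equal-sized groups (the values and variable names are assumptions for illustration only):

```python
import math
from statistics import mean, stdev

# Toy data (assumed for illustration): two equal-sized groups.
control = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0]
treated = [6.0, 7.0, 8.0, 7.0, 6.0, 8.0]
n1, n2 = len(control), len(treated)

y = control + treated
x = [0.0] * n1 + [1.0] * n2          # 0/1 group dummy

# With a single dummy predictor, the OLS slope b is the difference in group means.
b = mean(treated) - mean(control)

s_x = stdev(x)                        # close to 0.5 when groups are equal-sized
s_y = stdev(y)                        # overall SD of the response
s_pooled = math.sqrt(
    ((n1 - 1) * stdev(control) ** 2 + (n2 - 1) * stdev(treated) ** 2)
    / (n1 + n2 - 2)
)                                     # pooled within-groups SD

beta = b * s_x / s_y                  # standardized on both x and y
d = b / s_pooled                      # standardized on y only, within groups

print(f"b={b:.3f}  s_x={s_x:.3f}  s_y={s_y:.3f}  s_pooled={s_pooled:.3f}")
print(f"beta={beta:.3f}  d={d:.3f}")
```

Here the effect is large, so $s_{pooled}$ is well below the overall $s_y$ and $d$ ends up more than twice $\beta$.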
Best Answer
Cohen's d does have similarities with the standardised mean effects of an independent variable on the dependent variable in a regression.
Yet, they are completely different.
Why they are similar
You could make the claim that Cohen's d measures the standardized effect of some treatment between a control group (sample 1) and a treatment group (sample 2). You may read "treatment" very broadly: it may be a difference in time or "attended college". If you were to run a regression with the treatment dummy variable as the only independent variable, the coefficient on that dummy represents the average difference between sample 1 and sample 2. If you then divide the coefficient by the residual standard deviation of that regression (which equals the pooled within-group standard deviation), you get Cohen's d. Dividing by the coefficient's standard error instead, as you do when you calculate the t-statistic, gives something closely related but not the same: $t = d / \sqrt{1/n_1 + 1/n_2}$.
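To see how the dummy coefficient relates to both the t-statistic and Cohen's d, here is a minimal sketch on made-up data with unequal group sizes (all values are assumptions). It uses the closed-form results for a regression with a single 0/1 predictor rather than a fitting library.

```python
import math
from statistics import mean

# Toy data (assumed for illustration): unequal group sizes.
control = [10.0, 12.0, 11.0, 13.0]
treated = [14.0, 15.0, 13.0, 16.0, 17.0, 15.0]
n1, n2 = len(control), len(treated)

# Coefficient on the dummy in y ~ dummy: the difference in group means.
b = mean(treated) - mean(control)

# Residuals are deviations from each group's own mean.
sse = (sum((v - mean(control)) ** 2 for v in control)
       + sum((v - mean(treated)) ** 2 for v in treated))
resid_sd = math.sqrt(sse / (n1 + n2 - 2))      # equals the pooled within-group SD

# Standard error of the dummy coefficient in this simple regression.
se_b = resid_sd * math.sqrt(1 / n1 + 1 / n2)

d = b / resid_sd    # Cohen's d: coefficient scaled by the residual SD
t = b / se_b        # t-statistic: coefficient scaled by its standard error

print(f"b={b:.3f}  d={d:.3f}  t={t:.3f}")
```

The two quantities differ only by the factor $\sqrt{1/n_1 + 1/n_2}$, so t grows with sample size while d does not.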
Why they are completely different
As soon as the predictor variable is not a dummy variable, the similarities stop. Trying to extend "Cohen's d-logic" to another kind of predictor variable would be wrong. Even if the predictor variable is a dummy, the inclusion of other predictor variables may cause a large disparity between the two measures (Sidenote: the size of the difference depends on the degree of collinearity between the dummy variable and the other predictors).
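The effect of adding a correlated predictor can be illustrated with a deliberately artificial example. The data below are constructed (an assumption for illustration) so that the response is exactly `1*g + 2*z`, and the covariate `z` is higher on average in the `g = 1` group.

```python
from statistics import mean

# Artificial data: g is the group dummy, z a covariate correlated with g.
g = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
z = [1.0, 2.0, 3.0, 2.0, 2.0, 3.0, 4.0, 3.0]
y = [1.0 * gi + 2.0 * zi for gi, zi in zip(g, z)]

def cross_sum(a, b):
    """Sum of cross-products of deviations from the means."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))

s_gg, s_zz, s_gz = cross_sum(g, g), cross_sum(z, z), cross_sum(g, z)
s_gy, s_zy = cross_sum(g, y), cross_sum(z, y)

# Dummy coefficient when g is the only predictor (raw group mean difference).
b_simple = s_gy / s_gg

# Dummy coefficient when z is also in the model (two-predictor OLS solution).
b_adjusted = (s_gy * s_zz - s_zy * s_gz) / (s_gg * s_zz - s_gz ** 2)

print(f"b_simple={b_simple:.3f}  b_adjusted={b_adjusted:.3f}")
```

The raw mean difference absorbs the part of `z` that is associated with group membership, so a d computed from the group means and an effect size based on the adjusted coefficient would disagree.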
Additionally, Cohen's d is used as a basis for discussion, whereas regression coefficients are usually used for inference and/or prediction: