Your interpretation of the model’s coefficients is not completely accurate. Let me first summarize the terms of the model.
Categorical variables (factors): $race$, $sex$, and $educa$
The factor $race$ has four levels: $race = \{white, black, mexican, multi/other\}$.
The factor $sex$ has two levels: $sex = \{male, female\}$.
The factor $educa$ has five levels: $educa = \{1, 2, 3, 4, 5\}$.
By default, R uses treatment contrasts for categorical variables. With these contrasts, the first level of the factor is used as the reference level and the remaining levels are tested against the reference. The maximum number of contrasts for a categorical variable equals the number of levels minus one.
The contrasts for $race$ allow testing the following differences: $race = black\ vs. race = white$, $race = mexican\ vs. race = white$, and $race = multi/other\ vs. race = white$.
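The treatment coding described above can be sketched in a few lines. This is an illustrative reimplementation, not R's internals: each observation gets $k-1$ indicator (dummy) columns, and the reference level is all zeros.

```python
# Sketch (not from the original post): how treatment contrasts turn a
# factor into k-1 indicator (dummy) columns, as R does by default.
levels = ["white", "black", "mexican", "multi/other"]
# R's default reference is the first level ("white" here).

def treatment_code(value):
    """Return the k-1 dummy columns for one observation."""
    return [1 if value == lvl else 0 for lvl in levels[1:]]

print(treatment_code("white"))    # [0, 0, 0] -- reference level: all zeros
print(treatment_code("black"))    # [1, 0, 0]
print(treatment_code("mexican"))  # [0, 1, 0]
```

Because the reference level maps to all zeros, its mean is absorbed into the intercept, and each dummy's coefficient is the difference from that reference.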
For the factor $educa$, the reference level is $1$; the pattern of contrasts is analogous.
These coefficients can be interpreted as differences in the dependent variable. In your example, the mean value of cog is $13.8266$ units higher for $educa = 2$ than for $educa = 1$ (the coefficient labeled as.factor(educa)2).
One important note: with treatment contrasts, if a categorical variable is involved in interactions, the coefficients of the variables it interacts with are estimated at the reference level of that categorical variable. If a variable is not part of any interaction, its coefficient corresponds to an effect averaged over the levels of the remaining categorical variables. Here, the effects of $race$ and $educa$ are such averaged effects with respect to the levels of the other factors. To test an overall effect of $race$, you would need to leave $educa$ and $sex$ out of the model.
Numeric variables: $lg\_hag$ and $pdg$
Both lg_hag and pdg are numeric variables; hence each coefficient represents the change in the dependent variable associated with an increase of $1$ in the predictor.
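This "change per one-unit increase" reading can be made concrete with a toy calculation. The intercept and coefficient below are made up for illustration; only the interpretation carries over:

```python
# Sketch with made-up numbers: a numeric predictor's coefficient is the
# change in the predicted DV per one-unit increase, holding all else fixed.
b0, b_pdg = 50.0, 2.5  # hypothetical intercept and pdg coefficient

def predict(pdg):
    return b0 + b_pdg * pdg

# Raising pdg by 1 raises the prediction by exactly b_pdg:
print(predict(11) - predict(10))  # 2.5
```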
In principle, the interpretation of these effects is straightforward. Note, however, that if interactions are present, the coefficients are estimated at the reference categories of the factors (when treatment contrasts are employed). Since $pdg$ is not part of an interaction, its coefficient corresponds to its average slope across the levels of the factors. The variable $lg\_hag$, by contrast, is part of an interaction with $educa$; therefore, its main-effect coefficient holds for $educa = 1$, the reference level. It is not a test of an overall influence of $lg\_hag$ irrespective of the levels of the factors.
Interactions between categorical and numeric variables: $lg\_hag \times educa$
The model does not only include main effects but also interactions between the numeric variable $lg\_hag$ and the four contrasts associated with $educa$. These effects can be interpreted as the difference in the slopes of $lg\_hag$ between a certain level of $educa$ and the reference level ($educa = 1$).
For example, the coefficient of lg_hag:as.factor(educa)2 ($-21.2224$) means that the slope of $lg\_hag$ is $21.2224$ units lower for $educa = 2$ than for $educa = 1$.
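The arithmetic behind that reading is simply "slope at a non-reference level = base slope + interaction term". In the sketch below, only the interaction coefficient ($-21.2224$) comes from your output; the base slope is a hypothetical placeholder:

```python
# Sketch: with treatment contrasts, the slope of lg_hag at educa = 2 is the
# slope at the reference level (educa = 1) plus the interaction coefficient.
b_lg_hag = 30.0          # HYPOTHETICAL slope of lg_hag at educa = 1
b_interact_2 = -21.2224  # lg_hag:as.factor(educa)2 from the model output

slope_educa_1 = b_lg_hag
slope_educa_2 = b_lg_hag + b_interact_2

# The difference between the two slopes recovers the interaction term:
print(slope_educa_1 - slope_educa_2)  # ~21.2224 units lower at educa = 2
```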
In the scenario you describe, least squares regression will allow you to tell a very straightforward story:
First of all, imagine that you have no dichotomous independent variable. So:
(1) $y_{i} = \beta_{0} + \beta_{1}x_{1i} + \varepsilon_{i}$
Your regression describes the relationship between your dependent variable $y$ and your continuous independent variable $x_{1}$ as a straight line, with intercept $\beta_{0}$ and slope $\beta_{1}$. Cool? Cool.
Now add both the dichotomous independent variable $x_{2}$ and the interaction between $x_{1}$ and $x_{2}$ to the model:
(2) $y_{i} = \beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \beta_{3}x_{1i}x_{2i} + \varepsilon_{i}$
So now what is your model telling you? Well, (assuming $x_{2}$ is coded 0/1) when $x_{2} = 0$, then the model reduces to equation (1) because $\beta_{2} \times 0 = 0$ and $\beta_{3} \times x_{1} \times 0 = 0$. So that is easy-peasy puddin' pie.
What about when $x_{2} =1$? Well now the $y$-intercept is $\beta_{0} + \beta_{2}$ (Right? Because $\beta_{2} \times 1 = \beta_{2}$).
And the slope of the line relating $y$ to $x_{1}$ is now $\beta_{1} + \beta_{3}$ (Right? Because $\beta_{1}\times x_{1} + \beta_{3} \times x_{1} \times 1 = \beta_{1}\times x_{1} + \beta_{3} \times x_{1} = (\beta_{1} + \beta_{3})\times x_{1}$).
So when $x_{2}=1$ you simply have a second regression line relating $y$ to $x_{1}$, with a different intercept (if $\beta_{2} \ne 0$) and a different slope (if $\beta_{3} \ne 0$ which will be true if you tested a significant interaction term in, say, ANOVA).
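The two-lines reading of equation (2) can be verified numerically. The betas below are made up; the point is that the same four coefficients produce one line for $x_{2}=0$ and a different line for $x_{2}=1$:

```python
# Sketch with hypothetical coefficients: equation (2) collapses to two
# straight lines, one for x2 = 0 and one for x2 = 1.
b0, b1, b2, b3 = 1.0, 2.0, 0.5, -0.75  # made-up betas

def predict(x1, x2):
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

# x2 = 0: intercept b0, slope b1
print(predict(0, 0), predict(1, 0))  # 1.0 3.0  -> slope 2.0
# x2 = 1: intercept b0 + b2, slope b1 + b3
print(predict(0, 1), predict(1, 1))  # 1.5 2.75 -> slope 1.25
```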
How do you communicate this? A single graph with two regression lines overlaying your data (possibly with different colored/shaped/sized markers when $x_{2}=1$), and a label indicating which line corresponds to $x_{2}=0$ and which to $x_{2}=1$. Also providing your audience with the values of the $\beta$s and their standard errors and/or confidence intervals is good (like, in a table of multiple regression results).
Cool? Cool.
Finally, while all the above tells you about trend relationships between $y$ and $x_{1}$ given $x_{2}$, least squares regression also tells you about strength of association. If you had a single independent variable, you'd probably want to use something like $R^{2}$ to describe this strength of association, but when you add variables $R^{2}$ doesn't quite mean what it did before. So you might use generalized $R^{2}$, or Pseudo-$R^{2}$ or some such.
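For reference, the plain $R^{2}$ mentioned above is just the share of variance in $y$ captured by the fitted values. A minimal sketch with made-up observations and fitted values (no fitting library involved):

```python
# Sketch (pure Python): plain R^2 = 1 - SS_res / SS_tot, the share of
# variance in y explained by the model's predictions. Data are made up.
y     = [2.0, 4.1, 5.9, 8.2]   # hypothetical observations
y_hat = [2.1, 4.0, 6.0, 8.1]   # hypothetical fitted values

y_bar = sum(y) / len(y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
ss_tot = sum((yi - y_bar) ** 2 for yi in y)
r2 = 1 - ss_res / ss_tot
print(round(r2, 4))  # close to 1: the fit explains almost all the variance
```

As the answer notes, once you add predictors this plain $R^{2}$ can only go up, which is why adjusted or generalized variants are preferred for multi-variable models.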
As @gung said, it would help if you gave your full equation and DV, but, here, if the interaction between sex-female and mobility is -10.1, it means that the effect of high mobility on the dependent variable is 10.1 units less for women than for men. Similarly, the effect of being female on the DV is 10.1 units less for high-mobility people than for low.
For continuous variables, it is much the same, except that it is per unit of the other IV. So, 1.3 for the interaction between weight and IQ means that the effect of IQ on the DV is 1.3 units higher for each increase of one unit (pound? kilogram?) of weight, and the effect of weight is 1.3 units higher for each increase of 1 point in IQ. In other words, the effect is more positive for people who are both smart and heavy than for people who are one or the other.
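That symmetry for two continuous predictors can be checked numerically. Only the interaction coefficient (1.3) comes from the answer's example; the other coefficients below are hypothetical:

```python
# Sketch: in DV = b0 + b_iq*IQ + b_wt*weight + 1.3*IQ*weight, the effect
# of +1 IQ point depends on weight. Only b_int = 1.3 is from the answer;
# b0, b_iq, b_wt are made up.
b0, b_iq, b_wt, b_int = 10.0, 0.5, 0.2, 1.3

def predict(iq, weight):
    return b0 + b_iq * iq + b_wt * weight + b_int * iq * weight

# Effect of +1 IQ point at two different weights:
effect_at_60 = predict(101, 60) - predict(100, 60)  # b_iq + 1.3 * 60
effect_at_61 = predict(101, 61) - predict(100, 61)  # b_iq + 1.3 * 61
print(effect_at_61 - effect_at_60)  # ~1.3, the interaction coefficient
```

In other words, each extra unit of weight raises the IQ effect by 1.3, and (symmetrically) each extra IQ point raises the weight effect by 1.3.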