In R, the effects package makes it easy to interpret such coefficients by producing the appropriate graphs. From CRAN:
effects: Effect Displays for Linear, Generalized Linear, and Other Models
Graphical and tabular effect displays, e.g., of interactions, for various statistical models with linear predictors.
The package is accompanied by two papers by John Fox describing its use (with examples) and the underlying implementation, and it takes care of the subtle details and implementation difficulties for the supported models.
As an example:
require(car)      # for Anova() and the Cowles data
require(effects)
data(Cowles)
cowles.mod <- glm(volunteer ~ sex + neuroticism*extraversion,
                  data = Cowles, family = binomial)
Anova(cowles.mod)
plot(effect("neuroticism*extraversion", cowles.mod))
plot(effect("sex", cowles.mod))  # for a dummy; works similarly for factors
This will yield:
> Anova(cowles.mod)
Analysis of Deviance Table (Type II tests)

Response: volunteer
                         LR Chisq Df Pr(>Chisq)
sex                        4.9184  1   0.026572 *
neuroticism                0.3139  1   0.575316
extraversion              22.1372  1  2.538e-06 ***
neuroticism:extraversion   8.6213  1   0.003323 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The interaction plot displays the predicted values of Y on the vertical axis and the effect of X ("neuroticism") on Y at various levels of Z ("extraversion"). The effect plot for sex shows the same thing for a dummy, and it works similarly for factors with more levels.
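If you want the numbers behind these plots rather than the pictures, the effect object can be tabulated directly; a minimal sketch, reusing cowles.mod from above:

eff.sex <- effect("sex", cowles.mod)
summary(eff.sex)        # fitted probabilities with pointwise confidence intervals
as.data.frame(eff.sex)  # the same values as a plain data frame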
Lastly, there seems to be some confusion surrounding partial effects (holding other predictors constant) vs. marginal effects (ignoring other predictors). As I understand it, the effects package is concerned with displaying partial effects.
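To make the distinction concrete, here is a rough sketch using predict() on the model above. The grid values, and the choice to hold sex at "female" and extraversion at its mean, are illustrative assumptions, not exactly what effect() does internally:

# "partial"-style: vary neuroticism, holding the other predictors fixed
grid <- data.frame(neuroticism  = 0:24,
                   extraversion = mean(Cowles$extraversion),
                   sex          = factor("female", levels = levels(Cowles$sex)))
partial <- predict(cowles.mod, newdata = grid, type = "response")

# "marginal"-style: for each neuroticism value, average the predictions
# over the observed values of all the other predictors
marginal <- sapply(0:24, function(x) {
  d <- Cowles
  d$neuroticism <- x
  mean(predict(cowles.mod, newdata = d, type = "response"))
})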
The question is "minimum number of observations to do what?". If the objective is to find the minimum number of observations needed to detect a significant effect of a dummy (when the effect truly exists), then you need to know what the effect size might be and then perform a standard power analysis. If you just want to know how many observations you need to run the model at all, then it will very likely run with a small number of observations per variable (say, fewer than 10) as long as the predictors are not too correlated.
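For the power-analysis route, a simulation sketch is often the simplest option for logistic regression; the effect size (0.5 on the logit scale), the baseline, and the sample size below are illustrative assumptions:

set.seed(1)
power.sim <- function(n, beta1, beta0 = 0, nsim = 1000) {
  # proportion of simulated datasets in which the dummy's Wald test
  # rejects at the 5% level
  mean(replicate(nsim, {
    x <- rep(0:1, each = n / 2)
    y <- rbinom(n, 1, plogis(beta0 + beta1 * x))
    summary(glm(y ~ x, family = binomial))$coefficients["x", 4] < 0.05
  }))
}
power.sim(n = 200, beta1 = 0.5)  # estimated power at n = 200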
Best Answer
It is easier to think about interpreting dichotomous predictors using the concept of the odds ratio.
Let me give you an example: imagine you are trying to predict smoking status, where the smoking variable is 1 if you smoke and 0 if you don't (a dichotomous outcome, so we can use logistic regression). Now, as in your case, imagine you have a predictor variable called white, which is 1 if you are white and 0 if you are not. In this example, you can fit a logistic regression model that looks like this:
$$\text{logit}(p)=\beta_0+\beta_1\times \text{white}$$
Now, let's assume that you get an estimate of $\beta_1=-0.5108256$. Converting the estimate to the odds ratio scale is as simple as exponentiating it: $$e^{\beta_1}=e^{-0.5108256}=0.6.$$ This tells us that the odds of being a smoker for someone who is white are 0.6 times the odds for someone who is not white; that is, the odds are 40% lower. (An odds ratio of 0.6 means 40% lower odds, not that whites are "60% less likely" to smoke; odds ratios are not probabilities.)
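In R the conversion is a one-liner; here is a sketch with simulated data (the data are made up purely to show the mechanics, with the true coefficient set near -0.51):

set.seed(1)
white   <- rbinom(500, 1, 0.5)
smoking <- rbinom(500, 1, plogis(-0.2 - 0.51 * white))
fit <- glm(smoking ~ white, family = binomial)
exp(coef(fit)["white"])  # estimated odds ratio, close to exp(-0.51) = 0.6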
So, to answer your direct question: you wouldn't say that "a 1% increase in being white affects the probability of the dependent variable by x amount" (a dummy cannot increase by 1%); rather, you would say that the odds of observing the dependent variable are y times as large when you are white as when you are not.