Let me explain what linearity means with nominal/dummy variables. In essence, it means you have not left out an interaction term between your independent variables.†
Suppose we have two nominal variables $x_1$ and $x_2$, each taking values 0 or 1, and a response variable $y$. (The general case is similar.)
If we model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$:
$\beta_0$ is the expected response when $x_1 = x_2 = 0$
$\beta_0 + \beta_1$ is the expected response when $x_1 = 1, x_2 = 0$
$\beta_0 + \beta_2$ is the expected response when $x_1 = 0, x_2 = 1$
$\beta_0 + \beta_1 + \beta_2$ is the expected response when $x_1 = x_2 = 1$
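Concretely, the four cell means can be read straight off the coefficients. A minimal numeric sketch (the coefficient values here are made up purely for illustration):

```python
# Hypothetical coefficients, chosen only for illustration
b0, b1, b2 = 2.0, 0.5, -1.0

# Expected response in each of the four (x1, x2) cells
cells = {(x1, x2): b0 + b1 * x1 + b2 * x2
         for x1 in (0, 1) for x2 in (0, 1)}

# The additivity constraint: (last - first) = (second - first) + (third - first)
lhs = cells[(1, 1)] - cells[(0, 0)]
rhs = (cells[(1, 0)] - cells[(0, 0)]) + (cells[(0, 1)] - cells[(0, 0)])
print(lhs == rhs)  # True by construction for any b0, b1, b2
```

The equality holds identically here because the model without an interaction *forces* it; the question for real data is whether the four observed cell means obey it.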
There's a constraint here, since we have three coefficients but four cases: the last expected response minus the first equals the sum of the second minus the first and the third minus the first, i.e. $(\beta_0 + \beta_1 + \beta_2) - \beta_0 = (\beta_1) + (\beta_2)$.
If this relationship actually holds between the expected responses in your situation, then this linear model can be a good one. If not, the failure of this relationship is a type of nonlinearity.
If we include an interaction term, then linearity is automatically satisfied, because we have four coefficients to fit the four cases. That is, with a model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon$ there is no restriction on the relationship between the expected responses in the four cases above. (However, the distributions of $y$ in these four cases may still be different, which would violate the model as written.)
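This can be checked numerically: with four coefficients and four cells, the saturated model reproduces *any* set of cell means exactly. A small sketch (the cell means below are arbitrary made-up numbers):

```python
import numpy as np

# Four arbitrary cell means with no additivity assumed (made-up numbers)
means = {(0, 0): 1.0, (1, 0): 3.0, (0, 1): 2.0, (1, 1): 7.0}

# Design matrix with intercept, x1, x2, and the interaction x1*x2
X = np.array([[1, x1, x2, x1 * x2] for (x1, x2) in means], dtype=float)
y = np.array(list(means.values()))

beta = np.linalg.lstsq(X, y, rcond=None)[0]

# With four coefficients and four cells, the fit is exact
print(np.allclose(X @ beta, y))  # True
# beta[3] measures exactly how much the additivity constraint fails
print(beta[3])
```

Here $\beta_3$ comes out as $7 - 3 - 2 + 1 = 3$: the amount by which the fourth cell mean departs from what additivity would predict.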
How do you test whether you can leave out the interaction term? One way would be to try including it and test whether the coefficient $\beta_3$ is distinct from zero. For example, in the case of normal error $\epsilon$, this would be a $t$-test for a slope coefficient in a regression.
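As a rough sketch of that test (in Python rather than any particular stats package; the data, coefficients, and sample size are all made up), one can fit the saturated model by least squares and form the usual $t$-statistic for $\beta_3$ by hand:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulate two dummy variables and a response with a true interaction
x1 = rng.integers(0, 2, n)
x2 = rng.integers(0, 2, n)
y = 1.0 + 0.5 * x1 + 0.8 * x2 + 1.5 * x1 * x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# Residual variance and the standard error of beta3
resid = y - X @ beta
s2 = resid @ resid / (n - X.shape[1])
cov = s2 * np.linalg.inv(X.T @ X)
t3 = beta[3] / np.sqrt(cov[3, 3])

# A large |t3| (compared to a t distribution with n-4 df) says
# the interaction should not be dropped
print(abs(t3) > 2)
```

In practice you would let your regression software report this $t$-statistic and its $p$-value directly rather than computing it by hand.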
† An interaction between $x_1$ and $x_2$ is a type of (multi-dimensional) nonlinearity: there's no possibility of a nonlinear relationship between $\operatorname{E}Y$ and $x_1$ when $x_1$ is a dummy variable, but there is between $\operatorname{E}Y$ and $(x_1,x_2)$. That is, there may be no plane passing through the four points $(0,0,\operatorname{E}(Y|\,0,0))$, $(1,0,\operatorname{E}(Y|\,1,0))$, $(0,1,\operatorname{E}(Y|\,0,1))$, $(1,1,\operatorname{E}(Y|\,1,1))$.
For dummy variables, these interaction terms are the only potential source of nonlinearity of the expected responses.
Best Answer
To add to AdamO's answer, I was taught to base my decisions regarding model assumptions on whether failing to correct a violation in some way causes me to misrepresent my data. For a concrete example of what I mean, I simulated some data in R, created some plots, and ran some diagnostics using these data. When plotting the data, it's clear that the curvilinear component is an important aspect of the relationship between x and y.
A diagnostic test of linearity also supports our argument that the quadratic component is an important aspect of the relationship between x and y for these data.
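The original simulations and plots were done in R and are not reproduced here; a rough Python analogue of this first scenario (coefficients, sample size, and noise level all made up) is:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.uniform(-3, 3, n)

# Strong quadratic component relative to the noise (made-up coefficients)
y = x + 1.0 * x**2 + rng.normal(0, 1, n)

def r2(X, y):
    """R^2 of an ordinary least squares fit of y on the columns of X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

lin = r2(np.column_stack([np.ones(n), x]), y)
quad = r2(np.column_stack([np.ones(n), x, x**2]), y)

# The quadratic model fits far better: the curvature matters substantively
print(lin < quad)
```

Here the jump in $R^2$ from the linear to the quadratic fit mirrors what the plot shows: ignoring the curvature would misrepresent the data.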
Let's see what happens when we simulate data with a smaller (but still significant) nonlinear trend.
If we examine a plot of these new data, it's pretty clear that they are well-represented by just the linear trend.
This is in spite of the fact that this model fails a diagnostic test of linearity.
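A Python sketch of this second scenario (again with made-up numbers): a tiny quadratic coefficient combined with a large sample, so the nonlinearity is statistically detectable but substantively negligible:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000  # large sample: even a tiny curvature becomes "significant"
x = rng.uniform(-3, 3, n)
y = x + 0.02 * x**2 + rng.normal(0, 1, n)

# Fit the quadratic model and compute the t-statistic for the x^2 term
X = np.column_stack([np.ones(n), x, x**2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
s2 = resid @ resid / (n - X.shape[1])
se = np.sqrt((s2 * np.linalg.inv(X.T @ X))[2, 2])
t_quad = beta[2] / se

# "Significant" curvature (|t| > 2) whose coefficient is nonetheless tiny
print(abs(t_quad) > 2, abs(beta[2]) < 0.1)
```

The test flags the quadratic term, yet its estimated coefficient is so small that a straight line still represents these data well.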
My point is that diagnostic tests should not be a substitute for thinking on the part of the analyst; they are tools to help you understand whether your substantive conclusions follow from your analyses. For this reason, I prefer to look at different types of plots rather than rely on global tests when I'm making these sorts of decisions.