If you have exactly as many instrumental variables as endogenous regressors, then there is no way to test for instrument validity in a homogeneous effects model.
Consider, for example, the following model:
$$
Y = \alpha + \beta D + U
$$
This is a homogeneous effects model: the treatment effect is a constant $\beta$ that is the same for everyone. The two IV assumptions are relevance and exogeneity. Relevance requires that $\text{Cov}(Z,D) \neq 0$; this is directly testable. Exogeneity requires that $\text{Cov}(Z,U) = 0$; this cannot be tested. To see why, suppose that $Z$ is in fact an invalid instrument, i.e. that $\text{Cov}(Z,U) \neq 0$.
In this case the IV estimand is still perfectly well-defined; it simply doesn't equal $\beta$:
$$
\beta_{IV} = \frac{\text{Cov}(Z,Y)}{\text{Cov}(Z,D)} = \beta + \frac{\text{Cov}(Z,U)}{\text{Cov}(Z,D)}, \quad
\alpha_{IV} = \mathbb{E}(Y) - \beta_{IV} \mathbb{E}(D),
$$
where the second equality for $\beta_{IV}$ uses $\text{Cov}(Z,Y) = \text{Cov}(Z, \alpha + \beta D + U) = \beta\,\text{Cov}(Z,D) + \text{Cov}(Z,U)$.
Now, let $V$ be the IV residual: $V \equiv Y - \alpha_{IV} - \beta_{IV} D$.
Note that $V$ is only equal to $U$ if $Z$ is a valid instrument, because this is the only way that we can have $\beta_{IV} = \beta$ and $\alpha_{IV} = \alpha$.
Using our definition of $V$, we can calculate $\text{Cov}(Z,V)$ as follows:
\begin{align*}
\text{Cov}(Z,V) &= \text{Cov}(Z, Y - \alpha_{IV} - \beta_{IV} D) = \text{Cov}(Z,Y) - \beta_{IV} \text{Cov}(Z,D) \\
&= \text{Cov}(Z,Y) - \frac{\text{Cov}(Z,Y)}{\text{Cov}(Z,D)} \text{Cov}(Z,D) = 0.
\end{align*}
In other words, $Z$ is always perfectly uncorrelated with the IV residual $V$ by construction, regardless of whether $Z$ is correlated with the structural error $U$.
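To see this numerically, here is a minimal simulation sketch (assuming Python with numpy; all parameter values are illustrative) in which $Z$ is deliberately built to be invalid, yet the sample covariance between $Z$ and the IV residual is still zero by construction:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Structural model: Y = alpha + beta*D + U with alpha = 1, beta = 2.
# Z is deliberately INVALID: it is built from U, so Cov(Z, U) != 0.
U = rng.normal(size=n)
Z = 0.5 * U + rng.normal(size=n)             # invalid instrument
D = 0.8 * Z + 0.3 * U + rng.normal(size=n)   # relevant and endogenous
Y = 1.0 + 2.0 * D + U

def cov(a, b):
    return np.cov(a, b)[0, 1]

beta_iv = cov(Z, Y) / cov(Z, D)
alpha_iv = Y.mean() - beta_iv * D.mean()
V = Y - alpha_iv - beta_iv * D               # IV residual

print(f"beta_IV   = {beta_iv:.3f}   (does not equal the true beta = 2)")
print(f"Cov(Z, U) = {cov(Z, U):.3f}   (nonzero: Z is invalid)")
print(f"Cov(Z, V) = {cov(Z, V):.2e}  (zero by construction)")
```

The last line is the point: no matter how badly $Z$ violates exogeneity, the residual we can actually compute is uninformative about $\text{Cov}(Z,U)$.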
A Durbin-Wu-Hausman test checks whether the OLS and IV estimands coincide; it maintains instrument validity in order to test whether $D$ is exogenous, so it does not tell us whether the instrument itself is invalid.
When there are more instruments than endogenous regressors, an overidentifying restrictions test can be used to test the null hypothesis that all of the instruments are valid. The intuition is as follows. Continue to assume that $Y = \alpha + \beta D + U$, but suppose now that we have two relevant instruments $Z_1$ and $Z_2$, i.e. $\text{Cov}(Z_1, D) \neq 0$ and $\text{Cov}(Z_2,D)\neq 0$. Define two IV estimands: one that uses $Z_1$ to instrument for $D$ and another that uses $Z_2$, namely
$$
\beta_{IV}^{(1)} \equiv \frac{\text{Cov}(Z_1,Y)}{\text{Cov}(Z_1,D)} = \beta + \frac{\text{Cov}(Z_1,U)}{\text{Cov}(Z_1,D)}
$$
and
$$
\beta_{IV}^{(2)} \equiv \frac{\text{Cov}(Z_2,Y)}{\text{Cov}(Z_2,D)} = \beta + \frac{\text{Cov}(Z_2,U)}{\text{Cov}(Z_2,D)}.
$$
Taking differences of the two estimands, we obtain
$$
\beta_{IV}^{(1)} - \beta_{IV}^{(2)} = \frac{\text{Cov}(Z_1,U)}{\text{Cov}(Z_1,D)} - \frac{\text{Cov}(Z_2,U)}{\text{Cov}(Z_2,D)}.
$$
If both $Z_1$ and $Z_2$ are valid instruments, then $\text{Cov}(Z_1,U) = \text{Cov}(Z_2,U) = 0$, which implies $\beta_{IV}^{(1)} - \beta_{IV}^{(2)} = 0$.
Therefore, if $\beta_{IV}^{(1)}$ and $\beta_{IV}^{(2)}$ differ, then at least one of the instruments $(Z_1,Z_2)$ must be invalid.
While it is formulated in a slightly different way, a test of overidentifying restrictions exploits this basic intuition to provide a test of the joint null hypothesis that both instruments are valid: $\text{Cov}(Z_1,U) = \text{Cov}(Z_2,U) = 0$.
While this example concerns two instruments in a model with a single endogenous regressor, the same idea applies whenever there are more instruments than endogenous regressors.
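As an illustration of this intuition, the following sketch (again Python with numpy; the data-generating process and parameter values are illustrative) constructs one valid and one invalid instrument. The two just-identified IV estimands then disagree, which is exactly the discrepancy an overidentification test picks up:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# Y = alpha + beta*D + U with beta = 2; Z1 is valid, Z2 is invalid.
U = rng.normal(size=n)
Z1 = rng.normal(size=n)                  # valid: independent of U
Z2 = 0.5 * U + rng.normal(size=n)        # invalid: Cov(Z2, U) != 0
D = 0.7 * Z1 + 0.7 * Z2 + 0.3 * U + rng.normal(size=n)
Y = 1.0 + 2.0 * D + U

def cov(a, b):
    return np.cov(a, b)[0, 1]

beta_1 = cov(Z1, Y) / cov(Z1, D)   # IV estimand using Z1 alone
beta_2 = cov(Z2, Y) / cov(Z2, D)   # IV estimand using Z2 alone

print(f"beta_IV^(1) = {beta_1:.3f}  (close to the true beta = 2)")
print(f"beta_IV^(2) = {beta_2:.3f}  (biased: Z2 is invalid)")
```

Consistent with the logic above, the gap between the two estimands reveals that at least one instrument is invalid, but not which one.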
In a model with heterogeneous treatment effects, the equivalent of instrument exogeneity does have testable implications even if there are as many endogenous regressors as instruments. See the following references for details:
- Huber, Martin, and Giovanni Mellace. "Testing instrument validity for LATE identification based on inequality moment constraints." Review of Economics and Statistics 97.2 (2015): 398-411.
- Mourifié, Ismael, and Yuanyuan Wan. "Testing local average treatment effect assumptions." Review of Economics and Statistics 99.2 (2017): 305-313.
- Kitagawa, Toru. "A test for instrument validity." Econometrica 83.5 (2015): 2043-2063.
Best Answer
Random variables $\{u_i; i=1,...,n\}$ are said to be "homoskedastic" when
$$\text{Var}(u_i) = \text{constant},\;\; \forall i$$
This property can coexist with conditional heteroskedasticity:
$$\text{Var}(u_i \mid \mathbf x_i) = h(\mathbf x_i)$$
This is because, by the Law of Total Variance, we have
$$\text{Var}(u_i) = E\big[\text{Var}(u_i \mid \mathbf x_i)\big] + \text{Var}\big[E(u_i\mid \mathbf x_i)\big] = E[h(\mathbf x_i)]+\text{Var}\big[E(u_i\mid \mathbf x_i)\big]$$
The second term is a moment of the distribution of the random variable $E(u_i\mid \mathbf x_i)$, and so a constant (irrespective of whether $E(u_i\mid \mathbf x_i)=0$ or not). The first term will be constant over $i$ if $E[h(\mathbf x_i)]$ does not depend on $i$, which holds in particular when the $\mathbf x_i$'s are identically distributed.
In other words, if the corresponding collection of regressors is "first-order stationary", then we can have conditional heteroskedasticity and unconditional homoskedasticity at the same time.
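A quick simulation may help fix ideas (a sketch assuming Python with numpy; the uniform design and the choice $h(\mathbf x_i) = x_i^2$ are illustrative): with i.i.d. $x_i$'s, the unconditional variance of $u_i$ is the same for every $i$ even though the conditional variance moves with $x_i$.

```python
import numpy as np

rng = np.random.default_rng(2)
n_draws, n_obs = 200_000, 5   # many replications of a small i.i.d. sample

# x_i i.i.d. across i ("first-order stationary" regressors).
x = rng.uniform(1.0, 3.0, size=(n_draws, n_obs))
h = x**2                                            # Var(u_i | x_i) = x_i^2
u = rng.normal(size=(n_draws, n_obs)) * np.sqrt(h)

# Unconditional variance is (approximately) the same for every i ...
print("Var(u_i), i = 1..5:", u.var(axis=0).round(2))
# ... even though the conditional variance clearly moves with x_i.
print("Var(u | x < 1.5):", u[x < 1.5].var().round(2))
print("Var(u | x > 2.5):", u[x > 2.5].var().round(2))
```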
As regards the relation/contrast between "homoskedasticity" and "exogeneity": first of all, in fairness to the textbook you mention, the authors do point out (on page 194) that the more accurate term for the property is "orthogonality", while the concept of "exogeneity" comes in "weak", "strong", and "strict" variants, each reflecting a different assumption.
Now, as regards the essence of the question: conditional homoskedasticity states that
$$\text{Var}(u_i \mid \mathbf x_i) = E(u_i^2\mid \mathbf x_i) - \left[E(u_i\mid \mathbf x_i)\right]^2 = \text{constant}$$
So it is a statement about whether moments of the distribution followed by the $u_i$'s are affected by the presence of $\mathbf x_i$ (or, in an informal informational approach, whether, if we know $\mathbf x_i$, the variation that we anticipate seeing in $u_i$, as summarized by the variance, changes compared to when we don't know $\mathbf x_i$). Keep in mind that this "variation" relates to second moments.
On the other hand, the orthogonality property states that $$E(\mathbf x_i\cdot u_i)=0$$
This is a statement about the first moment of a specific function of $\mathbf x_i$ and $u_i$, namely, their product. In a regression setting, where we assume that $E(u_i)=0$, we have that
$$E(\mathbf x_i\cdot u_i)=0 \implies \text{Cov}(\mathbf x_i, u_i)=0$$
So this is a property about whether $\mathbf x_i$ and $u_i$ tend to co-vary.
So, in general informal terms, conditional homoskedasticity and orthogonality both state that "the $\mathbf x_i$'s do not tell us something about the $u_i$'s", but this "something" is a different "something" in each case, and the two are usefully and meaningfully distinguished.
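To make the distinction concrete, here is a textbook-style example in a short sketch (Python with numpy; the setup is illustrative): with $u_i = x_i \varepsilon_i$, where $\varepsilon_i$ is independent of $x_i$ with mean zero, orthogonality holds exactly while conditional homoskedasticity fails.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

# u = x * eps with eps independent of x and E(eps) = 0.
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = x * eps

# Orthogonality holds: E[x*u] = E[x^2 * eps] = E[x^2] * E[eps] = 0 ...
print("E[x*u] =", np.mean(x * u).round(4))
# ... but conditional homoskedasticity fails: Var(u | x) = x^2 * Var(eps).
print("Var(u | |x| < 0.5):", u[np.abs(x) < 0.5].var().round(3))
print("Var(u | |x| > 1.5):", u[np.abs(x) > 1.5].var().round(3))
```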