I'm not an expert, the following is all just guesswork -- I similarly found the original papers unenlightening wrt their motivation.
As you said, the mystery mainly lies in the motivation of the additional step: modding out the functions from $X^{k+1} \to R$ by the subcomplex of functions which disappear on the neighborhood of the diagonal.
First, let's justify looking at neighborhoods of a space. We know from Alexander duality the philosophy of looking at tautness of a subspace $U$ with respect to a space $Y$.
We look at neighborhoods $N$ of $U$ in $Y$ (by neighborhood, we mean a subset $N$ of $Y$ that contains $U$ in its interior). The intersection of two neighborhoods of $U$ in $Y$ will be another neighborhood of $U$ in $Y$, so this gives us a system of groups $\{H^q(N)\}$ where $N$ ranges over all neighborhoods of $U$ in $Y$.
For each $N$, we have an inclusion $U \subseteq N$, which induces a homomorphism $H^q(N) \to H^q(U)$. The subspace $U$ is said to be "tautly embedded" in $Y$ if this is an isomorphism for all $q$, all $N$, and all coefficient groups. Being taut implies that $U$ is compact and $Y$ is Hausdorff.
This gives us a hint: we are probably modding out by this subcomplex in order to deal with NON compact Hausdorff spaces.
Second, let's justify looking at the diagonal. The diagonal embedding $X \xrightarrow{\Delta} X \times X$ is simply a canonical way to embed a space $X$ into an ambient space endowed with the product topology, with image $\Delta X := \{(x,x) \in X \times X\}$. It is useful when we want to look in a neighborhood of a space $X$ (e.g., at germs of functions on $X$), but $X$ sits in no ambient space. The term "diagonal embedding" comes from the example of embedding $R^1 \hookrightarrow R^2$ by $x \mapsto (x,x)$, that is, taking the line $R^1$ and embedding it into $R^2$ as the line $y=x$.
With this in mind, let's return our gaze to Alexander-Spanier cochains.
Here's my naive guess: modding out functions which disappear on any neighborhood of $X$, $N(X)$, artificially forces $X$ to satisfy the condition that $$H^q(\text{functions which disappear on }N(X)) \simeq H^q(\text{functions which disappear on }X)$$ for all $N$, all $q$, and all coefficient groups. Perhaps modding out by the subcomplex lets us "falsely" satisfy that $X$ is tautly embedded in $X \times X$, so that we may treat $X$ as if it were a compact space.
Below are a few additional comments toward why someone might have thought of modding out by that particular subcomplex.
Establishing notation: $X^{p+1}$ is the $(p+1)$-fold product of $X$ with itself, that is, for $x_i \in X$, $(x_1, \dots, x_{p+1}) \in X^{p+1}$.
$f^p(X) := \{$ functions $X^{p+1} \to \mathbb{Z} \}$, with functional addition as the group operation.
$f^p_0(X) :=$ elements of $f^p(X)$ which are zero in the neighborhood of the diagonal $\Delta X^{p+1}$
If we are examining functions defined pointwise on $X$, it’s natural to look at $X$ embedded in an ambient space, rather than at the space $X$ itself. That is, $N(X)$ is the natural home of the jet bundle of $X$.
Functions which disappear on $N(X)$ form a group. If $f$ and $f’$ are both zero on $N(X)$ then $f-f’$ is zero on $N(X)$.
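For concreteness, here is how I understand the standard coboundary on these functions, written in the indexing above. The symbols $\delta$ and $\bar{C}^p(X)$ are my own labels, not taken from the notes above, and the sign convention is one of several equivalent choices:

$$(\delta f)(x_1, \dots, x_{p+2}) = \sum_{i=1}^{p+2} (-1)^{i+1} f(x_1, \dots, \widehat{x_i}, \dots, x_{p+2}), \qquad f \in f^p(X),$$

where $\widehat{x_i}$ means that entry is omitted. One checks that $\delta \circ \delta = 0$, and that if $f$ is zero on a neighborhood of the diagonal in $X^{p+1}$ then $\delta f$ is zero on a neighborhood of the diagonal in $X^{p+2}$, so $\delta$ sends $f^p_0(X)$ into $f^{p+1}_0(X)$. That is what makes the quotient

$$\bar{C}^p(X) := f^p(X) / f^p_0(X)$$

a cochain complex, and its cohomology is the Alexander-Spanier cohomology of $X$ (here with $\mathbb{Z}$ coefficients).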
I'm not sure if the following is useful, nor how it fits into the story, but I figured I'd mention it.
The natural home of jet bundles (over a space $X$) is over the diagonal of $X$. From reading this paper, it seems that Grothendieck brought to the fore the $k$th neighborhood of the diagonal of a manifold $X$ when he was porting notions of differential geometry into algebraic geometry (this was then ported back into differential geometry by Spencer, Kumpera, and Malgrange). We'll use the standard notation $\Delta X \subseteq X_{(k)} \subseteq X \times X$. The only points of $X_{(k)}$ are the diagonal points $(x, x)$, but we equip our space $X_{(k)}$ with a structure sheaf of functions, and treat $X_{(k)}$ as if it were made of "$k$-neighbor points" $(x,y)$, where $x$ and $y$ are the closest points to one another (what Weil called "points proches").
To picture $X_{(1)}$, we might imagine $X$ with an infinitesimal normal bundle; for $X_{(2)}$, an ever so slightly larger infinitesimal bundle that also records second derivatives (as we need more local information to take the 2nd derivative); and so on.
If we think of a function $\omega: X_{(k)} \to R$ which vanishes on $X \subseteq X_{(k)}$ as a “differential k-form,” then maybe:
- the functions which vanish to the first order can be thought of as closed forms, $d\omega = 0$,
- the functions which vanish to the second order on the diagonal $X \subseteq X_{(k+1)}$ can be thought of as exact forms for they satisfy $\omega = d\beta$, s.t. $d(\omega) = d(d\beta) = 0$.
Best Answer
You can indeed make use of the proof from Hatcher Section 2.B. Presumably what that result gives you is trivial reduced homology in all dimensions for $S^n - h(D^k)$.
The way you use it is not to decompose $S^3 - (\text{Im}(f_1) \cup \text{Im}(f_2))$ but instead to decompose $S^3$ itself.
Let $U_1 = S^3 - \text{Im}(f_1)$ and let $U_2 = S^3 - \text{Im}(f_2)$, and so $U_1 \cap U_2 = S^3 - (\text{Im}(f_1) \cup \text{Im}(f_2))$.
So you want to compute the homology of $U_1 \cap U_2$. You can do this using the Mayer-Vietoris sequence for the union $S^3 = U_1 \cup U_2$.
Consider for example this portion of the sequence: $$\underbrace{H_3(U_1)}_{\approx 0} \oplus \underbrace{H_3(U_2)}_{\approx 0} \to \underbrace{H_3(U_1 \cup U_2)}_{H_3(S^3)\approx\mathbb Z} \to H_2(U_1 \cap U_2) \to \underbrace{H_2(U_1)}_{\approx 0} \oplus \underbrace{H_2(U_2)}_{\approx 0} $$
It follows that $H_2(U_1 \cap U_2) \approx \mathbb Z$. Furthermore, the proof of the Mayer-Vietoris sequence is sufficiently explicit that you should be able to use it to produce an explicit 2-cycle representing the generator of $H_2(U_1 \cap U_2)$. The word "explicit" must be taken with a grain of salt, of course, because you are not being given the maps $f_1$ and $f_2$ explicitly. Really what you will do is to produce a formula for that 2-cycle that is expressed in terms of $f_1$ and $f_2$.
A similar method for $H_1(U_1 \cap U_2)$ works as well, with an even simpler outcome. It also works for $H_0(U_1 \cap U_2)$ but make sure to use reduced homology to save yourself some headaches.
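As a sketch of how those two cases go, assuming (as above) that $\tilde H_*(U_1)$ and $\tilde H_*(U_2)$ vanish in all dimensions, that $U_1 \cap U_2$ is nonempty, and that the images of $f_1$ and $f_2$ are disjoint so that $U_1 \cup U_2 = S^3$, the reduced Mayer-Vietoris sequence gives for $n = 1$ and $n = 0$:

$$\underbrace{\tilde H_{n+1}(S^3)}_{\approx 0} \to \tilde H_n(U_1 \cap U_2) \to \underbrace{\tilde H_n(U_1)}_{\approx 0} \oplus \underbrace{\tilde H_n(U_2)}_{\approx 0}$$

By exactness, $\tilde H_1(U_1 \cap U_2) \approx 0$ and $\tilde H_0(U_1 \cap U_2) \approx 0$. The second statement says that $U_1 \cap U_2$ is path connected, so the unreduced groups are $H_0 \approx \mathbb Z$ and $H_1 \approx 0$.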