I'm not an expert, the following is all just guesswork -- I similarly found the original papers unenlightening wrt their motivation.
As you said, the mystery mainly lies in the motivation for the additional step: modding out the functions $X^{k+1} \to R$ by the subcomplex of functions which vanish on a neighborhood of the diagonal.
First, let's justify looking at neighborhoods of a space. We know from Alexander duality the philosophy of looking at tautness of a subspace $U$ with respect to a space $Y$.
We look at a neighborhood $N$ of $U$ in $Y$ (by neighborhood, we mean a subset $N$ of $Y$ that contains $U$ in its interior). The intersection of two neighborhoods of $U$ in $Y$ is again a neighborhood of $U$ in $Y$, so this gives us a directed system of groups $\{H^q(N)\}$ where $N$ ranges over all neighborhoods of $U$ in $Y$.
For each $N$, we have an inclusion $U \subseteq N$, which induces a homomorphism $H^q(N) \to H^q(U)$. The subspace $U$ is said to be "tautly embedded" in $Y$ if the induced map $\varinjlim_N H^q(N) \to H^q(U)$ is an isomorphism for all $q$ and all coefficient groups. For Čech or Alexander-Spanier cohomology, this holds for instance when $U$ is compact and $Y$ is Hausdorff.
This gives us a hint: we are probably modding out by this subcomplex in order to deal with NON compact Hausdorff spaces.
Second, let's justify looking at the diagonal. The diagonal embedding $X \xrightarrow{\Delta} X \times X$ is simply a canonical way to embed a space $X$ into an ambient space endowed with the product topology, with image $\Delta X := \{(x,x) \in X \times X\}$. It is useful when we want to look in a neighborhood of a space $X$ (e.g., at germs of functions on $X$), but $X$ sits in no ambient space. The term "diagonal embedding" comes from the example of the embedding $R^1 \hookrightarrow R^2$ taking $x \mapsto (x,x)$, that is, taking the line $R^1$ and embedding it into $R^2$ as the line $y=x$.
With this in mind, let's return our gaze to Alexander-Spanier cochains.
Here's my naive guess: modding out functions which vanish on any neighborhood $N(X)$ of $X$ artificially forces $X$ to satisfy the condition that $$H^q(\text{functions which vanish on }N(X)) \simeq H^q(\text{functions which vanish on }X)$$ for all $N$, all $q$, and all coefficient groups. Perhaps modding out by the subcomplex lets us "falsely" satisfy that $X$ is tautly embedded in $X \times X$, so that we may treat $X$ as if it were a compact space.
Below are a few additional comments toward why someone might have thought of modding out by that particular subcomplex.
Establishing notation: $X^{p+1}$ is the $(p+1)$-fold product of $X$ with itself, that is, $(x_1, \dots, x_{p+1}) \in X^{p+1}$ for $x_i \in X$.
$f^p(X) := \{$ functions $X^{p+1} \to \mathbb{Z} \}$, with pointwise addition as the group operation.
$f^p_0(X) :=$ elements of $f^p(X)$ which are zero in a neighborhood of the diagonal $\Delta X^{p+1}$
If we are examining functions defined pointwise on $X$, it's natural to look at $X$ embedded in an ambient space, rather than at the space $X$ itself. That is, $N(X)$ is the natural home of the jet bundle of $X$.
Functions which vanish on $N(X)$ form a group: if $f$ and $f'$ are both zero on $N(X)$, then $f-f'$ is zero on $N(X)$.
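To make the quotient concrete, here is a sketch of the standard coboundary written in the notation above. The symbol $\delta$ and the name $\bar C^p(X)$ are my own labels for illustration.

$$(\delta f)(x_1, \dots, x_{p+2}) = \sum_{i=1}^{p+2} (-1)^{i+1} f(x_1, \dots, \widehat{x_i}, \dots, x_{p+2}), \qquad \delta \circ \delta = 0 .$$

If $f$ vanishes on a neighborhood $V$ of the diagonal in $X^{p+1}$, then $\delta f$ vanishes on the set of points of $X^{p+2}$ all of whose faces $(x_1, \dots, \widehat{x_i}, \dots, x_{p+2})$ lie in $V$. That set is a finite intersection of preimages of $V$ under continuous face maps, so it is again a neighborhood of the diagonal. Hence $\delta f^p_0(X) \subseteq f^{p+1}_0(X)$, and the quotient

$$\bar C^p(X) := f^p(X) / f^p_0(X)$$

inherits a coboundary and is a cochain complex.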
I'm not sure if the following is useful, nor how it fits into the story, but I figured I'd mention it.
The natural home of jet bundles (over a space $X$) is over the diagonal of $X$. From reading this paper, it seems that Grothendieck brought to the fore the $k$th neighborhood of the diagonal of a manifold $X$ when he was porting notions of differential geometry into algebraic geometry (this was then ported back into differential geometry by Spencer, Kumpera, and Malgrange). We'll use the standard notation $\Delta X \subseteq X_{(k)} \subseteq X \times X$. The only points of $X_{(k)}$ are the diagonal points $(x, x)$, but we equip our space $X_{(k)}$ with a structure sheaf of functions, and treat $X_{(k)}$ as if it is made of "k-neighbor points" $(x,y)$, where $x$ and $y$ are the closest points to one another (what Weil called "points proches").
To picture $X_{(1)}$, we might imagine $X$ with an infinitesimal normal bundle; for $X_{(2)}$, an ever so slightly larger infinitesimal bundle that also records second derivatives (as we need more local information to take the 2nd derivative); and so on.
If we think of a function $\omega: X_{(k)} \to R$ which vanishes on $X \subseteq X_{(k)}$ as a “differential k-form,” then maybe:
- the functions which vanish to the first order can be thought of as closed forms, $d\omega = 0$,
- the functions which vanish to the second order on the diagonal $X \subseteq X_{(k+1)}$ can be thought of as exact forms for they satisfy $\omega = d\beta$, s.t. $d(\omega) = d(d\beta) = 0$.
Cech cohomology groups can be defined for very general pairs $(X,A)$, but best results are obtained for compact (including Hausdorff!) pairs. This completely suffices for your question.
Cech cohomology has a number of interesting properties.
1. For compact polyhedral pairs (or more generally compact CW pairs) it agrees with singular cohomology.
2. Continuity: If $(X,A)$ is the inverse limit of an inverse system $(\mathbf X, \mathbf A)$ of compact pairs $(X_\alpha,A_\alpha)$, i.e. $(X,A) = \varprojlim (\mathbf X, \mathbf A)$, then $\check H^*(X,A)$ is the direct limit of Cech cohomology groups $\varinjlim \check H^*(\mathbf X, \mathbf A)$.
3. Strong excision: $\check H^*(X,A) = \check H^*(X/A,*)$. The latter is the reduced Cech cohomology of the quotient $X/A$.
The Hawaiian earring is the inverse limit of an inverse sequence $X_n= $ wedge of $n$ copies $S^1_i$ of the circle. The bonding maps $p_n : X_{n} \to X_{n-1}$ are the retractions mapping $S^1_n$ to the basepoint. Using 1. and 2. we get
$$\check H^1(X,S) = \check H^1(X) = \Sigma = \bigoplus_{i=1}^\infty \mathbb Z .$$
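Spelled out as a sketch: agreement with singular cohomology on compact polyhedra gives $\check H^1(X_n) \cong \mathbb Z^n$ for the finite wedge $X_n$, and the retraction $p_n$ induces the inclusion of the first $n-1$ coordinates, $p_n^* : \mathbb Z^{n-1} \hookrightarrow \mathbb Z^n$. Continuity then gives

$$\check H^1(X) = \varinjlim \left( \mathbb Z \hookrightarrow \mathbb Z^2 \hookrightarrow \mathbb Z^3 \hookrightarrow \cdots \right) = \bigoplus_{i=1}^\infty \mathbb Z .$$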
The first singular cohomology group of the Hawaiian earring is also $\bigoplus_{i=1}^\infty \mathbb Z$. To see this, we can use the exact universal coefficient sequence
$$0 \to Ext(H_{n-1}(X),\mathbb Z) \to H^n(X) \to Hom(H_n(X),\mathbb Z) \to 0$$
For $n = 1$ we have $H_0(X) = \mathbb Z$, which is free, so $Ext(H_0(X),\mathbb Z) = 0$, thus
$$H^1(X) \approx Hom(H_1(X),\mathbb Z).$$
The first singular homology group of the Hawaiian earring has the form $\Pi \oplus (\Pi/\Sigma)$, where $\Pi = \prod_{i=1}^\infty \mathbb Z$ is the Baer-Specker group (see here). By the way, the first Cech homology group is $\Pi$. We therefore get
$$H^1(X) \approx Hom(\Pi,\mathbb Z) \oplus Hom(\Pi/\Sigma,\mathbb Z) .$$
The first summand is isomorphic to $\Sigma$ (see here). The second summand can be identified with the subgroup of $Hom(\Pi,\mathbb Z)$ annihilating $\Sigma$. It is therefore a free Abelian group of (at most) countably infinite rank.
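The second summand can be pinned down further. The following is a sketch that assumes Specker's theorem in the form that every homomorphism $\varphi : \Pi \to \mathbb Z$ is determined by its values on the unit vectors $e_i \in \Sigma$, only finitely many of which are nonzero (the names $\varphi$ and $e_i$ are mine):

$$\varphi(x) = \sum_{i=1}^\infty \varphi(e_i)\, x_i , \qquad \varphi|_\Sigma = 0 \;\Longrightarrow\; \varphi = 0 .$$

Under that assumption $Hom(\Pi/\Sigma,\mathbb Z) = 0$, so $H^1(X) \approx \Sigma = \bigoplus_{i=1}^\infty \mathbb Z$.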
Here is an example where you can see that in general you cannot expect $\check H^n(X,S) = H^n(X,S)$ in the top dimension.
Let $X = S^2$ and $S = W$ be a copy of the Warsaw circle. The singular cohomology of $W$ is trivial in all positive dimensions, whereas its Cech cohomology agrees with that of $S^1$. Considering the long exact cohomology sequences of the pair $(S^2,W)$ we get
$$ 0 = H^1(W) \to H^2(S^2,W) \to H^2(S^2) = \mathbb Z \to H^2(W) = 0 ,$$
i.e. $H^2(S^2,W) = \mathbb Z$, and
$$ 0 = \check H^1(S^2) \to \check H^1(W) = \mathbb Z \to \check H^2(S^2,W) \to \check H^2(S^2) = \mathbb Z \to \check H^2(W) = 0 .$$
This short exact sequence splits and we get $\check H^2(S^2,W) = \mathbb Z \oplus \mathbb Z$.
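In more detail, as a sketch: since $\check H^1(S^2) = 0$ and $\check H^2(W) = 0$, the exact sequence reduces to

$$0 \to \mathbb Z \to \check H^2(S^2,W) \to \mathbb Z \to 0 ,$$

and a short exact sequence of abelian groups whose right-hand term is free always splits, so the middle term is the direct sum of the outer two.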
Best Answer
This is only a partial answer.
In
one can find a proof that Alexander-Spanier cohomology and Cech cohomology agree for all pairs (see Theorem 2).
Wikipedia says
But in fact Cech cohomology groups agree with singular cohomology groups for all CW-complexes. Here are some references.
In
Morita, Kiiti. "Čech cohomology and covering dimension for topological spaces." Fundamenta Mathematicae 87.1 (1975): 31-52.
one finds a variant of Cech cohomology based on normal open covers instead of arbitrary open covers. In paracompact spaces all open covers are normal, thus the "usual" Cech cohomology groups agree with the "Morita-Cech" cohomology groups for paracompact spaces.
In
Mardešić, Sibe, and Jack Segal. Shape theory: the inverse system approach. Elsevier, 1982.
Morita's approach is taken up in modified form (Chapter II §3.2). The Morita-Cech cohomology groups of a space $X$ are defined as $\check H^n(X) = \varinjlim H^n(\mathbf X)$, where the inverse system $\mathbf X$ occurs in a so-called $HPol$-expansion $\mathbf p : X \to \mathbf X$. One can take for example the Cech-expansion (based on normal coverings). If $X$ has the homotopy type of a polyhedron (which is true for CW complexes and ANRs), then one can take the trivial $HPol$-expansion consisting of the trivial inverse system $\mathbf X= (X)$ and $\mathbf p = id$. We see that $\check H^n(X) = H^n(X)$. Now observe that CW complexes are paracompact.