As came out in the comments, your doubt about whether the cellular cohomology of $\mathbb{RP}^n$ is isomorphic to its singular cohomology arose because you switched the even and odd cases in $\partial$; compare Hatcher (current online edition), p. 144.
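For reference, here is a sketch of the computation with the parities in place, assuming the standard CW structure on $\mathbb{RP}^n$ with one cell in each dimension $0 \le k \le n$, so that every cellular chain group is $\mathbb{Z}$ (the names $d_k$ and $\delta$ for the cellular boundary and coboundary are my notation):

$$d_k = 1 + (-1)^k = \begin{cases} 0 & k \text{ odd} \\ 2 & k \text{ even} \end{cases} \qquad (1 \le k \le n)$$

$$H_k(\mathbb{RP}^n;\mathbb{Z}) \cong \begin{cases} \mathbb{Z} & k = 0, \text{ or } k = n \text{ odd} \\ \mathbb{Z}/2 & k \text{ odd},\ 0 < k < n \\ 0 & \text{otherwise} \end{cases}$$

Dualizing, the coboundary $\delta \colon C^{k-1} \to C^k$ is again multiplication by $1 + (-1)^k$, which gives

$$H^k(\mathbb{RP}^n;\mathbb{Z}) \cong \begin{cases} \mathbb{Z} & k = 0, \text{ or } k = n \text{ odd} \\ \mathbb{Z}/2 & k \text{ even},\ 0 < k \le n \\ 0 & \text{otherwise} \end{cases}$$

A quick sanity check on the parity: with the two cases swapped one would get $d_1 = 2$ and hence $H_0 \cong \mathbb{Z}/2$, which cannot happen for a path-connected space.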
In fact, cellular (co)homology is always isomorphic to singular (co)homology for CW complexes: Hatcher is again a good reference for this, see page 139 for homology and page 203 for cohomology. The proof for homology doesn't involve chain maps, quasi-isomorphisms or chain homotopies; rather, it hinges on an ad hoc diagram-chasing argument, in which one uses the fact that the chain groups of the cellular chain complex are the relative singular homology groups $H_i(X^i,X^{i-1})$. With $\mathbb{Z}$ coefficients it is true that there is a quasi-isomorphism (indeed a chain homotopy equivalence) between the two complexes, since it is a general fact that for complexes of free abelian groups, isomorphic homology is a sufficient condition for there to exist a chain homotopy equivalence (see here). I'm not sure whether this holds for some larger class of modules (like modules over PIDs), but for chain complexes of modules over a generic ring this is false in general, and I'm not sure whether it turns out to be true in the case of the singular and cellular complexes of a CW complex (this is what the OP seems to be asking here).
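To make the diagram chase a little more concrete, here is a sketch of how the cellular boundary map is assembled from the long exact sequences of the pairs $(X^n, X^{n-1})$, following what I recall of Hatcher's setup (the names $\partial_n$, $j_n$ and $d_n$ are notation assumed for this sketch):

$$H_n(X^n, X^{n-1}) \xrightarrow{\ \partial_n\ } H_{n-1}(X^{n-1}) \xrightarrow{\ j_{n-1}\ } H_{n-1}(X^{n-1}, X^{n-2}), \qquad d_n := j_{n-1} \circ \partial_n$$

$$d_n \circ d_{n+1} = j_{n-1} \circ (\partial_n \circ j_n) \circ \partial_{n+1} = 0, \qquad H_n(X) \cong \ker d_n / \operatorname{im} d_{n+1}$$

The composite $\partial_n \circ j_n$ vanishes because these are two consecutive maps in the long exact sequence of the pair $(X^n, X^{n-1})$.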
Again, the standard proof of the equivalence of singular and cellular cohomology doesn't operate with maps at the level of chain complexes, and is similar to the proof for homology, to the extent that the bulk of the argument is handled by the universal coefficient theorem and the homological case, if I remember correctly. Hatcher not only shows that $H^\bullet(X;G) \simeq H^\bullet_{CW}(X;G)$, but also that the cellular cochain complex is the dual complex of the cellular chain complex, which isn't evident from the definition.
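As a reminder of the ingredient being used, the universal coefficient theorem gives, for any coefficient group $G$, a split short exact sequence

$$0 \to \operatorname{Ext}(H_{n-1}(X), G) \to H^n(X;G) \to \operatorname{Hom}(H_n(X), G) \to 0.$$

The duality statement for the cellular complexes, written with the shorthand $C_n^{CW}(X) := H_n(X^n, X^{n-1})$ (my notation, not necessarily Hatcher's), then reads

$$H^n(X^n, X^{n-1}; G) \cong \operatorname{Hom}(C_n^{CW}(X), G),$$

where the $\operatorname{Ext}$ term drops out because $H_{n-1}(X^n, X^{n-1}) = 0$.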
I hope all of this answers your questions. I could add details, but Hatcher is really a very good reference for all of this.
I'm not an expert, the following is all just guesswork -- I similarly found the original papers unenlightening wrt their motivation.
As you said, the mystery mainly lies in the motivation for the additional step: modding out the functions $X^{k+1} \to R$ by the subcomplex of functions which vanish on a neighborhood of the diagonal.
First, let's justify looking at neighborhoods of a space. We know from Alexander duality the philosophy of looking at tautness of a subspace $U$ with respect to a space $Y$.
We look at a neighborhood $N$ of $U$ in $Y$ (by neighborhood, we mean a subset $N$ of $Y$ that contains $U$ in its interior). The intersection of two neighborhoods of $U$ in $Y$ is another neighborhood of $U$ in $Y$, so this gives us a directed system of groups $\{H^q(N)\}$, where $N$ ranges over all neighborhoods of $U$ in $Y$.
For each $N$, the inclusion $U \hookrightarrow N$ induces a homomorphism $H^q(N) \to H^q(U)$. The subspace $U$ is said to be "tautly embedded" in $Y$ if the induced map $\varinjlim_N H^q(N) \to H^q(U)$ from the direct limit over all such $N$ is an isomorphism for all $q$ and all coefficient groups. Tautness is known to hold, for example, when $U$ is compact and $Y$ is Hausdorff (for Alexander-Spanier cohomology).
This gives us a hint: we are probably modding out by this subcomplex in order to deal with NON compact Hausdorff spaces.
Second, let's justify looking at the diagonal. The diagonal embedding $X \xrightarrow{\Delta} X \times X$ is simply a canonical way to embed a space $X$ into an ambient space endowed with the product topology, with image $\Delta X := \{(x,x) \in X \times X\}$. It is useful when we want to look at a neighborhood of a space $X$ (e.g., at germs of functions on $X$), but $X$ sits in no ambient space. The term "diagonal embedding" comes from the example of the embedding $R^1 \hookrightarrow R^2$ taking $x \mapsto (x,x)$, that is, taking the line $R^1$ and embedding it into $R^2$ as the line $y=x$.
With this in mind, let's return our gaze to Alexander-Spanier cochains.
Here's my naive guess: modding out the functions which vanish on a neighborhood $N(X)$ of $X$ artificially forces $X$ to satisfy the condition that $$H^q(\text{functions which vanish on }N(X)) \simeq H^q(\text{functions which vanish on }X)$$ for all $N$, all $q$, and all coefficient groups. Perhaps modding out by the subcomplex lets us "falsely" satisfy the condition that $X$ is tautly embedded in $X \times X$, so that we may treat $X$ as if it were a compact space.
Below are a few additional comments toward why someone might have thought of modding out by that particular subcomplex.
Establishing notation: $X^{p+1}$ is the $(p+1)$-fold product of $X$ with itself, that is, $(x_1, \dots, x_{p+1}) \in X^{p+1}$ for $x_i \in X$.
$f^p(X) := \{$ functions $X^{p+1} \to \mathbb{Z} \}$, with pointwise addition as the group operation.
$f^p_0(X) :=$ elements of $f^p(X)$ which are zero on some neighborhood of the diagonal $\Delta X \subseteq X^{p+1}$.
If we are examining functions defined pointwise on $X$, it's natural to look at $X$ embedded in an ambient space, rather than at the space $X$ itself. That is, $N(X)$ is the natural home of the jet bundle of $X$.
Functions which vanish on $N(X)$ form a group: if $f$ and $f'$ are both zero on $N(X)$, then $f-f'$ is zero on $N(X)$.
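In this notation, the construction can be sketched as follows. This is a sketch under the usual conventions; the symbols $\delta$, $\bar{C}^p$ and $\bar{H}^p$ are labels I am introducing, and the indexing $x_1, \dots, x_{p+2}$ follows the numbering used above:

$$(\delta f)(x_1, \dots, x_{p+2}) = \sum_{i=1}^{p+2} (-1)^{i+1} f(x_1, \dots, \widehat{x_i}, \dots, x_{p+2}), \qquad \delta \circ \delta = 0$$

$$\bar{C}^p(X) := f^p(X)/f^p_0(X), \qquad \bar{H}^p(X) := H^p(\bar{C}^\bullet(X), \delta)$$

The quotient makes sense because $\delta$ maps $f^p_0(X)$ into $f^{p+1}_0(X)$: if $f$ vanishes on a neighborhood $V$ of the diagonal in $X^{p+1}$, then $\delta f$ vanishes on the set of points of $X^{p+2}$ all of whose faces $(x_1, \dots, \widehat{x_i}, \dots, x_{p+2})$ lie in $V$, and that set is again a neighborhood of the diagonal.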
I'm not sure if the following is useful, nor how it fits into the story, but I figured I'd mention it.
The natural home of jet bundles (over a space $X$) is over the diagonal of $X$. From reading this paper, it seems that Grothendieck brought to the fore the $k$th neighborhood of the diagonal of a manifold $X$ when he was porting notions of differential geometry into algebraic geometry (this was then ported back into differential geometry by Spencer, Kumpera, and Malgrange). We'll use the standard notation $\Delta X \subseteq X_{(k)} \subseteq X \times X$. The only points of $X_{(k)}$ are the diagonal points $(x, x)$, but we equip our space $X_{(k)}$ with a structure sheaf of functions, and treat $X_{(k)}$ as if it were made of "$k$-neighbor points" $(x,y)$, where $x$ and $y$ are infinitesimally close to one another (what Weil called "points proches").
To picture $X_{(1)}$, we might imagine $X$ with an infinitesimal normal bundle; for $X_{(2)}$, an ever so slightly larger infinitesimal bundle that also records second derivatives (as we need more local information to take the 2nd derivative); and so on.
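In the algebraic setting this picture has a standard formulation, sketched here assuming $\mathcal{I}$ denotes the ideal sheaf of the diagonal in $X \times X$ (the symbols $\mathcal{I}$ and $\mathcal{O}$ are notation introduced for this sketch):

$$\mathcal{O}_{X_{(k)}} = \mathcal{O}_{X \times X} / \mathcal{I}^{k+1}, \qquad \mathcal{I}/\mathcal{I}^2 \cong \Omega^1_X$$

So the functions on $X_{(1)}$ that vanish on the diagonal are the sections of $\mathcal{I}/\mathcal{I}^2$, i.e. 1-forms (for instance, the class of $g(y) - g(x)$ corresponds to $dg$), and functions on $X_{(k)}$ record Taylor data up to order $k$.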
If we think of a function $\omega: X_{(k)} \to R$ which vanishes on $X \subseteq X_{(k)}$ as a “differential k-form,” then maybe:
- the functions which vanish to the first order can be thought of as closed forms, $d\omega = 0$,
- the functions which vanish to the second order on the diagonal $X \subseteq X_{(k+1)}$ can be thought of as exact forms, for they satisfy $\omega = d\beta$, so that $d\omega = d(d\beta) = 0$.
Best Answer
The idea is that a cochain $\varphi \in C^n(X)$ is compactly supported if there is a compact subset $K \subseteq X$ such that $\varphi|_{C_n(X \setminus K)} = 0$.
A small remark (edit): for every compact subset $K$ of $X$ there is an inclusion $i \colon X \setminus K \hookrightarrow X$, which gives rise to an injective map of chain complexes $i_* \colon C_\bullet(X \setminus K) \to C_\bullet (X)$. So we can think of $C_n(X \setminus K)$ as a submodule of $C_n(X)$, and, to be exact, what I meant above by $\varphi|_{C_n(X \setminus K)}$ should be written more formally as $\varphi|_{i_*(C_n(X \setminus K))}$.
So the compactly supported cochains of $X$ are those cochains in $C^\bullet(X)$ that vanish on all the simplices whose image is contained in a subspace $X \setminus K$ (for some compact subset $K$ of $X$), i.e. on those simplices $\sigma \colon \Delta^n \to X$ that factor through the inclusion map $i \colon X \setminus K \to X$.
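In formulas, this can be summarized as follows. This is a sketch; the notation $C^n_c$ and $H^n_c$ is what I recall from Hatcher, and $C^n(X, X \setminus K)$ denotes the cochains vanishing on $i_*(C_n(X \setminus K))$:

$$C^n_c(X) = \bigcup_{K \text{ compact}} C^n(X, X \setminus K) \subseteq C^n(X), \qquad H^n_c(X) \cong \varinjlim_{K} H^n(X, X \setminus K)$$

These groups form a subcomplex of $C^\bullet(X)$: if $\varphi$ vanishes on chains in $X \setminus K$, then so does $\delta\varphi = \varphi \circ \partial$, since the boundary of a simplex in $X \setminus K$ is again a chain in $X \setminus K$. As a basic example, $H^k_c(\mathbb{R}^n;\mathbb{Z})$ is $\mathbb{Z}$ for $k = n$ and $0$ otherwise.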
You can find out more about this in Hatcher's book Algebraic Topology.