Technicality in proof of $\binom{m+n}{l} = \sum_{k=0}^l \binom{m}{k}\binom{n}{l-k}$

abstract-algebra binomial-coefficients ring-theory

This is from Analysis I by Herbert Amann and Joachim Escher. I want to make sure I understand everything correctly, so I'm sorry if this seems nitpicky.

After introducing the formal power series ring $R[\![X]\!]$ (functions in $R^{\mathbb{N}}$) over a ring $R$ with unity, and the polynomials $R[X]$ as a subring of $R[\![X]\!]$, there is a proof of the identity
$$\binom{m+n}{l} = \sum_{k=0}^l \binom{m}{k}\binom{n}{l-k},\quad l,m,n\in\mathbb N$$
as an application of $R[X]$ being a ring.
Let $X$ denote the polynomial with $x_1=1$ and $x_i=0$ for $i\neq 1$.
Their proof is as follows:

Since $X$ and $1\in R[X]$ commute, we can use the binomial theorem for rings to compute
$$(1+X)^j=\sum_{i=0}^j\binom{j}{i}X^i,\quad j\in\mathbb{N}.$$
Now we compute the two sides of $(1+X)^m(1+X)^n=(1+X)^{m+n}$.
We have
$$\begin{align}(1+X)^m(1+X)^n &= \left(\sum_{k=0}^m\binom{m}{k}X^k\right)\left(\sum_{j=0}^n\binom{n}{j}X^j\right)\\
&=\sum_l\left( \sum_{k=0}^l\binom{m}{k}\binom{n}{l-k} \right)X^l\end{align}\tag{A}$$

where the second equality is the definition of multiplication of polynomials. Also
$$(1+X)^{m+n}=\sum_{l=0}^{m+n}\binom{m+n}{l}X^l.\tag{B}$$
Comparing coefficients in (A) and (B) gives the identity.
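
The coefficient comparison can be checked numerically for small cases. The following is only a minimal sketch and not part of the book's proof: it assumes integer coefficients, represents a polynomial by its list of coefficients, and uses the made-up helper names `poly_mul` and `one_plus_x_power`.

```python
from math import comb

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (Cauchy product)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

def one_plus_x_power(j):
    """Coefficients of (1 + X)^j, obtained by repeated multiplication by 1 + X."""
    result = [1]
    for _ in range(j):
        result = poly_mul(result, [1, 1])
    return result

m, n = 4, 3  # arbitrary small test values

# Side (A): multiply (1 + X)^m by (1 + X)^n.
lhs = poly_mul(one_plus_x_power(m), one_plus_x_power(n))

# Side (B): binomial coefficients of (1 + X)^(m + n).
rhs = [comb(m + n, l) for l in range(m + n + 1)]

# The sums appearing in the identity; comb returns 0 when the lower index exceeds the upper.
vandermonde = [
    sum(comb(m, k) * comb(n, l - k) for k in range(l + 1))
    for l in range(m + n + 1)
]

print(lhs == rhs == vandermonde)
```

All three lists agree for these values, which is what comparing coefficients in (A) and (B) predicts.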

My issue is that the binomial coefficients, which lie in $\mathbb N$, are not technically the coefficients of the polynomial, which lie in $R$. Given $r\in R$ and $n\in\mathbb N$, $n\cdot r$ is the $n$-fold sum of $r$. So, for example, isn't the $l$th coefficient of the polynomial in (B) really $\binom{m+n}{l}\cdot 1_R$?
Then the proof is really asserting that
$$\binom{m+n}{l}\cdot 1_R = \left(\sum_{k=0}^l \binom{m}{k}\binom{n}{l-k}\right)\cdot 1_R$$
for any ring $R$.

If my understanding is correct so far, I think letting $R=\mathbb Z$ recovers the original identity since $n\cdot 1_{\mathbb Z}=n$. It's just that the integers haven't been introduced yet.

Best Answer

You are technically correct. Every ring $R$ has a canonical unit map $\mathbb{Z} \to R$ given by taking multiples of the unit, and it's an extremely common abuse of notation to write $n \in R$ when we mean $n \cdot 1_R \in R$. If $R$ has positive characteristic (meaning that this map has nontrivial kernel) then $n \cdot 1_R = m \cdot 1_R$ need not imply $n = m$.
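
To make the failure of injectivity concrete, here is a small sketch. It assumes $R = \mathbb{Z}/p\mathbb{Z}$ modelled by Python integers reduced mod $p$, and `unit_map` is a hypothetical helper name, not a library function.

```python
from math import comb

def unit_map(n, p):
    """Image of the integer n under Z -> Z/pZ, i.e. n * 1_R, modelled as n mod p."""
    return n % p

p = 2
a, b = comb(4, 2), comb(2, 1)  # the integers 6 and 2

same_in_R = unit_map(a, p) == unit_map(b, p)  # both are 0 in Z/2Z
same_in_Z = a == b                            # 6 and 2 are different integers

print(same_in_R, same_in_Z)
```

Here the two binomial coefficients have the same image in $\mathbb{Z}/2\mathbb{Z}$ although they differ as integers, so an identity proved in such an $R$ does not by itself give the identity in $\mathbb{N}$.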

Of course everything is fine if we set $R = \mathbb{Z}$, as you say. I have no idea why the authors would want to work in the generality of an arbitrary ring $R$ just to prove Vandermonde's identity. There are fun things you can do with binomial coefficients $\bmod p$ that involve taking $R = \mathbb{F}_p$, though.
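
One well-known fact of this kind, chosen here purely as an illustration since the answer does not say which examples it has in mind, is that $(1+X)^p = 1 + X^p$ in $\mathbb{F}_p[X]$ for a prime $p$, because $p$ divides $\binom{p}{k}$ for $0 < k < p$. A minimal sketch, with the hypothetical helper `binomial_row_mod`:

```python
from math import comb

def binomial_row_mod(p):
    """Coefficients of (1 + X)^p with each binomial coefficient reduced mod p."""
    return [comb(p, k) % p for k in range(p + 1)]

row = binomial_row_mod(5)
print(row)  # [1, 0, 0, 0, 0, 1]
```

For a composite modulus the middle coefficients need not vanish: `binomial_row_mod(4)` gives `[1, 0, 2, 0, 1]`.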