[Math] How is $(x+y)^p \equiv x^p + y^p$ mod $p$ for any prime number $p$

abstract-algebracongruence-relationsmodular arithmeticnumber theory

I'm currently studying for an exam and on the practice test given to us (with solutions), there is a problem that states the following:

Notice that for all $x,y \in \Bbb Z$,

$(x+y)^2 = x^2 + 2xy + y^2 \equiv x^2 + y^2$ mod $2$

$(x+y)^3 = x^3 + 3x^2y + 3xy^2 + y^3 \equiv x^3 + y^3$ mod $3$

$(x+y)^5 = x^5 + 5x^4y + 10x^3y^2 + 10x^2y^3 + 5xy^4 + y^5 \equiv x^5+y^5$ mod $5$

Prove that if $p$ is any prime number then for all $x,y \in \Bbb Z$ we have $(x+y)^p \equiv x^p + y^p$ mod $p$.

Here is the proof that was provided to us in the solutions:

Let $p$ be a prime number. If $k$ is an integer satisfying $1 \leq k \leq p-1$, then $k! = 1\cdot2\cdot3\cdot\cdot\cdot k$ is a product of positive integeres smaller than $p$, and therefore $k!$ is not divisible by $p$. For the same reason, if $1 \leq k \leq p-1$ then $(p-k)! = 1\cdot2\cdot3\cdot\cdot\cdot (p-k)$ is not divisible by $p$. Since $p!$ obviously is divisible by $p$, we infer that

$\binom{p}{k} = \frac{p!}{k!(p-k)!} \equiv 0$ mod $p$ whenever $1 \leq k \leq p-1$.

Therefore

$(x + y)^p = x^p + y^p +\sum_{k=1}^{p-1} \binom{p}{k}x^k y^{p-k} \equiv x^p +y^p$ mod $p$.

Q.E.D.

I'm having trouble understanding a few things about this theorem and proof.

First, I'm not quite seeing the congruence (if that makes any sense). Like, I'm not seeing how $(x+y)^2 = x^2 + 2xy + y^2 \equiv x^2 + y^2$ mod $2$, or rather just in general for any power $p$, not just $2$.

I recall that we say $a \mid b$ if there exists some $c$ such that $ac = b$, and that we say if $a,b \in \Bbb Z$ and $n \in \Bbb N$, $a$ is congruent to $b$ modulo $n$ if and only if $n\mid (b-a)$. This is written as $a \equiv b$ (mod $n$) where $n$ is called the modulus, and from this we get residue classes $r$ that are all the sets of numbers of the form $a = bn + r$ where $r \in \{0,1,2,\cdot\cdot\cdot, n-1\}$. I'm not quite sure how to phrase that but I can picture what it's supposed to be in my head.

I saw that in an answer listed here Proving $(a+b)^{p} \equiv a^{p} + b^{p} \pmod{p}$ for prime $p$ that someone says "because nothing in the denominator divides $n$. So all of the middle terms in the expansion of $(a+b)^p$ vanish modulo $p$, and you're left with the two ends: $a^p+b^p$." But it still isn't making sense to me. How is it that because the denominator doesn't divide the numerator, all the messy middle terms disappear?

The next question I have is why go about proving it this way? What is the motivation for the techniques used to prove this statement? When I first saw this problem, admittedly I didn't know where to start. Part of me thought about induction but that was quickly ruled out based on the fact we're only considering prime powers. I did however immediately recognize that we're using binomial expansion and pascal's triangle, but that's about it.

I apologize for the long winded post and lots of questions, but this problem currently isn't making a lot of sense to me, so I would be incredibly grateful if somebody could explain this proof and its ideas to me. Thank you!

Best Answer

If you are familiar with it, this problem follows immediately from Fermat Little Theorem. Indeed, by FLT you have $$(x+y)^p \equiv x+y \pmod{p} \\x^p \equiv x \pmod{p} \\y^p \equiv y \pmod{p}$$

therefore $$(x+y)^p \equiv x+y \equiv x^p+y^p \pmod{p}$$

Related Question