Prove the base case for $n=2$. So we have $\overline{A_1 \cup A_2} = \overline{A_1} \cap \overline{A_2}$.
Assume it is true for $n=m$; i.e., $\overline{A_1 \cup A_2 \cup \ldots A_m}$ $=\overline{A_1} \cap \overline{A_2} \cap \ldots \overline{A_m}$.
Now, let $B = \overline{A_1 \cup A_2 \cup \ldots A_m}$. Then, $B =\overline{A_1} \cap \overline{A_2} \cap \ldots \overline{A_m}$ (by assumption).
So, $\overline{A_1 \cup A_2 \cup \ldots A_m \cup A_{m+1}}$ $= \overline{B \cup A_{m+1}}$ $= \overline{B} \cap \overline{A_{m+1}}$ (by the base case)
$=\overline{A_1 \cup A_2 \cup \ldots A_m} \cap \overline{A_3}$
$=\overline{A_1} \cap \overline{A_2} \cap \ldots \overline{A_m} \cap \overline{A_3}$
Proving base case is easy:
$x \in \overline{A_2 \cup A_2}$ $\iff x \notin A_1 \cup A_2$ $\iff x\notin A_1 \wedge x\notin A_2$ $\iff x \in \overline{A_1} \wedge x \in \overline{A_2}$ $\iff x \in \overline{A_1} \cap \overline{A_2}$
You’re quite right that the two-set case is used in the induction step, but it’s still a little tricky to work through.
Suppose that you know it for $n$. Let $B=A_1\cup\ldots\cup A_n$. Then
$$\begin{align*}
|A_1\cup\ldots\cup A_{n+1}|&=|B\cup A_{n+1}|\\
&=|B|+|A_{n+1}|-|B\cap A_{n+1}|\\
&=\sum_{k=1}^n(-1)^{k+1}\left(\sum_{1\le i_1<\ldots<i_k\le n}|A_{i_1}\cap\ldots\cap A_{i_k}|\right)+|A_{n+1}|-|B\cap A_{n+1}|\;.\tag{1}
\end{align*}$$
Now
$$\begin{align*}
|A_{n+1}|-|B\cap A_{n+1}|&=|A_{n+1}|-\left|\left(\bigcup_{k=1}^nA_k\right)\cap A_{n+1}\right|\\
&=|A_{n+1}|-\left|\bigcup_{k=1}^n(A_k\cap A_{n+1})\right|\\
&=|A_{n+1}|-\sum_{k=1}^n(-1)^{k+1}\left(\sum_{1\le i_1<\ldots<i_k\le n}|(A_{i_1}\cap A_{n+1})\cap\ldots\cap(A_{i_k}\cap A_{n+1})|\right)\\
&=|A_{n+1}|+\sum_{k=1}^n(-1)^{k+2}\left(\sum_{1\le i_1<\ldots<i_k\le n}|A_{i_1}\cap\ldots\cap A_{i_k}\cap A_{n+1}|\right)\\
&=\sum_{k=1}^{n+1}(-1)^{k+1}\left(\sum_{1\le i_1<\ldots<i_{k-1}\le n<i_k=n+1}|A_{i_1}\cap\ldots\cap A_{i_k}|\right)\tag{2}
\end{align*}$$
by the induction hypothesis applied to the $n$ sets $A_1\cap A_{n+1},\ldots,A_n\cap A_{n+1}$.
To see what’s really going on here, it’s helpful to realize that the sum $(2)$ covers all of the intersections of $A_i$s that include the new $A_{n+1}$, while the first term of $(1)$ covers the ones that don’t include $A_{n+1}$. Combining results, we now have
$$\begin{align*}
\left|\,\bigcup_{k=1}^{n+1}A_k\,\right|&=\sum_{k=1}^n(-1)^{k+1}\left(\sum_{1\le i_1<\ldots<i_k\le n}|A_{i_1}\cap\ldots\cap A_{i_k}|\right)\\
&\qquad+\sum_{k=1}^{n+1}(-1)^{k+1}\left(\sum_{1\le i_1<\ldots<i_{k-1}\le n<i_k=n+1}|A_{i_1}\cap\ldots\cap A_{i_k}|\right)\\
&=\sum_{k=1}^{n+1}(-1)^{k+1}\left(\sum_{1\le i_1<\ldots<i_k\le n+1}|A_{i_1}\cap\ldots\cap A_{i_k}|\right)\;,
\end{align*}$$
as desired.
Best Answer
For three sets (events), it is intuitively clear why the formula works: If we take the measure (probability) of the union then we count the pairwise intersections twice. So we subtract the measure (probability) of the intersections. But then we subtracted the measure (probability) of the triple intersection. So we have to add it.
If there are more sets (events) then you can argue the similar way. We added the measure of the pairwise intersections twice. So subtract them. But then we subtracted the measure (probability) of the triple intersections. So let's add them. But now we added the measure of the four-tuples of intersections. So subtract them...