For the first problem, you did not quite use inclusion/exclusion. Let us count the number that contain the substring $121$. From your $6\cdot 9^5$ we need to subtract the sequences that have been double-counted by $6\cdot 9^5$.
How many sequences are there that have two or more substrings of shape $121$? It is trickier than it looks. We have counted $121121xy$ twice, but we also have counted $12121xyz$ twice.
After you do the subtraction, it will turn out you have subtracted too much, for example the sequences $1212121x$, so they will have to be added back.
I think $(3)$ is misleading. Since $E_k$ is the set of all sample points in exactly $k$ of the $A_i$ sets, $(3)$ makes no sense at all to me.
For $(3)$ I suggest we let each $E_k$ be a disjoint union:
$$E_k = \bigcup_{i=1}^{\binom{n}{k}} D_{k,i}$$
where each $D_{k,i}$ is the subset of $E_k$ where the $k$ $A_i$'s are a distinct $k$ of the sets $A_1,\ldots,A_n$.
Then, for $(3),\;$ we assume WLOG, for any given $k,i,\;$ that $D_{k,i}$ is contained in $A_1,\ldots,A_k$ (and in no others) and proceed as I think the author intends, with $D_{k,i}$ instead of $E_k$. Note that, by the additivity of probability:
$$P(E_k) = \sum_{i=1}^{\binom{n}{k}} P(D_{k,i}).$$
The idea of $(5)$ is that we have
\begin{eqnarray*}
P\left(\bigcup_{i=1}^{n}A_i\right) &=& \sum_{k=1}^{n} P(E_k) \qquad\qquad\text{by $(1)$} \\
&=& \sum_{k=1}^{n} \sum_{i=1}^{\binom{n}{k}} P(D_{k,i}) \qquad\text{by $(3)$}\qquad\qquad\qquad\text{(*)} \\
\end{eqnarray*}
By $(3),\;$ any $P(D_{k,i})$ appears in the expression $P_1-P_2+P_3-\cdots\pm P_k,\;$ exactly $\left(k-\binom{k}{2}+\binom{k}{3}-\cdots\pm\binom{k}{k}\right)$ times, and this equals $1$ by $(4)$.
So, from $(*),\;$ we can conclude that $P(\cup_{i=1}^{n}A_i) = P_1-P_2+P_3-\cdots\pm P_n,\;$ as required.
Best Answer
For a generalized inclusion-exclusion theorem, see this answer. In that answer, the Theorem says that the number of items in exactly $k$ of the sets is $$ \sum_{j=0}^m(-1)^{j-k}\binom{j}{k}N(j) $$ Corollary 2 says that the number of items in at least $k$ of the sets is $$ \bbox[5px,border:2px solid #C0A000]{\sum_{j=k}^m(-1)^{j-k}\binom{j-1}{j-k}N(j)} $$ where $\binom{-1}{n}=$$(-1)^n\binom{n}{n}$$=(-1)^n$$[n\ge0]$.
Proof of the Identity
If $k\ge s$, then $$ \begin{align} \sum_{i=0}^{k-s}(-1)^i\binom{s-1+i}{s-1}\binom{k}{s+i} &=\sum_{i=0}^{k-s}(-1)^i\binom{s-1+i}{i}\binom{k}{k-s-i}\tag{1}\\ &=\sum_{i=0}^{k-s}\binom{-s}{i}\binom{k}{k-s-i}\tag{2}\\ &=\binom{k-s}{k-s}\tag{3}\\[8pt] &=1\tag{4} \end{align} $$ Explanation:
$(1)$: $\binom{n}{k}=\binom{n}{n-k}$ for $n\ge0$ and $n\in\mathbb{Z}$
$(2)$: $\binom{-n}{k}=(-1)^k\binom{n+k-1}{k}$
$(3)$: Vandermonde's Identity
$(4)$: $\binom{n}{n}=1$ for $n\ge0$ and $n\in\mathbb{Z}$