The discrepancy results from the fact that you used the geometric distribution supported on the set $\{1,2,3,\ldots\}$ (number of trials needed to get one success, which is $1$ if there's a success on the first trial), whereas the article used the geometric distribution supported on the set $\{0,1,2,3,\ldots\}$ (number of trials before the first success, which is $0$ if there's a success on the first trial). Also, which "negative binomial distribution" are you using? The the distribution of the number of trials needed to get $r$ successes (counting both the successes and the failures among the trials) or the distribution of the number of failures before the $r$th success (or of successes before the $r$th failure, which is how the article phrases it)? The latter distribution is supported on $\{0,1,2,3,\ldots\}$ and the former on $\{r,r+1,r+2,\ldots\}$. The latter is more interesting in some ways because it allows $r$ to be a non-integer and gives you an actually infinitely divisible family of probability distributions.
Since we are looking at an infinite number of trials, after removing a finite prefix of trials, we still have an infinite number of trials. Therefore, we can simply ignore this prefix for the probability of $E$ (even the conditions for $E$ would already be fulfilled by the prefix) and therefore $P(E) = P(E \mid B)$ for any event $B$ that restricts the outcomes of the trials only for a finite prefix.
This hopefully answers your question, but the partial solution you gave above seems rather complicated to me. Thankfully, I was able to find a pdf version of the book via Google. There, the author gives a rather complicated probability for $P(E)$ (not included in your question). That surprised me, because intuitively I thought, $P(E) = 1$.
Thinking about that in more detail, I am now convinced that, indeed, $P(E) = 1$: Let $A$ be the event, that we consecutively have $n$ successes and $m$ failures in $n + m$ trials. Clearly, $P(A) ≠ 0$, because $P(A) = p^n ⋅ q^m$ (assuming $0 < p < 1$).
Now let us iterate these $n + m$ trials and let $E'$ be the event that event $A$ occurs in the first block of $n + m$ trials, or in the second block of $n + m$ trials, or in the third block, etc. Since $P(A) ≠ 0$ and therefore $P(\overline{A}) ≠ 1$ it is easy to see that $P(\overline{E'}) = 0$, since we are looking at an infinite number of trials. Therefore we have $P(E') = 1$.
Since $E' ⊆ E$, it follows that also $P(E) = 1$.
For short: In an infinite sequence of trials you are able to find any finite pattern with probability $1$.
Therefore I suspect that there is a mistake in the solution in the book – unless I made a mistake myself or misunderstood the problem.
Best Answer
Negative binomial distribution is not applicable here. It calculates probability to get $x+r$ trials until total number of successes reach $r$. And there successes need not to be consecutive. And you are interested in $c$ consecutive successes. So your trials must ended with $c$ successes. Say, for
$x=2$: only $(SS)$ is valid,
$x=3$: only $(FSS)$,
$x=4$: either $(SFSS)$ or $(FFSS)$,
$x=5$: $(FFFSS)$ or $(FSFSS)$ or $(SFFSS)$
and so on. Note that before two last successes you must get failure. So for $x$ trials the last three places are occupied and the previous $x-3$ places are those that experiment cannot be finished earlier: $$ P(x)=\mathbb P(\underbrace{**\ldots*}_{x-3}FSS) =(1-P(2)-\ldots-P(x-3))\cdot (1-p)p^2. $$ Here $1-P(2)-\ldots-P(x-3)$ is the probability that the first $x-3$ trials do not contain two consecutive successes and experiment was not finished for $x-3$ trials or earlier.
So $P(2)=p^2$, $P(3)=(1-p)p^2=P(4)$, $$ P(5)=(1-P(2))\cdot(1-p)p^2, $$ $$ P(6)=(1-P(2)-P(3))\cdot(1-p)p^2, $$ $$ P(7)=(1-P(2)-P(3)-P(4))\cdot(1-p)p^2, $$ $$ P(8)=(1-P(2)-P(3)-P(4)-P(5))\cdot(1-p)p^2. $$