The two constructions are equivalent and their equivalence is based on the so-called thinning of Poisson processes.
Klenke starts from a homogeneous Poisson process with a large rate $\lambda$. Amongst the times of this process, when at $x$, a relative proportion $1+q(x,x)/\lambda$ is used to jump from $x$ to $x$ and, for every $y\ne x$, a relative proportion $q(x,y)/\lambda$ is used to jump from $x$ to $y$. The jumps $x\to x$ have no effect, hence one is left with a proportion $q(x,y)/\lambda$ of jumps $x\to y$ amongst a global population of potential jump times with density $\lambda$, that is, with the correct rate $q(x,y)$.
The only condition for this construction to work is $1+q(x,x)/\lambda\geqslant0$ for every $x$, that is, $\lambda\geqslant\sup\limits_x[-q(x,x)]$, which requires this supremum to be finite. One can choose, as many authors do, $\lambda=\sup\limits_x[-q(x,x)]$, but any larger value of $\lambda$ will do as well.
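To make the thinning concrete, here is a minimal Python sketch of this construction on a finite state space. The function names (`uniformized_kernel`, `simulate`), the list-of-lists encoding of $Q$ and the example matrix are assumptions made for illustration; they are not taken from Klenke.

```python
import random


def uniformized_kernel(Q, lam=None):
    """Return (lam, P) with P(x,y) = 1{x=y} + q(x,y)/lam.

    Q is a finite generator given as a list of rows (hypothetical encoding).
    By default lam is the smallest admissible value, max_x [-q(x,x)].
    """
    n = len(Q)
    min_lam = max(-Q[x][x] for x in range(n))
    if lam is None:
        lam = min_lam
    if lam <= 0 or lam < min_lam:
        raise ValueError("need lam > 0 and lam >= max_x [-q(x,x)]")
    P = [[(1.0 if x == y else 0.0) + Q[x][y] / lam for y in range(n)]
         for x in range(n)]
    return lam, P


def simulate(Q, x0, t_max, lam=None, rng=random):
    """Simulate the chain on [0, t_max] by thinning a rate-lam Poisson process."""
    lam, P = uniformized_kernel(Q, lam)
    t, x = 0.0, x0
    path = [(0.0, x0)]
    while True:
        # Next point of the homogeneous Poisson process with rate lam.
        t += rng.expovariate(lam)
        if t > t_max:
            return path
        # At that point, move according to row x of P; y == x is a dummy jump.
        y = rng.choices(range(len(Q)), weights=P[x])[0]
        if y != x:
            x = y
            path.append((t, x))


# Example generator (made up for illustration): each row sums to zero.
Q = [[-2.0, 2.0, 0.0],
     [1.0, -3.0, 2.0],
     [0.0, 4.0, -4.0]]
print(simulate(Q, x0=0, t_max=5.0))
```

Passing a larger `lam` only adds more dummy jumps $x\to x$; the law of the recorded path should be unchanged.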
Norris's construction may well be the more usual one, hence I will not comment on it here, except to note that $\lambda$, the initial distribution in Norris, is in no way related to $\lambda$, the positive real number in Klenke. (My impression is that Klenke's version, being more elegant, is slowly replacing the other one in probabilists' minds.)
Edit: The piece of Norris's construction missing from your account is that $\Pi$ is related to $Q$ through $\Pi(x,x)=0$ for every $x$, and, for every $y\ne x$,
$$
\Pi(x,y)=\frac{q(x,y)}{q(x)}\quad\text{with}\quad q(x)=-q(x,x)=\sum_{z\ne x}q(x,z).
$$
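As a rough illustration of this relation, the sketch below builds $\Pi$ and the rates $q(x)$ from a finite $Q$, then runs the jump chain with exponential holding times. The names `jump_chain` and `simulate_jump_chain` and the example matrix are hypothetical, and the treatment of states with $q(x)=0$ (set $\Pi(x,x)=1$ and stop) is a convention assumed here, since the formula above does not cover that case.

```python
import random


def jump_chain(Q):
    """Return (rates, Pi) with rates[x] = q(x) = -q(x,x) and Pi(x,y) = q(x,y)/q(x)."""
    n = len(Q)
    rates = [-Q[x][x] for x in range(n)]
    Pi = [[0.0] * n for _ in range(n)]
    for x in range(n):
        if rates[x] > 0:
            for y in range(n):
                if y != x:
                    Pi[x][y] = Q[x][y] / rates[x]
        else:
            # Assumed convention for absorbing states, not covered by the formula.
            Pi[x][x] = 1.0
    return rates, Pi


def simulate_jump_chain(Q, x0, t_max, rng=random):
    """Hold in x for an Exp(q(x)) time, then jump according to row x of Pi."""
    rates, Pi = jump_chain(Q)
    t, x = 0.0, x0
    path = [(0.0, x0)]
    while rates[x] > 0:
        t += rng.expovariate(rates[x])
        if t > t_max:
            break
        x = rng.choices(range(len(Q)), weights=Pi[x])[0]
        path.append((t, x))
    return path


# Example generator (made up for illustration).
Q = [[-2.0, 2.0, 0.0],
     [1.0, -3.0, 2.0],
     [0.0, 4.0, -4.0]]
rates, Pi = jump_chain(Q)
print(rates)
print(Pi)
print(simulate_jump_chain(Q, x0=0, t_max=5.0))
```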
I am converting @lan's comment into an answer so that this question can be closed.
Do you mean the time when the jump occurs, or the increment in time between two jumps? In the case of the latter, it is definitely not a stopping time except when $n=0$, but the reason is kind of obvious (you're comparing apples to oranges, since the definition wants you to restrict to times measured relative to the start of the process but you are really measuring times relative to the arrival at the current state).
That being said, the actual time of the jump itself is a stopping time provided that you've defined your conventions consistently (i.e. all the inequalities involved are nonstrict).
Best Answer
\begin{align}
\sum_{i=0}^d{d\choose i}\left(\frac{\lambda}{\mu}\right)^{d+i}
&= \left(\frac{\lambda}{\mu}\right)^d \sum_{i=0}^d{d\choose i}\left(\frac{\lambda}{\mu}\right)^i\\
&= \left(\frac{\lambda}{\mu}\right)^d\left(1+ \frac{\lambda}{\mu}\right)^d\\
&= \left(\frac{\lambda}{\mu}+ \left(\frac{\lambda}{\mu}\right)^2\right)^d\\
&=1
\end{align}
if and only if $\frac{\lambda}{\mu} =\frac{\sqrt{5}-1}{2}$. Your proposed answer cannot, therefore, be correct.
You can also see this from the fact that if $Y_t$ is the number of particles in container $0$ at time $t$, then, from the symmetry of the problem, the stationary distribution $\rho_j$ of $Y_t$ must be given by
$$
\rho_j={d\choose j}\left(\frac{\mu}{\lambda}\right)^{d+j}.
$$
But since $Y_t=d-X_t$, we must also have
$$
\rho_j=\pi_{d-j}={d\choose d-j}\left(\frac{\lambda}{\mu}\right)^{2d-j}
$$
for all $j$, which will only be the case if $\lambda=\mu$.
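If you want to sanity-check both observations numerically, a small Python sketch along the following lines should do. Here `r` stands for $\lambda/\mu$, and the function names `total_mass`, `pi_proposed` and `rho_by_symmetry` are made up for this illustration.

```python
from math import comb, isclose, sqrt


def pi_proposed(i, r, d):
    """Proposed weight pi_i = C(d,i) * r^(d+i), with r = lambda/mu."""
    return comb(d, i) * r ** (d + i)


def total_mass(r, d):
    """Sum of the proposed weights over i = 0..d."""
    return sum(pi_proposed(i, r, d) for i in range(d + 1))


def rho_by_symmetry(j, r, d):
    """Same formula with lambda and mu swapped: C(d,j) * (1/r)^(d+j)."""
    return comb(d, j) * (1.0 / r) ** (d + j)


golden = (sqrt(5) - 1) / 2
d = 5

# The total mass equals (r + r^2)^d, which is 1 at r = (sqrt(5) - 1)/2.
print(total_mass(golden, d), total_mass(0.5, d), total_mass(1.0, d))

# rho_j and pi_{d-j} should agree for all j only when r = 1.
for r in (0.5, golden, 1.0):
    agree = all(isclose(rho_by_symmetry(j, r, d), pi_proposed(d - j, r, d))
                for j in range(d + 1))
    print(r, agree)
```

Note that at $\lambda=\mu$ the two formulas agree with each other but the weights sum to $2^d$, not $1$, so no value of $\lambda/\mu$ satisfies both checks.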