The principle of thinning is well-suited to this type of problem. Roughly, it says the following. Suppose you have a Poisson process with intensity $\lambda$, and at each arrival you independently flip a coin to determine whether the arrival is of type I or type II: it is type I with probability $p$ and type II with probability $1 - p$. Then this Poisson process can be decomposed into two independent Poisson processes describing the type I and type II arrivals. The former has intensity $p \lambda$, and the latter $(1 - p) \lambda$.
Applying this to your problem tells us that the arrivals of M104 buses follow a Poisson process of intensity $\frac{3}{5} \lambda$. The first arrival time of a Poisson process is exponentially distributed with rate equal to the process intensity, so the probability that you will have to wait more than $x$ hours for the first M104 bus is just $\mathbb{P}(\text{Expon}(3\lambda/5) > x) = e^{-3\lambda x/5}$.
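As a quick numerical illustration, the waiting-time probability can be evaluated as below. This is only a sketch: the function name `prob_wait_exceeds` and the total rate of 10 buses per hour are made up for the example, since the value of $\lambda$ is not given here; only the fraction $3/5$ comes from the problem.

```python
import math

def prob_wait_exceeds(x_hours, total_rate, p_type=3 / 5):
    """P(first type-I arrival takes longer than x_hours) in a thinned Poisson process.

    total_rate is the intensity of the full process (arrivals per hour);
    p_type is the probability that an arrival is of the type we wait for.
    """
    thinned_rate = p_type * total_rate          # intensity of the thinned process
    return math.exp(-thinned_rate * x_hours)    # exponential survival function

# Hypothetical numbers: 10 buses per hour in total, wait longer than 15 minutes.
print(prob_wait_exceeds(0.25, total_rate=10.0))
```

With these assumed numbers the thinned rate is $6$ per hour, so the result is $e^{-1.5}$.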
For similar problems you may also find the principle of superposition for Poisson processes useful. It states the following: given two independent Poisson processes with intensities $\lambda$ (for type I arrivals) and $\mu$ (for type II arrivals), the combined arrival process is a Poisson process of intensity $\lambda + \mu$. This is effectively the converse of the principle of thinning.
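One consequence of superposition is easy to check numerically: the number of combined arrivals in a unit interval should have a Poisson distribution with mean $\lambda + \mu$. The sketch below verifies this by direct convolution of the two probability mass functions. The rates `lam = 1.5` and `mu = 2.5` are arbitrary illustrative values, and the check covers only the count in one interval, not the full process-level statement.

```python
import math

def poisson_pmf(k, mean):
    """P(N = k) for N ~ Poisson(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def pmf_of_sum(n, mean_a, mean_b):
    """P(N_a + N_b = n) for independent Poisson counts, by direct convolution."""
    return sum(poisson_pmf(k, mean_a) * poisson_pmf(n - k, mean_b)
               for k in range(n + 1))

lam, mu = 1.5, 2.5  # hypothetical intensities, chosen only for illustration

for n in range(6):
    combined = pmf_of_sum(n, lam, mu)
    direct = poisson_pmf(n, lam + mu)
    print(n, combined, direct)  # the two columns should agree up to rounding
```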
The number of patients treated is cumulative, so the outcome of interest here is that the doctor has treated three additional patients between $8$ and $10$ a.m.
One of the properties of the Poisson process is that, at any given point in time, the occurrence of future events is independent of any past events. So all you have to do is compute the probability that the number of arrivals $X(t)$ in the next $t = 2$ hours equals $3$, when the rate is $\lambda = 2$ arrivals per hour. This is given by $$\Pr[X(2) = 3] = e^{-\lambda t} \frac{(\lambda t)^3}{3!} = \frac{32}{3e^{4}} \approx 0.195367.$$
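The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `poisson_pmf` is chosen here for illustration.

```python
import math

def poisson_pmf(k, rate, t):
    """P(X(t) = k) for a Poisson process with the given rate per unit time."""
    mean = rate * t
    return math.exp(-mean) * mean**k / math.factorial(k)

# Three arrivals in the next two hours at two arrivals per hour.
p = poisson_pmf(3, rate=2, t=2)
print(round(p, 6))  # 0.195367
```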
But suppose we don't do it this way, and try to model the problem more along the lines of what you were thinking. Let $A$ be the event that exactly $6$ arrivals have occurred by time $t = 2$ hours, and let $B$ be the event that exactly $9$ arrivals have occurred by time $t = 4$ hours. Then $$\Pr[B \mid A] = \frac{\Pr[B \cap A]}{\Pr[A]}.$$ The denominator is easy; with $\lambda t = 2(2) = 4$, we have $$\Pr[A] = e^{-4} \frac{4^6}{6!} \approx 0.104196.$$ This is the same as what you computed. Where we differ is the numerator; in particular, $$\Pr[B \cap A] \ne \Pr[B].$$ The reason is that the event $B \cap A$ excludes outcomes in which, say, only $5$ arrivals occurred in the first two hours and $4$ occurred in the next two hours, whereas $\Pr[B]$ counts every outcome with a total of $9$ arrivals in the first four hours, irrespective of how they are split between the first and second halves of the four-hour interval. That's why your probability exceeds $1$: your numerator is too large because $\Pr[B]$ includes outcomes you should be excluding.
The intuitive way to compute $\Pr[B \cap A]$ is to reason that $\Pr[B \cap A] = \Pr[B \mid A]\Pr[A]$ and $\Pr[B \mid A] = \Pr[X(2)=3]$, which is of course circular because what we were after from the beginning was $\Pr[B \mid A]$. So is there some other way? Well, yes, but it's not simple. The idea is to consider how the $9$ arrivals in the first four hours are distributed between the first two hours and the last two hours. In other words, we think of how many ways they can be split between the two intervals, and count only the split where $6$ happen in the first half and $3$ in the second.
Because the intervals have equal length, given that an arrival occurred in the first four hours, the probability that it occurred in the first half equals the probability that it occurred in the second. So the probability that exactly $Y = 6$ of the $9$ events occurred in the first half is a binomial probability with $n = 9$ and $p = 1/2$: $$\Pr[Y = 6] = \binom{9}{6} (1/2)^6 (1 - 1/2)^3 = \frac{21}{128} \approx 0.164063.$$ We then multiply this by the Poisson probability of seeing $9$ events in the first place, $\Pr[B] = e^{-8} \frac{8^9}{9!} \approx 0.124077$ (here $\lambda t = 2(4) = 8$), i.e. $$\Pr[B \cap A] = \Pr[Y = 6]\Pr[B] \approx 0.164063(0.124077) \approx 0.0203564.$$ Finishing the calculation, we find $$\Pr[B \mid A] = \frac{0.0203564}{0.104196} \approx 0.195367,$$ which matches our original calculation at the beginning.
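The whole chain of numbers can be checked in a few lines. The sketch below recomputes $\Pr[A]$, $\Pr[B]$, the binomial split and the resulting conditional probability, and also shows that the naive ratio $\Pr[B]/\Pr[A]$ exceeds $1$. The variable names are chosen here for illustration, and `math.comb` requires Python 3.8 or later.

```python
import math

def poisson_pmf(k, mean):
    """P(N = k) for N ~ Poisson(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)

rate = 2                                   # arrivals per hour
p_A = poisson_pmf(6, rate * 2)             # exactly 6 arrivals by t = 2
p_B = poisson_pmf(9, rate * 4)             # exactly 9 arrivals by t = 4
p_split = math.comb(9, 6) * 0.5**9         # 6 of the 9 fall in the first half
p_joint = p_split * p_B                    # Pr[B and A]

cond_via_split = p_joint / p_A             # Pr[B | A] the long way
cond_direct = poisson_pmf(3, rate * 2)     # Pr[X(2) = 3] the short way
naive_ratio = p_B / p_A                    # the incorrect numerator Pr[B]

print(cond_via_split, cond_direct)         # these two should agree
print(naive_ratio)                         # greater than 1, so not a probability
```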
What the equation $\Pr[B \cap A] = \Pr[Y = 6]\Pr[B]$ suggests is that in fact, $$\Pr[Y = 6] = \frac{\Pr[B \cap A]}{\Pr[B]} = \Pr[A \mid B].$$ This is the conditional probability that, given that $9$ events were observed in the first $4$ hours, $6$ of them happened in the first two hours. And $Y$ is indeed binomially distributed, but we did not formally prove this, resorting only to intuition.
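For completeness, here is a sketch of why the split is binomial, using only the independence of arrivals over disjoint time intervals. The notation $N(t)$ for the number of arrivals by time $t$ is introduced here for convenience. Since $A \cap B$ means $6$ arrivals in the first two hours and $3$ in the next two, and those counts are independent Poisson variables with mean $4$ each,

$$\begin{aligned}
\Pr[A \mid B] &= \frac{\Pr[N(2) = 6]\,\Pr[N(4) - N(2) = 3]}{\Pr[N(4) = 9]} \\
&= \frac{e^{-4}\frac{4^6}{6!} \cdot e^{-4}\frac{4^3}{3!}}{e^{-8}\frac{8^9}{9!}} \\
&= \frac{9!}{6!\,3!} \cdot \frac{4^9}{8^9} = \binom{9}{6}\left(\frac{1}{2}\right)^{9} = \frac{21}{128}.
\end{aligned}$$

The same cancellation works with any $k$ in place of $6$ (and $9 - k$ in place of $3$), which gives the full $\text{Binomial}(9, 1/2)$ distribution for $Y$.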
This is a preparatory exercise before approaching hypothesis testing. The reason they chose $\lambda=1$ instead of $\lambda \leq 1$ is that the null hypothesis (the device is satisfactory) plays the main role in the system: it is assumed true until you reject it on the basis of your observations. Thus in this situation you choose the most suitable value of the parameter to assign to $H_0$, the null hypothesis. In other words, the following two systems are equivalent:
$$\begin{cases} H_0: & \lambda\leq1 \\ H_1: & \lambda>1 \end{cases}$$
$$\begin{cases} H_0: & \lambda=1 \\ H_1: & \lambda>1 \end{cases}$$
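One way to see why the boundary value $\lambda = 1$ is the natural choice is that, for a test that rejects $H_0$ when the observed count is large, the rejection probability increases with $\lambda$, so over all $\lambda \leq 1$ it is largest at $\lambda = 1$. The sketch below illustrates this under assumptions that are not stated in the exercise: the observation is a single Poisson count $X$, the test rejects when $X \geq c$, and the critical value `c = 4` is arbitrary.

```python
import math

def rejection_prob(lam, c):
    """P(X >= c) for X ~ Poisson(lam): chance of rejecting H0 at threshold c."""
    below = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(c))
    return 1.0 - below

c = 4                                  # hypothetical critical value
lams = [0.2, 0.4, 0.6, 0.8, 1.0]       # values of lambda allowed under H0
probs = [rejection_prob(lam, c) for lam in lams]

for lam, p in zip(lams, probs):
    print(lam, p)                      # rejection probability grows with lambda
```

Under these assumptions, controlling the error probability at $\lambda = 1$ automatically controls it for every smaller $\lambda$.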