The Bayesian test for your question is based on the integrated (rather than maximised) likelihood. So for the Poisson case the hypotheses are:
$$\begin{array}{c|c}
H_{1}:\lambda_{1}=\lambda_{2} & H_{2}:\lambda_{1}\neq\lambda_{2}
\end{array}
$$
Now neither hypothesis says what the parameters are, so the actual values are nuisance parameters to be integrated out with respect to their prior probabilities.
$$P(H_{1}|D,I)=P(H_{1}|I)\frac{P(D|H_{1},I)}{P(D|I)}$$
The model likelihood is given by:
$$P(D|H_{1},I)=\int_{0}^{\infty} P(D,\lambda|H_{1},I)d\lambda=\int_{0}^{\infty} P(\lambda|H_{1},I)P(D|\lambda,H_{1},I)\,d\lambda$$
$$=\int_{0}^{\infty} P(\lambda|H_{1},I)\frac{\lambda^{x_1+x_2}\exp(-2\lambda)}{\Gamma(x_1+1)\Gamma(x_2+1)}\,d\lambda$$
where $P(\lambda|H_{1},I)$ is the prior for $\lambda$. A mathematically convenient choice is the gamma prior, which gives:
$$P(D|H_{1},I)=\int_{0}^{\infty} \frac{\beta^{\alpha}}{\Gamma(\alpha)}\lambda^{\alpha-1}\exp(-\beta \lambda)\frac{\lambda^{x_1+x_2}\exp(-2\lambda)}{\Gamma(x_1+1)\Gamma(x_2+1)}\,d\lambda$$
$$=\frac{\beta^{\alpha}\Gamma(x_1+x_2+\alpha)}{(2+\beta)^{x_1+x_2+\alpha}\Gamma(\alpha)\Gamma(x_1+1)\Gamma(x_2+1)}$$
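As a sanity check, this closed form can be compared against direct numerical integration of the integrand above. The following is a minimal Python sketch; the counts $x_1=3$, $x_2=5$ and hyper-parameters $\alpha=2$, $\beta=1$ are arbitrary illustrative choices:

```python
import math

def h1_marginal_closed_form(x1, x2, alpha, beta):
    # beta^alpha * Gamma(x1+x2+alpha) /
    #   ((2+beta)^(x1+x2+alpha) * Gamma(alpha) * Gamma(x1+1) * Gamma(x2+1)),
    # computed on the log scale for numerical stability.
    log_val = (alpha * math.log(beta)
               + math.lgamma(x1 + x2 + alpha)
               - (x1 + x2 + alpha) * math.log(2 + beta)
               - math.lgamma(alpha)
               - math.lgamma(x1 + 1) - math.lgamma(x2 + 1))
    return math.exp(log_val)

def h1_marginal_numeric(x1, x2, alpha, beta, upper=60.0, steps=60_000):
    # Brute-force Riemann sum of the Gamma(alpha, beta) prior times the
    # joint Poisson likelihood; 'upper' truncates the integral where the
    # integrand is negligible.
    h = upper / steps
    total = 0.0
    for i in range(1, steps):
        lam = i * h
        log_f = (alpha * math.log(beta) - math.lgamma(alpha)
                 + (alpha - 1) * math.log(lam) - beta * lam
                 + (x1 + x2) * math.log(lam) - 2.0 * lam
                 - math.lgamma(x1 + 1) - math.lgamma(x2 + 1))
        total += math.exp(log_f)
    return total * h

print(h1_marginal_closed_form(3, 5, 2.0, 1.0))
print(h1_marginal_numeric(3, 5, 2.0, 1.0))
```

The two values agree to well within the discretisation error of the Riemann sum.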
And for the alternative hypothesis we have:
$$P(D|H_{2},I)=\frac{\beta_{1}^{\alpha_{1}}\beta_{2}^{\alpha_{2}}\Gamma(x_1+\alpha_{1})\Gamma(x_2+\alpha_{2})}{(1+\beta_{1})^{x_1+\alpha_{1}}(1+\beta_{2})^{x_2+\alpha_{2}}\Gamma(\alpha_{1})\Gamma(\alpha_{2})\Gamma(x_1+1)\Gamma(x_2+1)}$$
Now if we assume that all hyper-parameters are equal (not an unreasonable assumption, given that you are testing for equality), then we have an integrated likelihood ratio of:
$$\frac{P(D|H_{1},I)}{P(D|H_{2},I)}=
\frac{(1+\beta)^{x_1+x_2+2\alpha}\Gamma(x_1+x_2+\alpha)\Gamma(\alpha)}
{(2+\beta)^{x_1+x_2+\alpha}\beta^{\alpha}\Gamma(x_1+\alpha)\Gamma(x_2+\alpha)}
$$
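To see how this ratio behaves in practice, it can be evaluated on the log scale for a few hyper-parameter choices; a minimal Python sketch, with hypothetical counts $x_1=4$, $x_2=10$:

```python
import math

def poisson_bayes_factor(x1, x2, alpha, beta):
    # Integrated likelihood ratio P(D|H1)/P(D|H2) with a common
    # Gamma(alpha, beta) prior on the rate(s); computed on the log
    # scale for numerical stability.
    log_bf = ((x1 + x2 + 2 * alpha) * math.log(1 + beta)
              + math.lgamma(x1 + x2 + alpha) + math.lgamma(alpha)
              - (x1 + x2 + alpha) * math.log(2 + beta)
              - alpha * math.log(beta)
              - math.lgamma(x1 + alpha) - math.lgamma(x2 + alpha))
    return math.exp(log_bf)

# Hypothetical counts; varying the hyper-parameters shows how sensitive
# the conclusion is to the prior.
for alpha, beta in [(0.5, 0.5), (1.0, 1.0), (2.0, 0.5)]:
    print(alpha, beta, poisson_bayes_factor(4, 10, alpha, beta))
```

Note that the ratio is symmetric in $x_1$ and $x_2$, and that equal observed counts support $H_1$ more strongly than unequal counts with the same total.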
From this you can see that the prior information is still very important. We cannot set $\alpha$ or $\beta$ equal to zero (the Jeffreys prior), or else $H_{1}$ will always be favoured regardless of the data. One way to get values for the hyper-parameters is to specify prior estimates of $E[\lambda]$ and $E[\log(\lambda)]$ and solve for the parameters; these estimates cannot be based on $x_1$ or $x_2$, but they can be based on any other relevant information. You can also put in a few different (reasonable) values and see what difference this makes to the conclusion.

The numerical value of this statistic tells you how much the data and your prior information about the rates under each hypothesis support the hypothesis of equal rates. This also explains why the likelihood ratio test is not always reliable: it essentially ignores prior information, which is usually equivalent to specifying the Jeffreys prior.

Note that you could instead specify upper and lower limits $L$ and $U$ for the rate parameters (usually not too hard to do with some common-sense thinking about the real-world problem). Then you would use a prior of the form:
$$p(\lambda|I)=\frac{I(L<\lambda<U)}{\log\left(\frac{U}{L}\right)\lambda}$$
And you would be left with an equation similar to the one above, but with incomplete rather than complete gamma functions.
For the binomial case things are much simpler, because the non-informative prior (uniform) is proper. The procedure is similar to that above, and the integrated likelihood for $H_{1}:p_{1}=p_{2}$ is given by:
$$P(D|H_{1},I)={n_1 \choose x_1}{n_2 \choose x_2}\int_{0}^{1}p^{x_1+x_2}(1-p)^{n_1+n_2-x_1-x_2}\,dp$$
$$={n_1 \choose x_1}{n_2 \choose x_2}B(x_1+x_2+1,n_1+n_2-x_1-x_2+1)$$
And similarly for $H_{2}:p_{1}\neq p_{2}$:
$$P(D|H_{2},I)={n_1 \choose x_1}{n_2 \choose x_2}\int_{0}^{1}\int_{0}^{1}p_{1}^{x_1}p_{2}^{x_2}(1-p_{1})^{n_1-x_1}(1-p_{2})^{n_{2}-x_{2}}\,dp_{1}\,dp_{2}$$
$$={n_1 \choose x_1}{n_2 \choose x_2}B(x_1+1,n_1-x_1+1)B(x_2+1,n_2-x_2+1)$$
And so taking ratios gives:
$$\frac{P(D|H_{1},I)}{P(D|H_{2},I)}=
\frac{B(x_1+x_2+1,n_1+n_2-x_1-x_2+1)}
{B(x_1+1,n_1-x_1+1)B(x_2+1,n_2-x_2+1)}
$$
$$=\frac{{x_1+x_2 \choose x_1}{n_1+n_2-x_1-x_2 \choose n_1-x_1}(n_1+1)(n_2+1)}{{n_1+n_2 \choose n_1}(n_1+n_2+1)}$$
And the combination of binomial coefficients can be evaluated using the hypergeometric($r$,$n$,$R$,$N$) distribution, where $N=n_1+n_2$, $R=x_1+x_2$, $n=n_1$, and $r=x_1$.
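The equality of the Beta-function ratio and the binomial-coefficient form can be checked numerically; in the Python sketch below (counts chosen arbitrarily), `bf_choose_form` also makes the hypergeometric-pmf connection explicit:

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    # log B(a, b) via log-gamma, to avoid overflow for larger counts.
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bf_beta_form(x1, n1, x2, n2):
    # The ratio of Beta functions from the integrated likelihoods.
    return exp(log_beta(x1 + x2 + 1, n1 + n2 - x1 - x2 + 1)
               - log_beta(x1 + 1, n1 - x1 + 1)
               - log_beta(x2 + 1, n2 - x2 + 1))

def bf_choose_form(x1, n1, x2, n2):
    # The same quantity via binomial coefficients; the first factor is the
    # hypergeometric pmf at r = x1 with N = n1+n2, R = x1+x2, n = n1.
    hyper_pmf = (comb(x1 + x2, x1) * comb(n1 + n2 - x1 - x2, n1 - x1)
                 / comb(n1 + n2, n1))
    return hyper_pmf * (n1 + 1) * (n2 + 1) / (n1 + n2 + 1)

print(bf_beta_form(3, 10, 7, 12), bf_choose_form(3, 10, 7, 12))
```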
And this tells you how much the data support the hypothesis of equal probabilities, given that you don't have much information about which particular value this may be.
For the two-sample normal model with common variance (testing $H_0:\mu_1=\mu_2$ against $H_a:\mu_1\neq\mu_2$), the MLEs under $H_0$ are:
$$\hat{\mu}=\frac{n_1\bar{X}+n_2\bar{Y}}{n}, \qquad \hat{\sigma}^2_0=\frac{1}{n}\left( \sum_{i=1}^{n_1}{(X_i-\hat{\mu})^2} + \sum_{i=1}^{n_2}{(Y_i-\hat{\mu})^2} \right)$$
where $n=n_1+n_2$. Under $H_a$ we get:
$$\hat{\mu}_1=\bar{X},~\hat{\mu}_2=\bar{Y},\qquad\hat{\sigma}^2=\frac{1}{n}\left( \sum_{i=1}^{n_1}{(X_i-\bar{X})^2} + \sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2} \right)$$
This is almost identical to your results.
Now, here is the trick (you can verify it easily):
$$\sum_{i=1}^{n_1}{(X_i-\hat{\mu})^2}=\sum_{i=1}^{n_1}{(X_i-\bar{X}+\bar{X}-\hat{\mu})^2}=\sum_{i=1}^{n_1}{(X_i-\bar{X})^2}+n_1(\bar{X}-\hat{\mu})^2$$
The same applies for the $Y_i$'s:
$$\sum_{i=1}^{n_2}{(Y_i-\hat{\mu})^2}=\sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2}+n_2(\bar{Y}-\hat{\mu})^2$$
Using $\hat{\mu}=\frac{n_1\bar{X}+n_2\bar{Y}}{n}$, we get that
$$\bar{X}-\hat{\mu}=\frac{n\bar{X}-n_1\bar{X}-n_2\bar{Y}}{n}=\frac{n_2}{n}(\bar{X}-\bar{Y})$$ and similarly $$\bar{Y}-\hat{\mu}=-\frac{n_1}{n}(\bar{X}-\bar{Y})$$
$$\hat{\sigma}_0^2=\frac{1}{n}\left( \sum_{i=1}^{n_1}{(X_i-\hat{\mu})^2} + \sum_{i=1}^{n_2}{(Y_i-\hat{\mu})^2} \right)=\frac{1}{n}\left( \sum_{i=1}^{n_1}{(X_i-\bar{X})^2}+ \sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2} + n_1(\bar{X}-\hat{\mu})^2 + n_2(\bar{Y}-\hat{\mu})^2 \right)=\frac{1}{n}\left( \sum_{i=1}^{n_1}{(X_i-\bar{X})^2}+ \sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2} + \frac{n_1n_2^2}{n^2}(\bar{X}-\bar{Y})^2 + \frac{n_1^2n_2}{n^2}(\bar{X}-\bar{Y})^2 \right)=\frac{1}{n}\left( \sum_{i=1}^{n_1}{(X_i-\bar{X})^2}+ \sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2} + \frac{n_1n_2}{n}(\bar{X}-\bar{Y})^2 \right)$$
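The decomposition can be verified numerically on made-up data; a quick Python check:

```python
# Arbitrary made-up samples.
X = [1.2, 3.4, 2.2, 5.0, 0.7]
Y = [4.1, 2.8, 6.3]
n1, n2 = len(X), len(Y)
n = n1 + n2
xbar, ybar = sum(X) / n1, sum(Y) / n2
mu = (n1 * xbar + n2 * ybar) / n         # pooled mean, as in the text

# Left-hand side: sums of squares about the pooled mean.
lhs = (sum((x - mu) ** 2 for x in X)
       + sum((y - mu) ** 2 for y in Y))

# Right-hand side: within-group sums of squares plus the between-group term.
rhs = (sum((x - xbar) ** 2 for x in X)
       + sum((y - ybar) ** 2 for y in Y)
       + n1 * n2 / n * (xbar - ybar) ** 2)

print(lhs, rhs)  # the two sides agree to rounding error
```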
Now, the likelihood ratio is:
$$\lambda=\left(\frac{\hat{\sigma}^2}{\hat{\sigma}^2_0}\right)^{\frac{n}{2}}\le k,$$ which is equivalent to $$\left(\frac{\hat{\sigma}^2_0}{\hat{\sigma}^2}\right)^{-\frac{n}{2}}\le k,$$ and this is equivalent to $$\frac{\hat{\sigma}^2_0}{\hat{\sigma}^2} \ge k'.$$
The last fraction is:
$$\frac{\hat{\sigma}^2_0}{\hat{\sigma}^2}=\frac{\frac{1}{n}\left( \sum_{i=1}^{n_1}{(X_i-\bar{X})^2}+ \sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2} + \frac{n_1n_2}{n}(\bar{X}-\bar{Y})^2 \right)}{\frac{1}{n}\left( \sum_{i=1}^{n_1}{(X_i-\bar{X})^2} + \sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2} \right)}$$
$$=1+\frac{n_1n_2}{n}\frac{(\bar{X}-\bar{Y})^2}{\sum_{i=1}^{n_1}{(X_i-\bar{X})^2} + \sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2}}.$$
Subtracting $1$ from both sides and dividing by $\frac{n_1n_2}{n}$, the rejection region becomes
$$\frac{(\bar{X}-\bar{Y})^2}{\sum_{i=1}^{n_1}{(X_i-\bar{X})^2} + \sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2}} \ge k''$$
This is equivalent to rejecting if
$$ \frac{|\bar{X}-\bar{Y}|}{\sqrt{\sum_{i=1}^{n_1}{(X_i-\bar{X})^2} + \sum_{i=1}^{n_2}{(Y_i-\bar{Y})^2}}} \ge k'''$$
which has the form of a $t$-test. $\blacksquare$
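To confirm that this really is the usual pooled two-sample $t$-test, one can check numerically that $t^2$ is a strictly increasing function of the statistic derived above, namely $t^2=(n-2)\frac{n_1n_2}{n}S$, where $S$ is the ratio in the rejection region. A Python sketch on arbitrary data:

```python
import math

# Arbitrary made-up samples.
X = [1.2, 3.4, 2.2, 5.0, 0.7]
Y = [4.1, 2.8, 6.3]
n1, n2 = len(X), len(Y)
n = n1 + n2
xbar, ybar = sum(X) / n1, sum(Y) / n2

# Pooled within-group sum of squares.
ss = (sum((x - xbar) ** 2 for x in X)
      + sum((y - ybar) ** 2 for y in Y))

# The LRT statistic derived above.
S = (xbar - ybar) ** 2 / ss

# The usual pooled two-sample t statistic.
sp2 = ss / (n - 2)                       # pooled variance estimate
t = (xbar - ybar) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# t^2 = (n - 2) * (n1 * n2 / n) * S, so rejecting for large S is the same
# as rejecting for large |t|.
print(S, t)
```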
It is implied that the $X_i$ are independent of the $Y_j.$ Therefore the usual maximum likelihood equations apply to the $X_i$ and the $Y_j$ separately, with solutions
$$\begin{cases} \hat\lambda \hat \alpha\ n_1 &= \sum_{i=1}^{n_1}X_i &=x \\ \hat\lambda \hat \alpha^2n_2 &=\sum_{j=1}^{n_2}Y_j &=y \end{cases}$$
yielding
$$\hat\alpha = \frac{y/n_2}{x/n_1}\tag{*}$$
provided $x \ne 0;$ that is, assuming at least one $X$ event was observed. Note that $\lambda$ needn't be known and that the equation for $\hat\alpha$ really reduces to a linear one, not a quadratic one.
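A quick numerical illustration of $(*)$, with hypothetical totals:

```python
def alpha_hat(x_total, n1, y_total, n2):
    # MLE of alpha from equation (*): the ratio of the two sample means.
    # Undefined when no X events were observed.
    if x_total == 0:
        raise ValueError("alpha_hat is undefined when x = 0")
    return (y_total / n2) / (x_total / n1)

# Hypothetical totals: 24 X-observations summing to 720, 9 Y-observations
# summing to 850.
print(alpha_hat(720, 24, 850, 9))
```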
Simulation bears out the correctness of this solution. Since MLE is an asymptotic procedure, we don't want to test the results for small $n_1,n_2.$ This example of applying $(*)$ to 100,000 independent datasets uses $n_1=24, n_2=9$ with $\alpha=\pi$ (plotted as a gray vertical line) and $\lambda=10.$ The average estimate is plotted as a red vertical line: that the two vertical lines are nearly coincident indicates any bias is low.
This is the R code used to produce the figure. NB In this simulation, no individual estimate $\hat \alpha$ was undefined. When the expectation of $x$ (namely, $\lambda \alpha n_1$) is small, the values of $x$ in some simulations can be zero.
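As the R listing itself does not appear above, here is a minimal Python sketch of the same kind of simulation (the Knuth Poisson sampler and the simulation count are my own choices; $n_1=24$, $n_2=9$, $\alpha=\pi$, $\lambda=10$ follow the text):

```python
import math
import random

def rpois(mean, rng):
    # Knuth's Poisson sampler; adequate for the moderate means used here.
    L = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate(alpha, lam, n1, n2, n_sims, seed=1):
    # Monte Carlo check of alpha_hat = (y/n2)/(x/n1), with
    # X_i ~ Poisson(lam*alpha) and Y_j ~ Poisson(lam*alpha^2).
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        x = sum(rpois(lam * alpha, rng) for _ in range(n1))
        y = sum(rpois(lam * alpha ** 2, rng) for _ in range(n2))
        if x > 0:  # alpha_hat is undefined when no X events occur
            estimates.append((y / n2) / (x / n1))
    return sum(estimates) / len(estimates)

# Parameters from the text: n1 = 24, n2 = 9, alpha = pi, lambda = 10.
print(simulate(math.pi, 10.0, 24, 9, 1000))
```

The average estimate lands close to $\alpha=\pi$, consistent with the low bias observed in the figure.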