I'm trying to deduce from samples of observations from two finite sets of random variables $X_{1}, …, X_{n}$ and $Y_{1}, …, Y_{m}$ that the expected values of the average of those random variables are not the same. Let's denote $\frac{1}{n}\sum\limits_{i=1}^{n}{X_{i}}$ and $\frac{1}{m}\sum\limits_{i=1}^{m}{Y_{i}}$, $\bar{X}$ and $\bar{Y}$ respectively. So I'm trying to deduce from observations $x_{1}, …, x_{n}$ and $y_{1}, …, y_{m}$ that $E[\bar{X}] \neq E[\bar{Y}]$.

So my null hypothesis, $H_{0}$, would be $E[\bar{X}] = E[\bar{Y}]$ and the alternative one, $H_{1}$, would be $E[\bar{X}] \neq E[\bar{Y}]$. I thought of using Hoeffding's inequality this way:

If my random variables are all bounded and between $0$ and $1$ (which I have in practice), we then have: $P(|\bar{X}-\bar{Y}-(E[\bar{X}]-E[\bar{Y}])|\geq \varepsilon) \leq e^{-\frac{2\varepsilon^{2}}{\frac{1}{n}+\frac{1}{m}}}$

Under the null hypothesis we have then : $P(|\bar{X}-\bar{Y})|\geq \varepsilon) \leq 2e^{-\frac{2\varepsilon^{2}}{\frac{1}{n}+\frac{1}{m}}}$, we can fix the type $I$ error rate to a maximum of $\delta$ if we take $\varepsilon_{\delta} = \sqrt{\frac{1}{2(\frac{1}{n}+\frac{1}{m})}ln(\frac{2}{\delta})}$.

Now, suppose we get observations such as $|\bar{X}-\bar{Y}| < \varepsilon_{\delta}$, there's a probability of $P(|\bar{X}-\bar{Y}| < \varepsilon_{\delta}) = 1 – P(|\bar{X}-\bar{Y}| \geq \varepsilon_{\delta})$ under $H_{0}$, so our type $II$ error is at most $1 – \delta$.

I hope you can help me understand if what I'm doing is correct or not. Thank you in advance.

## Best Answer

Your $\epsilon$

increaseswith $m$ and $n$ for fixed $\delta$, so the test becomes less powerful with more data. There must either be an error in the calculation or in the idea.Starting with $$P(|\bar X-\bar Y|>\epsilon)\leq 2\exp\left( \frac{-2\epsilon^2}{\frac{1}{n}+\frac{1}{m}}\right)$$ we're going to write $\alpha$ for the right-hand side, because we'll need $\delta$ for more standard purposes late. I'm also going to write $\tilde n$ for $1/(1/n)+(1/m)$ to cut down on the opportunities for getting the reciprocals wrong (and cut down on the typesetting). We want to guarantee the LHS is less than $\delta$, so we want $$\alpha \leq 2\exp\left( -2\tilde n\epsilon^2\right)$$ Rearranging: $$\frac{\alpha}{2}\leq exp\left( -2\tilde n\epsilon^2\right)$$ $$\log \frac{\alpha}{2}\leq -2\tilde n\epsilon^2$$ $$\frac{1}{2\tilde n}\log \frac{\alpha}{2}\leq -\epsilon^2$$ $$-\frac{1}{2\tilde n}\log \frac{\alpha}{2}\geq \epsilon^2$$ $$\sqrt{-\frac{1}{2\tilde n}\log \frac{\alpha}{2}}\geq \epsilon$$ $$\sqrt{\frac{1}{2\tilde n}\log \frac{2}{\alpha}}\geq \epsilon$$ which is like your bound except it decreases with increasing sample size.

Now, the other thing that's wrong is that you haven't specified a distance of departure from the null in getting your power -- the power must decrease to $\delta$ as $E[X]-E[Y]\to 0$, so you need to specify what $|E[X]-E[Y]|$ is under the alternative for which you want the specified power. Without loss of generality, suppose $X$ has the larger mean and write $\delta=E[X]-E[Y]$. Also write $Z$ for $X-\delta$, so that the means of $Z$ and $Y$ are the same. Now, $\bar Z-\bar Y$ is controlled by Hoeffding's Theorem, since $Z$ is still bounded, and if $$P[|\bar X-\bar Y| >\delta/2]\leq \alpha$$ under the null and $$P[|\bar Z-\bar Y| >\delta/2]\leq \alpha$$ under the alternative, then you have a useful test, since at most one of $|\bar X-\bar Y|>\delta/2$ and $|\bar Z-\bar Y|>\delta/2$ can be true