Bounding probabilities of a random variable with Markov, Chebyshev and Hoeffding

Tags: probability, probability-distributions, upper-lower-bounds

I am trying to get a better understanding of the Markov, Chebyshev and Hoeffding bounds, so I am working through a problem to build that understanding.

In the problem, 15 assignments are graded on a 0-100 scale. It is assumed that each assignment is an independent sample of the author's knowledge of the material and that all scores are drawn from the same distribution. $X_{1}, \ldots, X_{15}$ denote the scores and $\hat{Z}=\frac{1}{15} \sum_{i=1}^{15} X_{i}$ their average. $p$ denotes the unknown expected score, so that $\mathbb{E}\left[X_{i}\right]=p$ for all $i$.

I am to find the maximal value $z$ such that the probability of observing $\hat{Z} \leq z$ when $p=60$ is at most $\delta=0.05$, for each of the three bounds.

EDIT: I've gotten an answer to the Markov part. The Chebyshev and Hoeffding parts are still partly unanswered.

For the Markov bound I've used the hint to work with $\hat{Q} = 100 - \hat{Z}$, for which $\mathbb{E}[\hat{Q}] = 40$:
$P(\hat{Z} \geq z) \leq z = \frac{\frac{1}{15} \cdot 40}{0.05} = 53.3$

Chebyshev
For Chebyshev it is suggested that you can use the fact that for a random variable $X \in[a, b]$ and a random variable $Y \in\{a, b\}$ with $\mathbb{E}[X]=\mathbb{E}[Y]$ we have $\operatorname{Var}[X] \leq \operatorname{Var}[Y]$.
It is also suggested to first determine what the values of $\mathbb{P}(Y=a)$ and $\mathbb{P}(Y=b)$ must be in order to get the right expectation, and then to obtain a bound on the variance.
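The first half of the hint can be sketched numerically. The snippet below is only an illustration, assuming $a = 0$ and $b = 100$ (the score range) and a target expectation of 60; the function name `two_point_distribution` is made up for this example. It solves for $\mathbb{P}(Y=b)$ from the expectation and then computes $\operatorname{Var}[Y]$, which by the hint is an upper bound on $\operatorname{Var}[X_i]$.

```python
def two_point_distribution(a, b, mean):
    """Return (P(Y=a), P(Y=b), Var[Y]) for Y in {a, b} with E[Y] = mean."""
    if not a <= mean <= b or a == b:
        raise ValueError("need a < b and a <= mean <= b")
    # E[Y] = a * (1 - q) + b * q  =>  q = (mean - a) / (b - a)
    p_b = (mean - a) / (b - a)
    p_a = 1.0 - p_b
    # Var[Y] = E[Y^2] - E[Y]^2
    variance = p_a * a**2 + p_b * b**2 - mean**2
    return p_a, p_b, variance


# Assumed values from the problem statement: scores in [0, 100], E[X_i] = 60.
p_a, p_b, var_y = two_point_distribution(0, 100, 60)
print(p_a, p_b, var_y)  # roughly 0.4, 0.6 and 2400
```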

However, I am not sure how $\mathbb{P}(Y=a)$ and $\mathbb{P}(Y=b)$ relate to the variance in the Chebyshev inequality.
Could you perhaps point me in the right direction on how to find $a$ and $b$? Thanks a lot.

I am trying to understand how the Chebyshev bound can be calculated once I've found the values for $P(Y = a)$ and $P(Y = b)$. More specifically, I have trouble understanding how $P(\hat{Z} \geq z)$ relates to $Y$.
In the comments it was suggested to use
$V\left( \frac{1}{15} \sum X_{i}\right)=\frac{1}{15^{2}} V\left(\sum X_{i}\right)=\frac{1}{15} V\left(X_{i}\right)$ for Chebyshev.

To rewrite $P(\hat{Z} \geq z)$ in a form that Chebyshev's inequality applies to,
I've subtracted the expectation from both sides of the inequality inside the probability.

$P(\hat{Z} \geq z) =
P\left(\frac{1}{15} \sum_{i=1}^{15} X_{i} \geq z\right) =
P\left(\frac{1}{15} \sum_{i=1}^{15} X_{i} - E[\hat{Z}] \geq z - E[\hat{Z}]\right) =$

$1 - P\left(\frac{1}{15} \sum_{i=1}^{15} X_{i} - E[\hat{Z}] \leq z - E[\hat{Z}]\right) \leq$

$1 - P\left(\left|\frac{1}{15} \sum_{i=1}^{15} X_{i} - E\left[\frac{1}{15} \sum_{i=1}^{15} X_{i}\right]\right| \leq z - E\left[\frac{1}{15} \sum_{i=1}^{15} X_{i}\right]\right) \leq$

$1 - \frac{\operatorname{Var}\left(\frac{1}{15} \sum_{i=1}^{15} X_{i}\right)}{\left(z - E\left[\frac{1}{15} \sum_{i=1}^{15} X_{i}\right]\right)^2} \leq 1 - \frac{\operatorname{Var}(Y)}{(z - E[Y])^2} = 0.05$

We now want to insert values for Y.

We have for the variance that $E[Y^2] - E[Y]^2 = 2400$

And for the denominator that

$(z - (100 \cdot 0.6 + 0 \cdot 0.4))^2 = z^2 - 180z + 3600$

We rearrange the equation:

$1 - \frac{\operatorname{Var}(Y)}{(z - E[Y])^2} = 0.05 \iff$

$z^2 - 180z + 3600 - 2400 = 0.05z^2 - 9z + 180 \iff$

$0.95z^2 - 171z + 1020 = 0 \iff$

$z_1 = 173.8231$
$z_2 = 6.1769$

The maximal value of $z$ is therefore 6.17, since $z_1$ lies outside the score interval 0-100 and so is a meaningless value for $z$.

Could this be right though?
In the assignment they want us to discuss:
"Which of the three inequalities provide a non-vacuous value of z? (You know without any calculations that for any $z < 0$ we have $P(Z \leq z) = 0$, so any bound smaller than 0 is useless.)"

My value isn't smaller than 0, so I figure the reasoning mentioned here doesn't apply.

Hoeffding
For the Hoeffding bound I am a bit unsure of the relationship between the average $\hat{Z}$ and $X_i$. I've tried to rewrite the formula in a similar style to what I did in the Chebyshev case, like this:

$P\left(\frac{1}{15} \sum_{i=1}^{15} X_{i} \geq z\right) =$

$P\left(\sum_{i=1}^{15} X_{i} \geq z \cdot 15\right) =$

$P\left(\sum_{i=1}^{15} X_{i} - \mu \geq z \cdot 15 - \mu\right) =$

$P\left(\sum_{i=1}^{15} X_{i} - E[\hat{Z}] \geq z \cdot 15 - E[\hat{Z}]\right) =$

$1 - P\left(\sum_{i=1}^{15} X_{i} - E[\hat{Z}] \leq z \cdot 15 - E[\hat{Z}]\right) \leq$

$1 - e^{-2 \cdot 15 \cdot (z \cdot 15 - 60)^2} = 0.05 \iff$

$\ln(1) - 30 \cdot (225z - 3600) = \ln(0.05) \iff$

$-6750z + 108000 = \ln(0.05) - 108000 \iff$

$-6750z = \ln(0.05) - 108000 \iff$

$-6750z = -\ln(0.05) + 3600 \iff$

$z = (-\ln(0.05) + 3600) / 450 = 16.01$

Given this result, I am unsure whether my rewriting is legitimate, since changing the left-hand side inside $P(\hat{Z} \geq z)$ does not seem to change the bound itself. I am also not sure that I have correctly understood the relationship between the average $\hat{Z}$ and the random variables $X_i$ in Hoeffding's inequality.

Best Answer

Let $Y$ be a random variable that equals 100 with probability 0.6 and 0 with probability 0.4. This random variable has an expectation of 60 and a variance of $E[Y^2]-E[Y]^2 = 2400$. According to the provided hint, the variance of the random variable $X$ corresponding to each $X_i$ is thus bounded by 2400.
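Building on this variance bound, the remaining steps can be sketched in code. This is a minimal illustration rather than a definitive solution: the function names are invented here, and it assumes the standard one-sided forms, namely Markov applied to the non-negative variable $100 - \hat{Z}$, Chebyshev with $\operatorname{Var}[\hat{Z}] \leq 2400/15$, and Hoeffding for the mean of $n$ independent variables in $[a, b]$. In each case the bound on $P(\hat{Z} \leq z)$ is set equal to $\delta$ and solved for $z$.

```python
import math


def markov_threshold(p, upper, delta):
    # Markov on Q = upper - Z >= 0 with E[Q] = upper - p:
    # P(Z <= z) = P(Q >= upper - z) <= (upper - p) / (upper - z) = delta
    return upper - (upper - p) / delta


def chebyshev_threshold(p, var_single, n, delta):
    # Var[Z] <= var_single / n for the mean of n independent scores, and
    # P(Z <= z) <= P(|Z - p| >= p - z) <= Var[Z] / (p - z)^2 = delta
    return p - math.sqrt(var_single / (n * delta))


def hoeffding_threshold(p, lower, upper, n, delta):
    # P(Z <= p - eps) <= exp(-2 * n * eps^2 / (upper - lower)^2) = delta
    eps = (upper - lower) * math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return p - eps


# Assumed values from the question: n = 15 scores in [0, 100], p = 60, delta = 0.05,
# and the variance bound 2400 for a single score.
print(markov_threshold(60, 100, 0.05))           # about -700
print(chebyshev_threshold(60, 2400, 15, 0.05))   # about 3.43
print(hoeffding_threshold(60, 0, 100, 15, 0.05)) # about 28.40
```

Under these assumptions the Markov threshold is negative and therefore vacuous, while Chebyshev and Hoeffding both give thresholds inside the 0-100 score range.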