[Math] Mean and Variance of Correct Answers for various ways of Multiple Choice Questions

permutations probability

This question occurred to me out of the blue while I was thinking about multiple choice questions.

Suppose I set, say, $8$ multiple choice questions. For each of the three methods below, I want to find the mean and variance of the number of questions the student answers correctly, assuming that the student selects an answer at random for each question.

  1. Each question has $5$ possible answers and only $1$ of them is correct.
  2. There is a pool of $30$ possible answers, of which $8$ are the right answers, one for each question. The student is notified that no two questions share the same answer.
  3. There is a pool of $20$ possible answers. Each question has only one correct answer out of the pool of $20$. However, the student is notified that each answer could be the right answer to up to $2$ questions.

The first one is obviously binomial with $n=8$ and $p=1/5$, so the mean is $np=1.6$ and the variance is $np(1-p)=1.28$.
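As a quick sanity check of the binomial figures, here is a minimal sketch that builds the probability mass function for method 1 with exact fractions and computes the mean and variance directly from it. The variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

n = 8                 # number of questions
p = Fraction(1, 5)    # chance of guessing one question correctly

# Binomial pmf: P(X = k) for k = 0..n
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mean = sum(k * q for k, q in enumerate(pmf))
var = sum(k * k * q for k, q in enumerate(pmf)) - mean**2

print(mean, var)  # 8/5 32/25
```

The exact values $8/5$ and $32/25$ agree with $1.6$ and $1.28$.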

For the second one, I am thinking of enumerating all possible permutations of $8$ answers, but am not sure how to proceed from there.

For the third one, I am not even sure how to begin.

Any ideas?

Best Answer

2.

Let's count the number of ways that a student can get $x$ answers correct. Here the student's 8 guesses are treated as 8 distinct answers drawn from the pool of 30, and a guess counts as correct when it is one of the 8 right answers. Clearly, they can get all 8 answers correct in only 1 way. To count the ways of getting 7 correct, choose the 7 they get right, then choose 1 from the remaining 22 wrong answers: ${8\choose 7}{22\choose 1}$. For 6 correct it's ${8\choose 6}{22\choose 2}$, and so on.

So if $X$ is a random variable representing how many answers a student gets right, $$p_X(x)={{8\choose x}{22\choose 8-x}\over{30\choose 8}}$$ which is just the hypergeometric distribution with $K=8$, $N=30$, and $n=8$.

So the mean is $$n\frac{K}N=\frac{64}{30}=\frac{32}{15}\approx 2.13$$ and the variance is $$n\frac{K(N-K)(N-n)}{N^2(N-1)}=8\frac{8(30-8)(30-8)}{30^2(30-1)}=\frac{30976}{26100}=\frac{7744}{6525}\approx 1.19$$
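The hypergeometric mean and variance above can be checked numerically. The following is a small sketch, using only the standard library, that tabulates $p_X(x)$ with exact fractions and computes the moments from the table rather than from the closed-form formulas. The variable names are my own.

```python
from fractions import Fraction
from math import comb

N, K, n = 30, 8, 8          # pool size, right answers in the pool, guesses made
total = comb(N, n)          # number of ways to pick n distinct answers

# Hypergeometric pmf: P(X = x) for x = 0..n
pmf = {x: Fraction(comb(K, x) * comb(N - K, n - x), total) for x in range(n + 1)}

mean = sum(x * q for x, q in pmf.items())
var = sum(x * x * q for x, q in pmf.items()) - mean**2

print(mean, var)  # 32/15 7744/6525
```

These are the reduced forms of $64/30$ and $30976/26100$.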


3.

The number of distinct correct answers among the 20 may be 4, 5, 6, 7, or 8. I'm going to assume that, whatever the case, the student still makes 8 choices. The method I would use to determine the mean and variance of the number of correct responses is to build a probability mass function for each of the five possible quiz designs (by which I mean quizzes with 8, 7, 6, 5, or 4 distinct correct responses). Let's call these functions $p_i(x)$, where $i$ is the number of distinct correct responses in the quiz. Then, assuming each of the 5 quiz designs is equally likely, the total probability mass function would be $$p_X(x)=\frac15p_8(x)+\frac15p_7(x)+\frac15p_6(x)+\frac15p_5(x)+\frac15p_4(x)$$

If the 5 quiz designs aren't equally likely you could of course adjust the weights on the $p_i(x)$'s to reflect that. With this function found you could find the mean and variance as you usually do for discrete random variables.
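For reference, the moments of such a weighted mixture can also be written in terms of the moments of the individual designs. The symbols $w_i$ (the weight on design $i$, with $\sum_i w_i=1$), $\mu_i$, and $\sigma_i^2$ (the mean and variance under $p_i$) are notation introduced here for illustration:

$$E[X]=\sum_i w_i\mu_i,\qquad \operatorname{Var}(X)=\sum_i w_i\left(\sigma_i^2+\mu_i^2\right)-\Big(\sum_i w_i\mu_i\Big)^2$$

With equal weights, every $w_i=\frac15$.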

It looks like quite a task to find each of the $p_i(x)$'s completely, so instead of doing that I'll find a few of the values from a couple of them to illustrate the logic.

$\mathbf{p_8(x)}$: If there were 8 distinct correct answers, then this would be hypergeometrically distributed as above but with $N=20$ (and still $K=8$, $n=8$).

$\mathbf{p_7(x)}$: If there were 7 distinct correct answers, then one of them would be the correct response to 2 of the questions. Let's call that one the 'double' response and the others 'single' responses. The number of ways of getting all 8 questions correct is the number of ways of choosing all 7 of these in 8 guesses, with the remaining guess being any one of the 13 incorrect answers, $$p_7(8)={{7\choose 7}{13\choose 1}\over {20\choose8}}$$ A student can only score 7 out of 8 by getting the 'double' response and 5 of the 6 'singles', with the other 2 guesses incorrect, so $$p_7(7)={{1\choose 1}{6\choose 5}{13\choose 2}\over{20\choose8}}$$ A student can score 6 out of 8 by getting all six of the 'singles' but not the 'double', or by getting the 'double' and four of the 'single' answers, which means $$p_7(6)={{1\choose 0}{6\choose6}{13\choose2}+{1\choose1}{6\choose4}{13\choose3}\over{20\choose 8}}$$
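To make the three $p_7$ values concrete, here is a short sketch that evaluates them. The helper `p7_ways` is a hypothetical name introduced for this example; it counts selections with a given number of 'double' and 'single' answers picked, filling the rest of the 8 guesses from the 13 incorrect answers.

```python
from fractions import Fraction
from math import comb

TOTAL = comb(20, 8)  # ways to pick 8 distinct answers from the pool of 20

def p7_ways(double_picked, singles_picked):
    """Count selections with this many 'double' (0 or 1) and 'single' (0..6) answers."""
    wrong_picked = 8 - double_picked - singles_picked
    return comb(1, double_picked) * comb(6, singles_picked) * comb(13, wrong_picked)

p7 = {
    8: Fraction(p7_ways(1, 6), TOTAL),                  # double + all six singles
    7: Fraction(p7_ways(1, 5), TOTAL),                  # double + five singles
    6: Fraction(p7_ways(0, 6) + p7_ways(1, 4), TOTAL),  # six singles, or double + four
}

for score, prob in p7.items():
    print(score, prob, float(prob))
```

The numerators before reduction are 13, 468, and 4368 over ${20\choose 8}=125970$.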

$\mathbf{p_6(x)}$: If there were 6 distinct correct answers, then two of them would be 'double' solutions and four would be 'single' solutions. To get 4 out of the 8 questions correct a student could get none of the double solutions and all four of the singles, one double and two singles, or both doubles and no singles, so $$p_6(4)={{2\choose 0}{4\choose 4}{14\choose 4}+{2\choose 1}{4\choose 2}{14\choose 5}+{2\choose 2}{4\choose 0}{14\choose 6}\over{20\choose8}}$$

Similar logic should get you all the values for all the $p_i(x)$'s. And, as said above, with those found you would be able to calculate the mean and variance.
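The same counting can be automated. Below is a rough sketch, not a definitive implementation, that builds every $p_i(x)$ by looping over how many 'double' and 'single' answers are picked, then mixes the five designs with equal weights. It assumes, as above, that the student picks 8 distinct answers from the 20 and that each design is equally likely. The function name `design_pmf` and the constants are my own.

```python
from fractions import Fraction
from math import comb

POOL, PICKS, QUESTIONS = 20, 8, 8

def design_pmf(i):
    """pmf of the score when the quiz has i distinct correct answers (4 <= i <= 8)."""
    doubles = QUESTIONS - i        # answers that are right for 2 questions
    singles = 2 * i - QUESTIONS    # answers that are right for 1 question
    wrong = POOL - i               # answers that are right for no question
    pmf = {}
    for a in range(doubles + 1):          # doubles picked
        for b in range(singles + 1):      # singles picked
            rest = PICKS - a - b          # remaining picks must be wrong answers
            if rest < 0 or rest > wrong:
                continue
            ways = comb(doubles, a) * comb(singles, b) * comb(wrong, rest)
            score = 2 * a + b
            pmf[score] = pmf.get(score, 0) + Fraction(ways, comb(POOL, PICKS))
    return pmf

# Equal-weight mixture over the five designs
designs = [8, 7, 6, 5, 4]
weight = Fraction(1, len(designs))
mixture = {}
for i in designs:
    for score, prob in design_pmf(i).items():
        mixture[score] = mixture.get(score, 0) + weight * prob

mean = sum(s * q for s, q in mixture.items())
var = sum(s * s * q for s, q in mixture.items()) - mean**2
print(mean, var, float(var))
```

To use unequal weights, replace `weight` with a per-design value that sums to 1 across `designs`.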