[Math] Probability of multiple-choice answers in questions

probability

Assume a multiple-choice test is taken by a student with no prior knowledge of the subject, so he picks every answer at random.

  1. What is the probability of him getting 100% on the test?
  2. What is the probability of him getting at least 40% on the test?
  3. Given that N students take the test (assuming all pick answers at random), what is the average score?
  4. What effect do the number of questions, the number of students and the number of choices have on the overall average percentage of marks?
  5. Do the results vary considerably across tests?
  6. Which is the better strategy for getting more marks: picking answers at random, or always selecting a single choice (such as A)?

Let Q be the number of questions, C the number of choices per question and N the number of students. For this illustration, let Q be 10, C be 3 and N be 1000. I arrived at the following:

Question 1:

1/(C^Q). So 1/(3^10)

Question 2:

The probability of getting at least 40% is 1 minus the probability of getting fewer than 4 questions correct. So 1-(1/(3^1+3^2+3^3))

Question 3:

1/C. I ran a simulation and got this result, but I can't prove or derive it mathematically.

Question 4:

Deducing from Question 3 above, only C has an impact on the average score, irrespective of Q and N (assuming N is large).

Question 5:

Since C is the only major determinant of overall scores, the results aren't going to vary across tests.

Question 6:

Picking answers at random is better than always picking the same choice. Again, I found this by running a simulation but cannot derive it mathematically.

Am I right? How can I mathematically derive the answers to Question 3 and Question 6?

Best Answer

Question 1

You're right, it is $1/C^Q$.
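The same independence assumption also covers Question 2. Assuming each guess is correct with probability $\frac{1}{C}$ independently of the others, the number of correct answers follows a binomial distribution, and "at least 40%" with $Q = 10$ means at least 4 correct answers. A minimal sketch of that calculation (the function name `prob_at_least` is illustrative, not taken from the question):

```python
from fractions import Fraction
from math import comb


def prob_at_least(num_questions, num_choices, min_correct):
    """P(X >= min_correct) for X ~ Binomial(num_questions, 1/num_choices).

    Assumes every question is guessed independently and uniformly.
    Returns an exact fraction.
    """
    p = Fraction(1, num_choices)
    return sum(
        comb(num_questions, k) * p**k * (1 - p) ** (num_questions - k)
        for k in range(min_correct, num_questions + 1)
    )


# Question 1: all 10 answers correct, which reduces to 1 / C^Q.
print(float(prob_at_least(10, 3, 10)))

# Question 2: at least 4 of 10 correct.
print(float(prob_at_least(10, 3, 4)))
```

For $Q = 10$ and $C = 3$ the second value is roughly 0.44, which differs from the expression proposed in the question.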

Question 3

By the law of large numbers, the average of the results obtained from a large number of trials should be close to the expected value. The expected fraction of correct answers, in turn, is $\frac{1}{C}$, because for each question there is a $\frac{1}{C}$ probability of scoring $1$ and a $\frac{C-1}{C}$ probability of scoring $0$.
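A small simulation can illustrate this. The sketch below assumes a randomly generated answer key and students who guess uniformly and independently; the function name `average_score` and the fixed seed are illustrative choices, not part of the original question.

```python
import random


def average_score(num_students, num_questions, num_choices, seed=0):
    """Average fraction of correct answers over all students.

    Assumes every student guesses each question uniformly at random.
    """
    rng = random.Random(seed)
    # Hypothetical answer key; any fixed key gives the same expectation.
    key = [rng.randrange(num_choices) for _ in range(num_questions)]
    total_correct = 0
    for _ in range(num_students):
        total_correct += sum(rng.randrange(num_choices) == answer for answer in key)
    return total_correct / (num_students * num_questions)


# N = 1000 students, Q = 10 questions, C = 3 choices, as in the question.
print(average_score(1000, 10, 3))
```

With these parameters the printed value should land close to $\frac{1}{3}$, and it moves closer as the number of students or questions grows.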

Question 4

Changing $C$ directly changes the expected average score. The number of questions and the number of students do not change the expected value, but they do affect the dispersion of the observed "average score" around it.

Question 5

Results may vary considerably across tests, although the more questions and students there are, the less likely considerable differences become. For example, on the first test all students may happen to guess every answer correctly (which is possible, although unlikely); on the second test it is possible that none of the students guesses a single right answer.
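To put a number on this dispersion: assuming all guesses are independent, each is a Bernoulli trial with success probability $p = \frac{1}{C}$, so the class-wide average fraction correct has standard deviation $\sqrt{p(1-p)/(QN)}$. A rough sketch (the function name `std_of_class_average` is illustrative):

```python
from math import sqrt


def std_of_class_average(num_students, num_questions, num_choices):
    """Standard deviation of the class-wide average fraction correct.

    Assumes all num_students * num_questions guesses are independent
    Bernoulli trials with success probability 1 / num_choices.
    """
    p = 1 / num_choices
    return sqrt(p * (1 - p) / (num_students * num_questions))


# The spread shrinks as either the number of students or questions grows.
for n, q in [(1, 10), (30, 10), (1000, 10), (1000, 100)]:
    print(n, q, round(std_of_class_average(n, q, 3), 4))
```

For $N = 1000$, $Q = 10$ and $C = 3$ the standard deviation is under half a percentage point, so averages far from $\frac{1}{3}$ are possible but very unlikely.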

Question 6

As long as the correct answers are distributed uniformly, it doesn't matter which answer you choose on a specific question: the probability of guessing correctly is $\frac{1}{C}$. And, of course, the probability of getting the right answer to the $(n+1)$-th question does not depend on which answer was chosen for the $n$-th question. Both strategies are equivalent.

The results you got in your test run may be explained by, for example, a non-uniform distribution of correct answers (e.g. the second answer is correct for 50% of the questions, while the first and the third answers are each correct for 25% of the questions) combined with checking against an unlucky fixed choice (e.g. the first one). In such a case, choosing a random answer gives you an expected score of 33.3%, while always choosing the first answer gives you an expected score of only 25%.
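The comparison can be checked exactly rather than by simulation. In the sketch below, `key_distribution[i]` is the assumed fraction of questions whose correct answer is choice `i`; the function names and the two example distributions are illustrative.

```python
from fractions import Fraction


def expected_score_random(key_distribution):
    """Expected fraction correct when guessing uniformly at random."""
    num_choices = len(key_distribution)
    # Whatever the correct choice is, a uniform guess hits it with prob 1/C.
    return sum(p * Fraction(1, num_choices) for p in key_distribution)


def expected_score_fixed(key_distribution, choice):
    """Expected fraction correct when always picking the same choice."""
    return key_distribution[choice]


uniform = [Fraction(1, 3)] * 3
skewed = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

for name, dist in [("uniform", uniform), ("skewed", skewed)]:
    fixed = [expected_score_fixed(dist, i) for i in range(len(dist))]
    print(name, expected_score_random(dist), fixed)
```

With the uniform key both strategies give $\frac{1}{3}$. With the skewed key, random guessing still gives $\frac{1}{3}$, while a fixed choice gives $\frac{1}{4}$ or $\frac{1}{2}$ depending on which choice is picked.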