Yes, about 20% is the correct answer.
One way to check this is to work out the expected fractions of the total population that are:
- infected and test positive: 0.5% × 99% = 0.495%,
- infected and test negative: 0.5% × 1% = 0.005%,
- not infected and test positive: 99.5% × 2% = 1.99%, and
- not infected and test negative: 99.5% × 98% = 97.51%.
Thus, the fraction of the total population that test positive is 0.495% + 1.99% = 2.485%. Out of that roughly 2.5%, only about one fifth (0.495% of the total population) are actually infected.
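The breakdown above can be checked numerically. The following is a minimal Python sketch, assuming the three rates used in this answer (0.5% prevalence, 99% of infected people testing positive, 2% of uninfected people testing positive); the variable names are illustrative only.

```python
# Assumed inputs, taken from the figures used in the answer above.
prevalence = 0.005          # 0.5% of the population is infected
sensitivity = 0.99          # 99% of infected people test positive
false_positive_rate = 0.02  # 2% of uninfected people test positive

# Joint probabilities of the four (infection status, test result) outcomes.
true_pos = prevalence * sensitivity
false_neg = prevalence * (1 - sensitivity)
false_pos = (1 - prevalence) * false_positive_rate
true_neg = (1 - prevalence) * (1 - false_positive_rate)

# Everyone who tests positive is either a true positive or a false positive.
p_positive = true_pos + false_pos

# Share of positive results that belong to infected people.
p_infected_given_positive = true_pos / p_positive

print(f"P(positive) = {p_positive:.3%}")
print(f"P(infected | positive) = {p_infected_given_positive:.1%}")
```

The last line prints a value just under 20%, which matches the "about one fifth" estimate.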
One trick that can sometimes help to make sense of problems like this is to convert the fractions to actual numbers of individuals. So let's assume that our total population consists of 20,000 people (which is just large enough to make all the fractions work out to a whole number of people). Then:
- 100 out of these 20,000 people (0.5%) are infected.
- 99 of these 100 infected people (99%) test positive.
- 1 of these 100 infected people (1%) tests negative.
- 19,900 out of these 20,000 people (99.5%) are not infected.
- 398 of these 19,900 uninfected people (2%) test positive.
- 19,502 of these 19,900 uninfected people (98%) test negative.
Thus, the total number of people who test positive is 99 + 398 = 497. Out of these 497 people, only 99 (about 19.9%) are actually infected, while the other 398 are false positives.
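The head count above can be reproduced with exact arithmetic. This is a sketch only: `cohort_counts` is a hypothetical helper written for this illustration, and `Fraction` is used so that no rounding error creeps into the counts.

```python
from fractions import Fraction


def cohort_counts(population, prevalence, sensitivity, false_positive_rate):
    """Split a hypothetical cohort into the four test outcomes."""
    infected = population * prevalence
    uninfected = population - infected
    true_pos = infected * sensitivity
    false_pos = uninfected * false_positive_rate
    return {
        "infected": infected,
        "uninfected": uninfected,
        "true_pos": true_pos,
        "false_neg": infected - true_pos,
        "false_pos": false_pos,
        "true_neg": uninfected - false_pos,
    }


counts = cohort_counts(
    population=20_000,
    prevalence=Fraction(5, 1000),          # 0.5%
    sensitivity=Fraction(99, 100),         # 99%
    false_positive_rate=Fraction(2, 100),  # 2%
)

positives = counts["true_pos"] + counts["false_pos"]
share_true = counts["true_pos"] / positives

print(positives, share_true, float(share_true))
```

Trying a smaller `population` such as 10,000 leaves fractional people in some cells, which is why 20,000 is a convenient choice here.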
Yet another way to quickly figure out the approximate result is to note that almost all people (99.5% ≈ 100%) are uninfected, and almost all of the infected test positive (99% ≈ 100%).
The fraction of false positives in the full population is therefore approximately equal to the given fraction of false positives among the uninfected (2% × 99.5% ≈ 2%), while the fraction of true positives is approximately equal to the fraction of infected (99% × 0.5% ≈ 0.5%). False positives (≈ 2%) are thus about four times as common as true positives (≈ 0.5%), and so only about one fifth of all positive test results are true.
Best Answer
Recheck your calculation. I get $$\Pr[V \mid S] = \frac{(0.04)(0.95)}{(0.04)(0.95) + (0.96)(0.01)} = \frac{95}{119} \approx 0.798319.$$
As for your other question, you want $$\Pr[V^c \mid S^c].$$ This is solved in an analogous fashion to what you already computed:
$$\Pr[V^c \mid S^c] = \frac{\Pr[S^c \mid V^c]\Pr[V^c]}{\Pr[S^c \mid V^c]\Pr[V^c] + \Pr[S^c \mid V]\Pr[V]}.$$ I leave the computation to you.
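For readers who want to check their result, both conditional probabilities can be evaluated with one small function. This is a sketch under the numbers used in this answer ($\Pr[V] = 0.04$, $\Pr[S \mid V] = 0.95$, $\Pr[S \mid V^c] = 0.01$); `posterior` is a hypothetical helper name, not part of any library.

```python
from fractions import Fraction


def posterior(prior, likelihood, likelihood_complement):
    """Bayes' theorem for a binary hypothesis H and evidence E.

    prior                 = Pr[H]
    likelihood            = Pr[E | H]
    likelihood_complement = Pr[E | not H]
    Returns Pr[H | E].
    """
    numerator = likelihood * prior
    return numerator / (numerator + likelihood_complement * (1 - prior))


# Assumed inputs, taken from the formulas in this answer.
p_v = Fraction(4, 100)              # Pr[V]
p_s_given_v = Fraction(95, 100)     # Pr[S | V]
p_s_given_not_v = Fraction(1, 100)  # Pr[S | V^c]

# Pr[V | S]: hypothesis V, evidence S.
p_v_given_s = posterior(p_v, p_s_given_v, p_s_given_not_v)

# Pr[V^c | S^c]: hypothesis V^c, evidence S^c.
p_not_v_given_not_s = posterior(1 - p_v, 1 - p_s_given_not_v, 1 - p_s_given_v)

print(p_v_given_s, float(p_v_given_s))
print(p_not_v_given_not_s, float(p_not_v_given_not_s))
```

The same function serves both questions because swapping the roles of $V$ and $V^c$ (and of $S$ and $S^c$) only changes which inputs are passed in.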
Another way to do these calculations is to construct a frequency table based on the given probabilities for a hypothetical cohort. Suppose the population contains $10000$ people. Of these, $(0.04)(10000) = 400$ are infected with the virus. The remaining $9600$ are healthy. The test is $95\%$ reliable for infected people, so $(0.95)(400) = 380$ of the $400$ will test positive and $20$ will test negative. The test is $99\%$ reliable for healthy people, so $(0.99)(9600) = 9504$ of the $9600$ will test negative, and $9600-9504 = 96$ will test positive. In summary:
$$\begin{array}{|c|c|c|c|} \hline & V & V^c & \text{Total} \\ \hline S & 380 & 96 & 476 \\ \hline S^c & 20 & 9504 & 9524 \\ \hline \text{Total} & 400 & 9600 & 10000 \\ \hline \end{array}$$
We simply populated the corresponding joint events with the number of people we expect to meet the criteria, and the row totals $476$ and $9524$ were just the sums of the corresponding rows.
Now that we have constructed such a table, it is immediately obvious that $$\Pr[V \mid S] = \frac{380}{476} \approx 0.798319,$$ and the computation of $\Pr[V^c \mid S^c]$ is similarly performed by reading the appropriate cells in the table.
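The table-based approach is also easy to script. The sketch below rebuilds the same $2 \times 2$ table for the assumed cohort of $10000$ people and reads both conditional probabilities off its cells; the keys `"S"`, `"Sc"`, `"V"`, `"Vc"` are labels chosen for this illustration, and the integer divisions are exact only because the cohort size was picked to make them so.

```python
population = 10_000
infected = population * 4 // 100  # 4% prevalence -> 400
healthy = population - infected   # 9600

# Cells are keyed by (test result, true status).
table = {
    ("S", "V"): infected * 95 // 100,  # infected, test positive: 380
    ("S", "Vc"): healthy * 1 // 100,   # healthy, test positive: 96
}
table[("Sc", "V")] = infected - table[("S", "V")]   # infected, test negative
table[("Sc", "Vc")] = healthy - table[("S", "Vc")]  # healthy, test negative

# Row totals: everyone with a given test result.
row_total = {
    "S": table[("S", "V")] + table[("S", "Vc")],
    "Sc": table[("Sc", "V")] + table[("Sc", "Vc")],
}

# Conditional probabilities are cell counts divided by their row total.
p_v_given_s = table[("S", "V")] / row_total["S"]
p_vc_given_sc = table[("Sc", "Vc")] / row_total["Sc"]

print(row_total, p_v_given_s, p_vc_given_sc)
```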