Example:Repeated medical test

From CS2800 wiki

Consider the medical test example. We saw that, under some reasonable assumptions, the probability that someone has the disease given that the test is positive is only about 1%.

Perhaps this is not a high enough risk to justify performing an invasive procedure. Can we increase our confidence by taking a second test?

In our model, we defined various events: [math]D [/math] is the event that we have the disease, while [math]H [/math] is the event that we are healthy.

Let us use [math]P_1 [/math] to indicate the event that the first test is positive, and [math]N_1 [/math] to indicate the event that the first test is negative; define [math]P_2 [/math] and [math]N_2 [/math] similarly.

We were given in the problem that the false positive rate is 1%; this means that [math]\Pr(P_1 \mid H) = 1/100 [/math] and [math]\Pr(P_2 \mid H) = 1/100 [/math]. Similarly, we have that the false negative rate is 2%; this means that [math]\Pr(N_1 \mid D) = \Pr(N_2 \mid D) = 2/100 [/math]. Finally, we were given that [math]\Pr(D) = 1/10000 [/math].

Using these facts, we were able to compute that [math]\Pr(D \mid P_1) \approx 1/100 [/math]; the same computation shows that [math]\Pr(D \mid P_2) \approx 1/100 [/math].
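The single-test computation can be checked with exact arithmetic. The following is a minimal sketch, assuming only the three rates given above; the function name `posterior_one_test` and its parameter names are illustrative and do not come from the original example.

```python
from fractions import Fraction

def posterior_one_test(prior, false_pos, false_neg):
    """Return Pr(D | P_1) using Bayes' rule and the law of total probability."""
    true_pos = 1 - false_neg             # Pr(P_1 | D)
    joint_d = true_pos * prior           # Pr(P_1 and D)
    joint_h = false_pos * (1 - prior)    # Pr(P_1 and H)
    return joint_d / (joint_d + joint_h)

# Rates stated in the problem: Pr(D) = 1/10000, false positive 1%, false negative 2%.
single = posterior_one_test(Fraction(1, 10000), Fraction(1, 100), Fraction(2, 100))
print(single, float(single))
```

With these inputs the exact value is 98/10097, which is a little under 1%.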

Now, suppose both tests come back positive. What is the probability that we have the disease?

We want to compute [math]\Pr(D \mid P_1 \cap P_2) [/math].

We can organize this information into a probability tree:

Repeated-medical-test-tree.svg

However, we don't know what [math]\Pr(P_2 \mid H \cap P_1) [/math] is. And this is sensible: depending on how the test works and what causes false positives, this probability could be anything:

  • Perhaps the test gives a false positive if the patient has a genetic anomaly (which 1% of the population has). In this case, rerunning the test will give exactly the same result, so [math]\Pr(P_2 \mid H \cap P_1) = 1 [/math]. Using this, we would find that [math]\Pr(D \mid P_1 \cap P_2) = \Pr(D \mid P_1) \approx 1/100 [/math].
  • Perhaps the test gives a false positive because the lab technician dropped one of the 100 samples that they were testing and caused an incorrect result. In this case, a second run of the test cannot possibly fail, because there is only one incorrect test; therefore [math]\Pr(P_2 \mid H \cap P_1) = 0 [/math]. Using this assumption, we would find that [math]\Pr(D \mid P_1 \cap P_2) = 1 [/math].
  • Perhaps different iterations of the test fail independently. In this case, a false positive on the first test doesn't change the probability that the second test is a false positive, so [math]\Pr(P_2 \mid H \cap P_1) = \Pr(P_2 \mid H) = 1\% [/math]. Assuming that the first and second tests are independent given the patient's health status (given [math]H [/math], and likewise given [math]D [/math]), we can compute [math]\Pr(D \mid P_1 \cap P_2) [/math] using a probability tree (or using Bayes' rule and the law of total probability):

Repeated-medical-test-tree-independent.svg

Focusing on the [math]H [/math] branch, we see that [math]\Pr(P_1 \cap P_2 \mid H) = (1/100)^2 = 10^{-4} [/math]. In the [math]D [/math] branch, we see that [math]\Pr(P_1 \cap P_2 \mid D) = (98/100)^2 = 9604/10^4 \approx 0.96 [/math]. By the law of total probability, we have


[math]\Pr(P_1 \cap P_2) = \Pr(P_1 \cap P_2 \mid H)\Pr(H) + \Pr(P_1 \cap P_2 \mid D)\Pr(D) = 9999/10^8 + 9604/10^8 \approx 2 \times 10^{-4} [/math]


Using Bayes' rule, we have

[math]\Pr(D \mid P_1 \cap P_2) = \frac{\Pr(P_1 \cap P_2 \mid D)\Pr(D)}{\Pr(P_1 \cap P_2)} = \frac{9604/10^8}{19603/10^8} = \frac{9604}{19603} \approx 0.49 [/math]
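To see how strongly the answer depends on the relationship between the two tests, the three scenarios can be evaluated side by side. This is only a sketch: the function `posterior_two_tests` and its parameters are hypothetical names, and the value used for the second test's positive rate among diseased patients, [math]\Pr(P_2 \mid D \cap P_1) [/math], is an assumption in the first two scenarios, since the text specifies it only for the independent case.

```python
from fractions import Fraction

PRIOR = Fraction(1, 10000)     # Pr(D)
FALSE_POS = Fraction(1, 100)   # Pr(P_1 | H)
FALSE_NEG = Fraction(2, 100)   # Pr(N_1 | D)

def posterior_two_tests(second_pos_given_h_p1, second_pos_given_d_p1):
    """Return Pr(D | P_1 and P_2) by multiplying along each tree branch.

    second_pos_given_h_p1 is Pr(P_2 | H and P_1);
    second_pos_given_d_p1 is Pr(P_2 | D and P_1).
    """
    joint_d = PRIOR * (1 - FALSE_NEG) * second_pos_given_d_p1   # D branch
    joint_h = (1 - PRIOR) * FALSE_POS * second_pos_given_h_p1   # H branch
    return joint_d / (joint_d + joint_h)

# Scenario 1: the second test always repeats the first result (assumed for D as well).
identical = posterior_two_tests(Fraction(1), Fraction(1))
# Scenario 2: a healthy patient can never test positive twice
# (the rate for diseased patients is assumed to stay at 98%).
dropped = posterior_two_tests(Fraction(0), 1 - FALSE_NEG)
# Scenario 3: the tests are independent given the patient's health status.
independent = posterior_two_tests(FALSE_POS, 1 - FALSE_NEG)

for name, value in [("identical", identical), ("dropped", dropped), ("independent", independent)]:
    print(f"{name:12s} {value}  ({float(value):.4f})")
```

Under these assumptions the three scenarios give 98/10097, 1, and 9604/19603 respectively, so the same pair of positive results can leave the probability of disease anywhere from about 1% to certainty.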