# SP18:Lecture 32 Conditional probability

We introduce conditional probability, a tool for stating and interpreting facts about a probability space. We give definitions and prove some basic results.

# Conditional probability

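The conditional probability of $A$ given $B$ is the probability of $A \cap B$, "scaled up" so that the probability of $B$ given $B$ is 1.

Formally:

If $A$ and $B$ are events with $\Pr(B) \neq 0$, then the conditional probability of $A$ given $B$ (written $\Pr(A \mid B)$) is given by

$$\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)}$$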
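To make the definition concrete, here is a minimal sketch in Python. It assumes a finite sample space with equally likely outcomes (a fair six-sided die, chosen only for illustration), and the helper names `pr` and `pr_given` are hypothetical.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, all outcomes equally likely.
SAMPLE_SPACE = frozenset(range(1, 7))

def pr(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))

def pr_given(a, b):
    """Conditional probability Pr(a | b) = Pr(a and b) / Pr(b); needs Pr(b) > 0."""
    if pr(b) == 0:
        raise ValueError("cannot condition on an event of probability 0")
    return pr(a & b) / pr(b)

even = {2, 4, 6}
at_least_four = {4, 5, 6}

print(pr_given(even, at_least_four))           # 2/3
print(pr_given(at_least_four, at_least_four))  # 1
```

The second line of output illustrates the "scaling up": conditioning an event on itself always gives 1.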

For example, suppose we wish to model the following experiment: we first select one of two coins. The first coin (coin a) is weighted: it lands heads 3/4 of the time. The second coin (coin b) is fair: it lands heads 1/2 of the time. We choose the first coin 1/3 of the time. We want to find the probability of getting heads.

How do we interpret the facts given in the problem?

We first construct a sample space. There are 4 things that can happen: we can choose coin a and flip heads, choose coin a and flip tails, choose coin b and flip heads, or choose coin b and flip tails. A reasonable sample space would be $S = \{(a,h), (a,t), (b,h), (b,t)\}$.

It is (always) helpful to define some events: let $A$ be the event that we pick coin a, and $H$ be the event that we flip heads; define $B$ and $T$ similarly.

Now we need to interpret the probabilities given in the problem. When we say "[coin a] lands heads 3/4 of the time", we don't mean that 3/4 of the time we choose coin a and flip it and get heads (this would be $\Pr(H \cap A) = 3/4$). Rather, we mean that if we restrict our attention to the outcomes where we chose coin a, then the probability of getting heads in that restricted experiment is 3/4. Put more simply, the probability that we get heads given that we choose coin a is 3/4.

We interpret this in our model by setting $\Pr(H \mid A) = 3/4$. Since we choose coin a with probability 1/3, we see that $\Pr(H \cap A) = \Pr(H \mid A)\Pr(A) = \frac{3}{4} \cdot \frac{1}{3} = \frac{1}{4}$: we would expect to select coin a and flip heads in about a quarter of the experiments.

Similarly, $\Pr(H \mid B) = 1/2$ and $\Pr(B) = 2/3$, so $\Pr(H \cap B) = \Pr(H \mid B)\Pr(B) = \frac{1}{2} \cdot \frac{2}{3} = \frac{1}{3}$.

Since we can only select one of the coins, the events $H \cap A$ and $H \cap B$ are disjoint, and together they make up $H$, so we can use the third Kolmogorov axiom to compute $\Pr(H)$:

$$\Pr(H) = \Pr(H \cap A) + \Pr(H \cap B) = \frac{1}{4} + \frac{1}{3} = \frac{7}{12}$$
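The arithmetic in this example can be checked with a short Python sketch using exact fractions. The variable names are chosen here for illustration: `pr_a` and `pr_b` are the probabilities of picking coin a and coin b, and `pr_h_given_a` and `pr_h_given_b` are the probabilities of heads given each coin.

```python
from fractions import Fraction

# Facts stated in the problem.
pr_a = Fraction(1, 3)            # we pick coin a
pr_b = 1 - pr_a                  # we pick coin b
pr_h_given_a = Fraction(3, 4)    # coin a lands heads
pr_h_given_b = Fraction(1, 2)    # coin b lands heads

# Definition of conditional probability, rearranged:
# Pr(heads and coin) = Pr(heads | coin) * Pr(coin).
pr_h_and_a = pr_h_given_a * pr_a
pr_h_and_b = pr_h_given_b * pr_b

# The two ways of getting heads are disjoint, so their probabilities add.
pr_h = pr_h_and_a + pr_h_and_b

print(pr_h_and_a, pr_h_and_b, pr_h)  # 1/4 1/3 7/12
```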

# Probability trees

A useful way to organize information about events in a probability space is by drawing a probability tree. Here the branches correspond to events, and the edges are weighted by the corresponding conditional probabilities.

Consider the experiment where we choose coin a 1/3 of the time, and coin b 2/3 of the time, and where coin a lands heads 3/4 of the time and coin b lands heads 1/2 of the time.

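We can organize these events into a tree. The root has two children, "coin a" (edge weight 1/3) and "coin b" (edge weight 2/3). Each of these has two children, "heads" and "tails", with edge weights 3/4 and 1/4 under coin a, and 1/2 and 1/2 under coin b.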

The vertices in the tree represent events; the event $E$ is a child of $F$ if $E \subseteq F$. The number on the edge from $F$ to $E$ is the conditional probability of $E$ given $F$.

The probability of an event in the tree can be found by multiplying the probabilities on the path leading to that event. This comes from the definition of conditional probability: if $E \subseteq F$ then $\Pr(E) = \Pr(E \cap F) = \Pr(E \mid F)\Pr(F)$.
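The path-multiplication rule can be sketched in Python as follows. The nested-dictionary representation of the tree and the function name `leaf_probabilities` are assumptions made for this illustration; the edge weights are the ones from the two-coin experiment.

```python
from fractions import Fraction

# Each node maps a child label to (edge weight, subtree).
# An empty subtree marks a leaf.
TREE = {
    "coin a": (Fraction(1, 3), {
        "heads": (Fraction(3, 4), {}),
        "tails": (Fraction(1, 4), {}),
    }),
    "coin b": (Fraction(2, 3), {
        "heads": (Fraction(1, 2), {}),
        "tails": (Fraction(1, 2), {}),
    }),
}

def leaf_probabilities(tree, path=(), prob=Fraction(1)):
    """Return {path: probability} for every leaf, multiplying edge weights."""
    result = {}
    for label, (weight, subtree) in tree.items():
        new_path = path + (label,)
        new_prob = prob * weight
        if subtree:
            result.update(leaf_probabilities(subtree, new_path, new_prob))
        else:
            result[new_path] = new_prob
    return result

leaves = leaf_probabilities(TREE)
for path, p in leaves.items():
    print(" -> ".join(path), p)

# The leaves partition the sample space, so their probabilities sum to 1.
print(sum(leaves.values()))
```

The four leaves come out as 1/4, 1/12, 1/3 and 1/3, matching the products along each path.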

# Useful formulas

## Bayes' rule

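Bayes' rule (also called Bayes' law or Bayes' identity) is a simple equation relating $\Pr(A \mid B)$ and $\Pr(B \mid A)$:

$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\Pr(A)}{\Pr(B)}$$

Proof:
By definition, $\Pr(A \mid B) = \Pr(A \cap B)/\Pr(B)$ and $\Pr(B \mid A) = \Pr(A \cap B)/\Pr(A)$. Multiplying by the denominators gives $\Pr(A \mid B)\Pr(B) = \Pr(A \cap B) = \Pr(B \mid A)\Pr(A)$. Dividing by $\Pr(B)$ gives the result.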
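As a quick check of the rule, the following Python sketch applies it to the two-coin experiment from earlier in these notes: given that we flipped heads, how likely is it that we picked coin a? The function name `bayes` is hypothetical, and the inputs (3/4, 1/3 and 7/12) are the values computed in that example.

```python
from fractions import Fraction

def bayes(pr_b_given_a, pr_a, pr_b):
    """Return Pr(A | B) from Pr(B | A), Pr(A) and Pr(B) using Bayes' rule."""
    if pr_b == 0:
        raise ValueError("Pr(B) must be nonzero")
    return pr_b_given_a * pr_a / pr_b

# Pr(heads | coin a) = 3/4, Pr(coin a) = 1/3, Pr(heads) = 7/12.
pr_a_given_h = bayes(Fraction(3, 4), Fraction(1, 3), Fraction(7, 12))
print(pr_a_given_h)  # 3/7
```

Seeing heads raises the probability of coin a from 1/3 to 3/7, as we would expect for the coin that favors heads.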

## Law of total probability

Often, we have several events that partition the sample space. For example, we may have events like "the die is even" (call this event $E$) and "the die is odd" (this event is $O$); one of the two must happen (so $E \cup O = S$) but they cannot both happen (so $E \cap O = \emptyset$).

In this case, there is an easy way to compute the probability of another event by considering it separately in the $E$ case and the $O$ case:

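If $A_1$, $A_2$, $\dots$, $A_n$ partition the sample space $S$, then for any event $B$,

$$\Pr(B) = \sum_{i} \Pr(B \mid A_i)\Pr(A_i)$$

Proof:
The partition cuts $B$ into the pieces $B \cap A_i$. Since the $A_i$ are disjoint, we have that the sets $B \cap A_i$ are disjoint; and since every element of $S$ is in one of the $A_i$, we have that every element of $B$ is in one of the $B \cap A_i$.

Therefore, we can apply the third Kolmogorov axiom, followed by the definition of conditional probability, to conclude

$$\Pr(B) = \sum_{i} \Pr(B \cap A_i) = \sum_{i} \Pr(B \mid A_i)\Pr(A_i)$$

as required.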
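A minimal Python sketch of the law follows. The function name `total_probability` and the example event ("the die shows at most 3") are assumptions made for illustration; the partition is the even/odd split of a fair die mentioned above.

```python
from fractions import Fraction

def total_probability(cases):
    """Sum Pr(B | A_i) * Pr(A_i) over a list of (Pr(A_i), Pr(B | A_i)) pairs.

    The A_i are assumed to partition the sample space. As a sanity check
    (necessary but not sufficient), their probabilities must sum to 1.
    """
    if sum(p for p, _ in cases) != 1:
        raise ValueError("the case probabilities must sum to 1")
    return sum(p * cond for p, cond in cases)

# Fair die: the roll is even or odd, each with probability 1/2.
# Target event: the roll is at most 3.
#   Given even ({2, 4, 6}), only 2 qualifies:      1/3.
#   Given odd  ({1, 3, 5}), both 1 and 3 qualify:  2/3.
pr_at_most_three = total_probability([
    (Fraction(1, 2), Fraction(1, 3)),
    (Fraction(1, 2), Fraction(2, 3)),
])
print(pr_at_most_three)  # 1/2
```

The result agrees with counting directly: three of the six faces are at most 3.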

## Medical test example

Suppose a patient takes a medical test to see if they have a rare disease: only 1 in 10,000 people has it. The test has a very good false positive rate of 1% (that is, of the people who don't have the disease, 1% still test positive) and a false negative rate of 2% (of the people who do have the disease, 2% test negative).

If a patient takes the test and gets a positive result, what is the probability that they have the disease?

We can model this problem probabilistically. Let $D$ represent the event where the patient has the disease, and let $H$ be the event where the patient is healthy. Let $P$ be the event representing a positive test result, and let $N$ be the event that the test is negative.

We can interpret the facts from the problem:

• the disease is rare: $\Pr(D) = 1/10000$ (and therefore $\Pr(H) = 9999/10000$).
• the false positive rate is 1%: $\Pr(P \mid H) = 1/100$.
• the false negative rate is 2%: $\Pr(N \mid D) = 2/100$.

We are interested in the probability that the patient has the disease, given that they tested positive. In other words, we want to find $\Pr(D \mid P)$.

We can apply Bayes' rule and the law of total probability (since $D$ and $H$ partition the sample space):

$$\Pr(D \mid P) = \frac{\Pr(P \mid D)\Pr(D)}{\Pr(P)} = \frac{\Pr(P \mid D)\Pr(D)}{\Pr(P \mid D)\Pr(D) + \Pr(P \mid H)\Pr(H)}$$

We need $\Pr(P \mid D)$; in other words, the probability that the test result is positive, given that someone has the disease. Intuitively, this should be 98% ($1 - \Pr(N \mid D)$). And indeed it is. You can prove this using the fact that conditional probabilities satisfy Kolmogorov's axioms.

Plugging this in, we get

$$\Pr(D \mid P) = \frac{\frac{98}{100} \cdot \frac{1}{10000}}{\frac{98}{100} \cdot \frac{1}{10000} + \frac{1}{100} \cdot \frac{9999}{10000}} = \frac{98}{98 + 9999} = \frac{98}{10097} \approx 0.0097$$

Perhaps this is surprising; you might expect that a positive result on a good test means you have the disease with high probability. And indeed, you have learned a great deal: your chances of having the disease went up by a factor of almost 100 (from 0.01% to about 1%). However, because the disease is still rare, you are still not particularly likely to have it.
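The calculation can be reproduced with exact fractions in Python. This is only a sketch of the arithmetic; the variable names are chosen here, with `d`, `h`, `p` and `n` standing for "has the disease", "healthy", "tests positive" and "tests negative".

```python
from fractions import Fraction

# Facts stated in the problem.
pr_d = Fraction(1, 10_000)        # prior probability of disease
pr_h = 1 - pr_d                   # prior probability of being healthy
pr_p_given_h = Fraction(1, 100)   # false positive rate
pr_n_given_d = Fraction(2, 100)   # false negative rate
pr_p_given_d = 1 - pr_n_given_d   # a sick patient tests positive or negative

# Law of total probability over the partition {disease, healthy}.
pr_p = pr_p_given_d * pr_d + pr_p_given_h * pr_h

# Bayes' rule.
pr_d_given_p = pr_p_given_d * pr_d / pr_p

print(pr_d_given_p)                  # 98/10097
print(float(pr_d_given_p))           # roughly 0.0097
print(float(pr_d_given_p / pr_d))    # roughly 97: the increase over the prior
```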

However, you might want to have further testing done; see the repeated medical test example.