SP18:Lecture 31 Probability intro

From CS2800 wiki
Revision as of 13:57, 24 April 2018 by {{GENDER:Mdg39|[math]'"2}} [/math]'"7
(<math>1) </math>2 | <math>3 (</math>4) | <math>5 (</math>6)

We'll start discussing probability, giving motivation and basic definitions.

Probability is a mathematical tool for estimating the "likelyhood" of various events. It can be used to reason about uncertainty or lack of knowlege, giving a framework for integrating different assumptions and observations.

When we speak of the probability of an event [math]E [/math], we are in the context of a probability space; the probability of [math]E [/math] is just [math]\href{/cs2800/wiki/index.php/Pr}{Pr}(E) [/math] where [math]\href{/cs2800/wiki/index.php/Pr}{Pr} [/math] is the probability measure


Definitions

Sample space, event, outcome

Definition: Sample space
A sample space S is just a set. We refer to the elements of S as outcomes. We refer to the subsets of S as events.

Probability measure, Kolmogorov's axioms

A probability measure on a sample space S is a function [math]\href{/cs2800/wiki/index.php/Pr}{Pr} : \href{/cs2800/wiki/index.php/2}{2}^\href{/cs2800/wiki/index.php/S}{S} \href{/cs2800/wiki/index.php/%E2%86%92}{→} \href{/cs2800/wiki/index.php?title=%E2%84%9D&action=edit&redlink=1}{ℝ} [/math] satisfying the following three properties:
  1. For all events [math]E [/math], [math]\href{/cs2800/wiki/index.php/Pr}{Pr}(E) ≥ 0 [/math]
  2. [math]\href{/cs2800/wiki/index.php/Pr}{Pr}(S) = 1 [/math]
  3. For all disjoint events [math]E_1 [/math] and [math]E_2 [/math], [math]\href{/cs2800/wiki/index.php/Pr}{Pr}(E_1 \href{/cs2800/wiki/index.php/%E2%88%AA}{∪} E_2) = \href{/cs2800/wiki/index.php/Pr}{Pr}(E_1) + \href{/cs2800/wiki/index.php/Pr}{Pr}(E_2) [/math]
These three properties are referred to as the Kolmogorov axioms.

Probability space

Definition: Probability space
A probability space is just a pair containing a sample space S and a probability measure [math]\href{/cs2800/wiki/index.php/Pr}{Pr} [/math] on S.

Facts about probability

Everything else about probability follows from the Kolmogorov axioms. For example:

If Pr is a probability measure on S, then for all events [math]E [/math], [math]\href{/cs2800/wiki/index.php/Pr}{Pr}(S \href{/cs2800/wiki/index.php/%E2%88%96}{∖} E) = 1 - \href{/cs2800/wiki/index.php/Pr}{Pr}(E) [/math]
Proof:
We know that [math]\href{/cs2800/wiki/index.php?title=Claim:A_%E2%88%A9_(B_%E2%88%96_A)_%3D_%E2%88%85&action=edit&redlink=1}{E ∩ (S ∖ E) = ∅} [/math]. By the third Kolmogorov axiom, this means that


[math]\begin{align*} \href{/cs2800/wiki/index.php/Pr}{Pr}(E) + \href{/cs2800/wiki/index.php/Pr}{Pr}(\href{/cs2800/wiki/index.php/S}{S} \href{/cs2800/wiki/index.php/%E2%88%96}{∖} E) &= \href{/cs2800/wiki/index.php/Pr}{Pr}(E \href{/cs2800/wiki/index.php/%E2%88%AA}{∪} (\href{/cs2800/wiki/index.php/S}{S} \href{/cs2800/wiki/index.php/%E2%88%96}{∖} E)) && \href{/cs2800/wiki/index.php/Kolmogorov_axiom}{\text{by the third Kolmogorov axiom}} \\ &= \href{/cs2800/wiki/index.php/Pr}{Pr}(S) && \href{/cs2800/wiki/index.php?title=Claim:If_A_%E2%8A%86_B_then_B_%3D_A_%E2%88%AA_(B_%E2%88%96_A)&action=edit&redlink=1}{\text{since E ⊆ S}} \\ &= 1 && \href{/cs2800/wiki/index.php/Kolmogorov_axiom}{\text{by the second Kolmogorov axiom}} \end{align*} [/math]


Rearranging this equation gives the result.

In particular, we have:

If Pr is a probability measure, then for all events [math]E [/math], [math]\href{/cs2800/wiki/index.php/Pr}{Pr}(E) ≤ 1 [/math]
Proof:
We have


[math]\begin{align*} \href{/cs2800/wiki/index.php/Pr}{Pr}(E) &= \href{/cs2800/wiki/index.php/Pr}{Pr}(\href{/cs2800/wiki/index.php/S}{S} \href{/cs2800/wiki/index.php/%E2%88%96}{∖} (\href{/cs2800/wiki/index.php/S}{S} \href{/cs2800/wiki/index.php/%E2%88%96}{∖} E)) && \href{/cs2800/wiki/index.php?title=Claim:If_A_%E2%8A%86_B_then_A_%3D_B_%E2%88%96_(B_%E2%88%96_A)&action=edit&redlink=1}{\text{since E = S ∖ (S ∖ E)}} \\ &= 1 - \href{/cs2800/wiki/index.php/Pr}{Pr}(\href{/cs2800/wiki/index.php/S}{S} \href{/cs2800/wiki/index.php/%E2%88%96}{∖} E) && \href{/cs2800/wiki/index.php/Claim:Pr(S_%E2%88%96_E)_%3D_1_-_Pr(E)}{\text{using the fact that Pr(S ∖ E) = 1 - Pr(E) with S ∖ E plugged in for E}} \\ &≤ 1 - 0 = 1 && \href{/cs2800/wiki/index.php/Kolmogorov_axiom}{\text{Since Pr(S ∖ E) ≥ 0}} \end{align*} [/math]

Examples

Example: six-sided die

Suppose we wished to model an experiment where a single fair die is rolled (unless specificied otherwise, I will assume that all dice are six-sided).

We could model this experiment with a sample space [math]\href{/cs2800/wiki/index.php/S}{S} = \href{/cs2800/wiki/index.php/Enumerated_set}{\{1,2,3,4,5,6\}} [/math]. The assumption that the die is fair means that [math]\href{/cs2800/wiki/index.php/Pr}{Pr}(\{1\}) = \href{/cs2800/wiki/index.php/Pr}{Pr}(\{2\}) = \cdots = \href{/cs2800/wiki/index.php/Pr}{Pr}(\{6\}) [/math]. Using the second and third Kolmogorov axioms, we see that these probability are all [math]1/6 [/math].

Note that for a finite sample space, it suffices to give the probabilities of the simple events: the events containing only a single outcome. This is justified by the following claim:

If [math]\href{/cs2800/wiki/index.php/S}{S} [/math] is a finite sample space, and [math]P : \href{/cs2800/wiki/index.php/S}{S} → \href{/cs2800/wiki/index.php?title=%E2%84%9D&action=edit&redlink=1}{ℝ} [/math] is a function satisfying
  1. [math]P(s) ≥ 0 [/math] for all [math]s \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/S}{S} [/math]
  2. [math]\sum_{s \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/S}{S}} P(s) = 1 [/math]
then there is a unique probability measure [math]\href{/cs2800/wiki/index.php/Pr}{Pr} [/math] having [math]\href{/cs2800/wiki/index.php/Pr}{Pr}(\{s\}) = P(s) [/math] for all [math]s [/math].

The proof is left as an exercise.

Note that by choosing this sample space, we are already ruling out the possibility that the die could land on a corner or roll off the table; it is important to be aware that the choice of model can affect the conclusions drawn using it.

Equiprobable measures

The sample space for a six-sided die example had the property that every simple event had the same probability.

If this is the case, we say that the probability measure is equiprobable.

WARNING: Not every probability measure is equiprobable! Do not assume this unless you have a good reason!

With an equiprobable measure, one can compute the probability of any event [math]E [/math] by just counting the number of elements of [math]E [/math]:

An equiprobable measure [math]\href{/cs2800/wiki/index.php/Pr}{Pr} [/math] on a finite sample space [math]\href{/cs2800/wiki/index.php/S}{S} [/math] is the probability measure given by [math]\href{/cs2800/wiki/index.php/Pr}{Pr}(E) := \href{/cs2800/wiki/index.php?title=Finite_cardinality&action=edit&redlink=1}{ |E|/|S|} [/math].

Example: sum of two dice

Consider an experiment where two fair dice are rolled independently, and their values are added. We might model this experiment with the sample space [math]\href{/cs2800/wiki/index.php/S}{S} = \href{/cs2800/wiki/index.php/Enumerated_set}{\{2,3,\dots,12\}} [/math] (or even [math]\href{/cs2800/wiki/index.php/S}{S} = \href{/cs2800/wiki/index.php?title=%E2%84%9D&action=edit&redlink=1}{ℝ} [/math]), or with the sample space [math]\href{/cs2800/wiki/index.php/S}{S} = \href{/cs2800/wiki/index.php/Enumerated_set}{\{1,\dots,6\}} \href{/cs2800/wiki/index.php?title=%E2%A8%AF&action=edit&redlink=1}{⨯} \href{/cs2800/wiki/index.php/Enumerated_set}{\{1,\dots,6\}} = \href{/cs2800/wiki/index.php/Enumerated_set}{\{(1,1),(1,2),\dots,(2,1),\dots,(6,6)\}} [/math].

Either sample space would work, since both contain enough information to describe the outcomes of the experiment. However, it is difficult to describe the probability measure with the first model, while it is easy with the second (for the second, the fact that the dice are fair and independent means that the equiprobable measure is a good probability measure to describe the experiment).

There is not a "correct" sample space for a given problem, but there are some that are easier to work with than others. It is also possible to create a sample space that doesn't have enough resolution to interpret the events of interest: for example the sample space [math]\href{/cs2800/wiki/index.php/S}{S} = \{x\} [/math] wouldn't work for this experiment, since there is no way to interpret the sum of the dice.

Example: height of random person

Suppose we wished to model the following experiment: select a person from the room (uniformly), and measure their height.

One possible sample space for this experiment would be the set of possible heights (or even ). Another possible sample space would be the set of people in the room.

The former choice is more difficult to set up, because figuring out the probability of selecting a given height requires knowlege of the heights of the people in the room; choosing the latter is easier because one can use an equiprobable measure.