FA17 Lecture 8

From CS2800 wiki

Lecture 8: Random variables

  • Reading: Cameron [[../handouts/cameron_prob_notes.pdf#page=47|3.1–3.2, 3.4]], MCS [[../handouts/mcs.pdf#page=823|19.1]]
  • [[../../2016fa/lectures/2800probability.pdf|Last semester's notes]]
  • definitions: random variable, PMF, joint PMF, sum/product/etc of RVs, indicator variable, expectation

Random variables

Definition: A (real-valued) random variable [math]X [/math] is just a function [math]X : S → \mathbb{R} [/math].

Example: Suppose I roll a fair 6-sided die. On an even roll, I win $10. On an odd roll, I lose however much money is shown. We can model the experiment (rolling a die) using the sample space [math]S = \{1,2,3,4,5,6\} [/math] and an equiprobable measure. The result of the experiment is given by the random variable [math]X : S → \mathbb{R} [/math] given by [math]X(1) ::= -1 [/math], [math]X(2) ::= 10 [/math], [math]X(3) ::= -3 [/math], [math]X(4) ::= 10 [/math], [math]X(5) ::= -5 [/math], and [math]X(6) ::= 10 [/math].
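The die experiment can be sketched directly from the definitions; this is a minimal illustration (the names `S`, `Pr`, and `X` are my own, chosen to mirror the notation above):

```python
# Model the die experiment: S = {1,...,6} with the equiprobable measure,
# and X is just a function from outcomes to real numbers.
S = [1, 2, 3, 4, 5, 6]

def Pr(event):
    """Probability of an event (a subset of S) under the equiprobable measure."""
    return len(event) / len(S)

def X(k):
    """Winnings: win $10 on an even roll, lose the face value on an odd roll."""
    return 10 if k % 2 == 0 else -k

# The event (X = 10) is {2, 4, 6}, so its probability is 1/2.
print(Pr({k for k in S if X(k) == 10}))  # → 0.5
```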

Definition: Given a random variable [math]X [/math] and a real number [math]x [/math], the poorly-named event [math](X = x) [/math] is defined by [math](X = x) ::= \{k \in S \mid X(k) = x\} [/math].

This definition is useful because it allows us to ask "what is the probability that [math]X = x [/math]?"

Definition: The probability mass function (PMF) of [math]X [/math] is the function [math]PMF_X : \mathbb{R} → \mathbb{R} [/math] given by [math]PMF_X(x) = Pr(X = x) [/math].
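A sketch of this definition for the die example above (the function name `pmf_X` is illustrative, not from the notes); `Fraction` keeps the probabilities exact:

```python
# PMF_X(x) = Pr(X = x) = |{k in S : X(k) = x}| / |S| under the equiprobable measure.
from fractions import Fraction

S = [1, 2, 3, 4, 5, 6]
X = {1: -1, 2: 10, 3: -3, 4: 10, 5: -5, 6: 10}  # the die-winnings variable

def pmf_X(x):
    """Probability mass function of X."""
    return Fraction(sum(1 for k in S if X[k] == x), len(S))

print(pmf_X(10))  # → 1/2
print(pmf_X(-3))  # → 1/6
print(pmf_X(7))   # → 0   (7 is not a value X takes)
```

Note that [math]PMF_X [/math] is defined on all of [math]\mathbb{R} [/math]; it is simply 0 at values [math]X [/math] never takes.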

Combining random variables

Given random variables [math]X [/math] and [math]Y [/math] on a sample space [math]S [/math], we can apply any of the usual operations on real numbers to [math]X [/math] and [math]Y [/math] by performing them pointwise on the outputs of [math]X [/math] and [math]Y [/math]. For example, we can define [math]X + Y : S → \mathbb{R} [/math] by [math](X+Y)(k) ::= X(k) + Y(k) [/math]. Similarly, we can define [math]X^2 : S → \mathbb{R} [/math] by [math](X^2)(k) ::= \left(X(k)\right)^2 [/math].
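The pointwise definitions translate directly into code; here is a small sketch (the helpers `rv_sum` and `rv_square` are my own names for the operations defined above):

```python
# Random variables are functions on outcomes, so combining them pointwise
# just means building a new function that combines their outputs.
def rv_sum(X, Y):
    """(X + Y)(k) ::= X(k) + Y(k)"""
    return lambda k: X(k) + Y(k)

def rv_square(X):
    """(X^2)(k) ::= (X(k))^2"""
    return lambda k: X(k) ** 2

X = lambda k: 10 if k % 2 == 0 else -k   # the die-winnings variable from above
Y = lambda k: k                          # the face value itself

Z = rv_sum(X, rv_square(Y))              # Z = X + Y^2, another RV on S
print(Z(3))  # → X(3) + Y(3)^2 = -3 + 9 = 6
```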

We can also consider a real number [math]c [/math] as a random variable by defining [math]C : S → \mathbb{R} [/math] by [math]C(k) ::= c [/math]. We will use the same variable for both the constant random variable and for the number itself; it should be clear from context which we are referring to.

Indicator variables

We often want to count how many times something happens in an experiment.

Example: Suppose I flip a coin 100 times. The sample space would consist of sequences of 100 flips, and I might define the variable [math]N [/math] to be the number of heads. For example, [math]N(H,H,H,H,\dots,H) = 100 [/math], while [math]N(H,T,H,T,\dots) = 50 [/math].

A useful tool for counting is an indicator variable:

Definition: The indicator variable for an event [math]A [/math] is the random variable having value 1 if [math]A [/math] happens, and 0 otherwise.

The number of times something happens can be written as a sum of indicator variables.

In the coin example, we could define an indicator variable [math]I_1 [/math] which is 1 if the first coin is a head, and 0 otherwise (e.g. [math]I_1(H,H,H,\dots) = I_1(H,T,H,T,\dots) = 1 [/math]). We could define a variable [math]I_2 [/math] that only looks at the second toss, and so on. Then [math]N [/math] as defined above can be written as [math]N = \sum I_i [/math]. This is useful because (as we'll see when we talk about expectation) it is often easier to reason about a sum of simple variables (like [math]I_i [/math]) than it is to reason about a complex variable like [math]N [/math].
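The coin example can be sketched as follows (a minimal illustration; `I(i)` builds the indicator for the event "flip [math]i [/math] is heads"):

```python
import random

def I(i):
    """Indicator variable for the event 'flip i is heads'.
    Outcomes are tuples of 'H'/'T' characters."""
    return lambda outcome: 1 if outcome[i] == 'H' else 0

n = 10
outcome = tuple(random.choice('HT') for _ in range(n))

# N = sum of the indicators: each head contributes 1, each tail contributes 0.
N = sum(I(i)(outcome) for i in range(n))
print(N == outcome.count('H'))  # → True: the indicator sum counts the heads
```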

Joint PMF of two random variables

We can summarize the probability distribution of two random variables [math]X [/math] and [math]Y [/math] using a "joint PMF". The joint PMF of [math]X [/math] and [math]Y [/math] is a function [math]\mathbb{R} \times \mathbb{R} → \mathbb{R} [/math] that gives, for any [math]x [/math] and [math]y [/math], the probability that [math]X = x [/math] and [math]Y = y [/math]. It is often useful to draw a table:

For example, if we flip two independent fair coins, let [math]X [/math] indicate heads on the first flip and [math]Y [/math] indicate heads on the second. Each entry below is [math]Pr(X = x \text{ and } Y = y) [/math]:

  [math]Pr [/math]    [math]X = 0 [/math]   [math]X = 1 [/math]
  [math]Y = 0 [/math]   1/4      1/4
  [math]Y = 1 [/math]   1/4      1/4
Note that the sum of the entries in the table must be one (Exercise: prove this). You can also check that summing the rows gives the PMF of [math]Y [/math], while summing the columns gives the PMF of [math]X [/math].
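Both checks can be sketched in code for a concrete case: two independent fair coin flips, with [math]X [/math] and [math]Y [/math] the indicators of heads on the first and second flip (the names `joint`, `total`, and `pmf_X1` are illustrative):

```python
# Joint PMF of two indicator variables; check that the entries sum to 1
# and that a column sum recovers Pr(X = 1).
from fractions import Fraction
from itertools import product

S = list(product('HT', repeat=2))         # 4 equally likely outcomes
X = lambda k: 1 if k[0] == 'H' else 0     # indicator: first flip is heads
Y = lambda k: 1 if k[1] == 'H' else 0     # indicator: second flip is heads

def joint(x, y):
    """Pr(X = x and Y = y) under the equiprobable measure."""
    return Fraction(sum(1 for k in S if X(k) == x and Y(k) == y), len(S))

total = sum(joint(x, y) for x in (0, 1) for y in (0, 1))
pmf_X1 = sum(joint(1, y) for y in (0, 1))   # column sum: Pr(X = 1)
print(total, pmf_X1)  # → 1 1/2
```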


The "expected value" is an estimate of the "likely outcome" of a random variable. It is the weighted average of all of the possible values of the RV, weighted by the probability of seeing those outcomes. Formally:

Definition: The expected value of [math]X [/math], written [math]E(X) [/math], is given by [math]E(X) ::= \sum_{k \in S} X(k)Pr(\{k\}) [/math].

Claim: (alternate definition of [math]E(X) [/math]) [math]E(X) = \sum_{x \in \mathbb{R}} x\cdot Pr(X=x) [/math]

Proof sketch: this is just grouping together the terms in the original definition for the outcomes with the same [math]X [/math] value.
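For the die example, the two formulas can be checked side by side (a sketch with exact arithmetic; the names `E1` and `E2` are my own):

```python
# E(X) for the die-winnings variable, computed from both definitions.
from fractions import Fraction

S = [1, 2, 3, 4, 5, 6]
X = {1: -1, 2: 10, 3: -3, 4: 10, 5: -5, 6: 10}
Pr_k = Fraction(1, 6)                    # equiprobable measure on S

# Original definition: sum over outcomes k in S.
E1 = sum(X[k] * Pr_k for k in S)

# Alternate definition: sum over values x, weighted by Pr(X = x).
values = set(X.values())
E2 = sum(x * Fraction(sum(1 for k in S if X[k] == x), len(S)) for x in values)

print(E1, E2)  # → 7/2 7/2
```

So this game has a positive expected value of $3.50 per roll.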

Note: You may be concerned about "[math]\sum_{x \in \mathbb{R}} [/math]". In discrete examples, [math]Pr(X = x) = 0 [/math] almost everywhere, so this sum reduces to a finite or at least countable sum. In non-discrete examples, this summation can be replaced by an integral. Measure theory is a branch of mathematics that puts this distinction on firmer theoretical footing by replacing both the summation and the integral with the so-called "Lebesgue integral". In this course, we will simply use "[math]\sum [/math]" with the understanding that it becomes an integral when the random variable is continuous.