# Lecture 9: expectation

• independent RVs
• expectation, linearity of expectation, variance
• review exercises:
  • prove any of the claims in these notes
  • constants are independent of everything
  • no non-constant random variable is independent from itself
  • variance of the sum of independent random variables is the sum of the variances

## Equivalent definitions of expectation

Last lecture we gave two definitions of expectation.

Definition 1: $E(X) = \sum_{s \in S} X(s)Pr(\{s\})$

Definition 2: $E(X) = \sum_{x \in \mathbb{R}} xPr(X = x)$

Claim: these two definitions are equivalent

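Proof: We can group together terms in the first sum having the same value of $X(s)$:

$$E(X) = \sum_{s \in S} X(s)Pr(\{s\}) = \sum_{x} \sum_{s \text{ with } X(s) = x} xPr(\{s\}) = \sum_{x} x \sum_{s \text{ with } X(s) = x} Pr(\{s\})$$

We then apply the third Kolmogorov axiom, using the fact that the events $\{s\}$ with $X(s) = x$ partition the event $(X = x)$:

$$\sum_{x} x \sum_{s \text{ with } X(s) = x} Pr(\{s\}) = \sum_{x} xPr(X = x)$$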
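The grouping argument can be checked numerically on a small example. The sketch below assumes a fair six-sided die as the sample space and a made-up random variable `X(s) = s % 3`; the names `expectation_by_outcome` and `expectation_by_value` are illustrative only. It computes the expectation once by summing over outcomes and once by summing over values.

```python
from collections import defaultdict
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
S = [1, 2, 3, 4, 5, 6]
pr = {s: Fraction(1, 6) for s in S}  # Pr({s}) for each outcome s


def X(s):
    """A random variable taking each of the values 0, 1, 2 on two outcomes."""
    return s % 3


def expectation_by_outcome(rv):
    """Sum rv(s) * Pr({s}) over all outcomes s."""
    return sum(rv(s) * pr[s] for s in S)


def expectation_by_value(rv):
    """Sum x * Pr(rv = x) over all values x that rv takes."""
    # Pr(rv = x) is the total probability of the outcomes s with rv(s) = x.
    pr_of_value = defaultdict(Fraction)
    for s in S:
        pr_of_value[rv(s)] += pr[s]
    return sum(x * p for x, p in pr_of_value.items())


print(expectation_by_outcome(X), expectation_by_value(X))
```

Exact fractions are used instead of floats so that the two results can be compared with `==`.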

## Aside: the expectation function; function spaces

Note that $E$ by itself is a function; it takes in random variables and gives back numbers. So the domain of $E$ is the set of all functions with domain $S$ and codomain $\mathbb{R}$.

Notation: In general, the set of functions with domain $A$ and codomain $B$ is written $[A \to B]$ (another common notation is $B^A$).

Therefore, $E : [S \to \mathbb{R}] \to \mathbb{R}$.
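Programming languages with first-class functions make this point concrete: expectation is a function whose argument is itself a function. The following is a minimal sketch, assuming a single fair coin flip as the sample space; the type aliases `Outcome` and `RandomVariable` and the indicator `heads` are names introduced here for illustration.

```python
from fractions import Fraction
from typing import Callable

Outcome = str                                   # an element of S
RandomVariable = Callable[[Outcome], Fraction]  # a function from S to numbers

# Assumed example: a single fair coin flip.
S = ["H", "T"]
pr = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def E(X: RandomVariable) -> Fraction:
    """Expectation: takes a random variable (a function) and returns a number."""
    return sum((X(s) * pr[s] for s in S), Fraction(0))


def heads(s: Outcome) -> Fraction:
    """Indicator variable for the flip being heads."""
    return Fraction(1) if s == "H" else Fraction(0)


print(E(heads))
```

The type annotation on `E` mirrors the statement that the domain of the expectation function is a set of functions and its codomain is a set of numbers.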

## Linearity of expectation

Claim: If $X$, $Y$ are RVs, then $E(X + Y) = E(X) + E(Y)$.

Proof: We compute:

$$\begin{aligned}
E(X + Y) &= \sum_{s} (X+Y)(s)Pr(\{s\}) && \text{by definition of } E \\
&= \sum_{s} (X(s) + Y(s))Pr(\{s\}) && \text{by definition of } X + Y \\
&= \sum_{s} X(s)Pr(\{s\}) + \sum_{s} Y(s)Pr(\{s\}) && \text{algebra} \\
&= E(X) + E(Y) && \text{by definition of } E
\end{aligned}$$
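Notice that this proof never assumes that the two random variables are independent. As a sanity check, the sketch below assumes a fair six-sided die and two deliberately dependent random variables (the second is the square of the first); these choices are illustrative and not part of the lecture.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
S = [1, 2, 3, 4, 5, 6]
pr = {s: Fraction(1, 6) for s in S}


def E(rv):
    """Expectation as a sum of rv(s) * Pr({s}) over outcomes."""
    return sum(rv(s) * pr[s] for s in S)


def X(s):
    return s


def Y(s):
    # Y is completely determined by X, so X and Y are not independent.
    return s * s


def X_plus_Y(s):
    # The sum of two random variables is defined pointwise.
    return X(s) + Y(s)


print(E(X_plus_Y), E(X) + E(Y))
```

Both printed values agree even though `Y` depends on `X`.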

Fact: If $C$ is a constant RV with value $c$ (that is, $C(s) = c$ for all $s \in S$) then $E(C) = c$.

Fact: If $C$ is a constant RV with value $c$, then $E(CX) = cE(X)$.

Proofs: left as review exercises.

Note: We usually don't make the distinction between the number $c$ and the random variable $C$; so the above are often written $E(c) = c$ and $E(cX) = cE(X)$.

Note: The facts that $E(X + Y) = E(X) + E(Y)$ and $E(cX) = cE(X)$ are summarized by saying that "expectation is linear".

## Independent random variables; expectation of the product.

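It is not generally the case that $E(XY) = E(X)E(Y)$. For example, imagine a single fair coin flip, and let $X$ be the indicator variable for the flip being heads. That is, $S = \{H, T\}$, $X(H) = 1$, and $X(T) = 0$.

We see $E(X) = 1/2$. Moreover, $E(X \cdot X) = 1/2$, because $(X \cdot X)(H) = 1 \cdot 1 = 1$ and $(X \cdot X)(T) = 0 \cdot 0 = 0$.

Thus $E(X \cdot X) = 1/2$ but $E(X)E(X) = 1/4$.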
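The coin-flip counterexample is small enough to compute directly. This is a minimal sketch; the outcome labels `"H"` and `"T"` and the helper names are assumptions made for the example.

```python
from fractions import Fraction

# A single fair coin flip.
S = ["H", "T"]
pr = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def E(rv):
    """Expectation as a sum of rv(s) * Pr({s}) over outcomes."""
    return sum(rv(s) * pr[s] for s in S)


def X(s):
    """Indicator variable for the flip being heads."""
    return 1 if s == "H" else 0


def X_times_X(s):
    # The product of random variables is defined pointwise.
    return X(s) * X(s)


print(E(X_times_X))   # expectation of the product
print(E(X) * E(X))    # product of the expectations
```

The two printed values differ, which is consistent with the fact that a non-constant random variable is not independent of itself.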

However, we have the following:

Definition: Two random variables $X$ and $Y$ are independent if the events $(X = x)$ and $(Y = y)$ are independent for all $x$ and $y$.

Claim: If $X$ and $Y$ are independent, then $E(XY) = E(X)E(Y)$.

Proof: Well,

$$\begin{aligned}
E(X)E(Y) &= \left(\sum_{x} xPr(X = x)\right)\left(\sum_{y} yPr(Y = y)\right) \\
&= \sum_{x,y} xyPr(X=x)Pr(Y=y) \\
&= \sum_{x,y} xyPr(X=x \cap Y=y) && \text{since } X \text{ and } Y \text{ are independent} \\
&= \sum_{z} \sum_{x,y \text{ with } xy=z} xyPr(X = x \cap Y = y) && \text{grouping terms} \\
&= \sum_{z} z\sum_{x,y \text{ with } xy=z} Pr(X = x \cap Y = y)
\end{aligned}$$

Now, the union of the events $(X = x \cap Y = y)$ over all $x$ and $y$ with $xy = z$ is just the event $(XY = z)$. Moreover, these events are disjoint, so by the third Kolmogorov axiom we have

$$\sum_{x,y \text{ with } xy=z} Pr(X = x \cap Y = y) = Pr(XY = z)$$

Plugging this in gives $E(X)E(Y) = \sum_{z} zPr(XY = z) = E(XY)$ by definition.
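To see the claim in action, the sketch below assumes two rolls of a fair die, with one random variable reading the first roll and the other reading the second. It first checks the definition of independence for every pair of values the variables can take (for any other values both sides of the condition are 0), then compares the expectation of the product with the product of the expectations. The helper `prob` is a name introduced here.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two rolls of a fair six-sided die, all 36 outcomes equally likely.
S = list(product(range(1, 7), repeat=2))
pr = {s: Fraction(1, 36) for s in S}


def E(rv):
    """Expectation as a sum of rv(s) * Pr({s}) over outcomes."""
    return sum(rv(s) * pr[s] for s in S)


def prob(event):
    """Probability of the event {s : event(s) is true}."""
    return sum(pr[s] for s in S if event(s))


def X(s):
    return s[0]  # value of the first roll


def Y(s):
    return s[1]  # value of the second roll


def XY(s):
    return X(s) * Y(s)  # pointwise product


# Check Pr(X = x and Y = y) == Pr(X = x) * Pr(Y = y) for every pair of values.
independent = all(
    prob(lambda s: X(s) == x and Y(s) == y)
    == prob(lambda s: X(s) == x) * prob(lambda s: Y(s) == y)
    for x in range(1, 7)
    for y in range(1, 7)
)

print(independent, E(XY), E(X) * E(Y))
```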

## Variance

Variance is a measure of how spread out a distribution is. You might ask "how far are the samples from the mean, on average?". This suggests finding the expectation of the random variable $X - E(X)$ (this is the RV describing the distance from the expected value). Unfortunately, $E(X - E(X)) = 0$ (exercise), because $X - E(X)$ can be positive or negative. We could imagine taking the absolute value, but it turns out to have nicer properties if we square it instead. This gives the definition of variance:

Definition: For a random variable $X$, $Var(X) = E((X - E(X))^2)$.

If $X$ is measured in a unit (such as inches) then the variance $Var(X)$ is measured in units squared (e.g. inches squared). Thus, it is often more useful to work with the square root of the variance, which is called the standard deviation:

Definition: the standard deviation of $X$ is just $\sqrt{Var(X)}$.

The following formula for the variance is often easier to compute in practice:

Claim: $Var(X) = E(X^2) - E(X)^2$.

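Proof: Note that random variables satisfy the normal rules of arithmetic. For example, $(X + Y)^2 = X^2 + 2XY + Y^2$. This is because they are evaluated pointwise. For example, we can show $(X + Y)^2 = X^2 + 2XY + Y^2$ as follows: for every $s \in S$,

$$((X + Y)^2)(s) = (X(s) + Y(s))^2 = X(s)^2 + 2X(s)Y(s) + Y(s)^2 = (X^2 + 2XY + Y^2)(s)$$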

Using this, the proof of the claim is just algebra:

$$\begin{aligned}
Var(X) &= E((X - E(X))^2) \\
&= E(X^2 - 2XE(X) + E(X)^2) \\
&= E(X^2) - 2E(XE(X)) + E(E(X)^2) && \text{by linearity of expectation} \\
&= E(X^2) - 2E(X)^2 + E(X)^2 && \text{because } E(X) \text{ and } E(X)^2 \text{ are constants} \\
&= E(X^2) - E(X)^2
\end{aligned}$$
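Both ways of computing the variance can be compared on a concrete distribution. The sketch below assumes a single roll of a fair six-sided die; the variable names are illustrative. It computes the variance from the definition and from the shortcut formula, and then takes the square root to get the standard deviation.

```python
from fractions import Fraction
from math import sqrt

# Assumed example: one roll of a fair six-sided die.
S = [1, 2, 3, 4, 5, 6]
pr = {s: Fraction(1, 6) for s in S}


def E(rv):
    """Expectation as a sum of rv(s) * Pr({s}) over outcomes."""
    return sum(rv(s) * pr[s] for s in S)


def X(s):
    return s


mu = E(X)  # the expected value E(X), a constant

# Definition: Var(X) = E((X - E(X))^2)
var_by_definition = E(lambda s: (X(s) - mu) ** 2)

# Shortcut: Var(X) = E(X^2) - E(X)^2
var_by_formula = E(lambda s: X(s) ** 2) - mu ** 2

# Standard deviation: square root of the variance (a float, since it is irrational here).
std_dev = sqrt(var_by_formula)

print(var_by_definition, var_by_formula, std_dev)
```

The shortcut needs only the two expectations $E(X)$ and $E(X^2)$, which is why it is often the more convenient form in practice.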