# Probability Theory

## Objectives

This assignment is designed to:

• give you practice thinking about probabilistic models in the form of Bayes Nets (directed graphical models)
• help you become more fluent with the terminology and the techniques of the course
• prepare you to work with interesting graphical models of natural phenomena!

## Exercises

Show your work. Be clear and concise. This assignment must be typed.

1. [10 points] Prove that the relationship we call conditional independence is symmetric. In other words, prove either (option a) or (option b), which are equivalent, and apply the same standard of proof as in assignment 1:
• (option a) $P(X | Y, Z) = P(X | Z)$ if and only if $P(Y | X, Z) = P(Y | Z)$
• (option b) $P(X, Y | Z) = P(X | Z) \cdot P(Y|Z)$ if and only if $P(Y, X | Z) = P(Y | Z) \cdot P(X | Z)$
• (in other words, the “given $Z$” stays the same, while $X$ and $Y$ trade places).
2. [20 points: 10 points each] (based on exercise 2.2 in Koller and Friedman) Independence:
• (2.1) Prove that for binary random variables $X$ and $Y$, the event-level independence $(x^0 \bot y^0)$ implies random-variable independence $(X \bot Y)$. Use the usual standard of proof.
• (2.2) Give a counterexample for nonbinary variables.
3. [20 points] Consider how to sample from a categorical distribution over four colors. Think of a spinner with four regions having probabilities $p_{red}$, $p_{green}$, $p_{yellow}$, and $p_{blue}$. Write pseudo-code for choosing a sample from this distribution.
• You may assume that you have access to a function that returns a sample from the uniform distribution on $[0, 1]$.
4. [10 points] Does your pseudo-code scale to run efficiently on a distribution over ten thousand values? If not, rewrite it. If so, say why.
5. [20 points] Implement your pseudo-code, choose values for the four probabilities on the spinner as parameters to your procedure, and run it 100 times, keeping track of how many times each color shows up. Give the results as a vector of counts over the four colors.
6. [10 points] Normalize your count vector by dividing each count by 100. How does the result compare with your chosen parameters?
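
The sampling exercises above (3 through 6) can be approached in several ways. The following is a minimal sketch of one of them, inverse-CDF sampling, and is not meant as the expected or only acceptable answer. It assumes Python's `random.random` stands in for the uniform sampler. The names `make_sampler` and `draw` and the parameter values 0.1, 0.2, 0.3, 0.4 are hypothetical and chosen only for illustration.

```python
import bisect
import itertools
import random
from collections import Counter


def make_sampler(values, probs, uniform=random.random):
    """Return a function that draws one value per call via inverse-CDF sampling.

    `uniform` is assumed to return a float in [0, 1); it is a parameter so a
    different uniform source can be substituted.
    """
    # Cumulative sums are computed once, so each later draw is a binary search.
    cumulative = list(itertools.accumulate(probs))
    total = cumulative[-1]

    def draw():
        # Rescale by the total in case the probabilities do not sum to exactly 1.
        u = uniform() * total
        # Index of the first cumulative value strictly greater than u.
        index = bisect.bisect_right(cumulative, u)
        # Clamp in case floating-point rounding pushes u up to the total.
        return values[min(index, len(values) - 1)]

    return draw


colors = ["red", "green", "yellow", "blue"]
params = [0.1, 0.2, 0.3, 0.4]  # illustrative values, not prescribed by the assignment

random.seed(0)  # fixed seed so repeated runs give the same counts
draw = make_sampler(colors, params)
counts = Counter(draw() for _ in range(100))

count_vector = [counts[color] for color in colors]
frequencies = [n / 100 for n in count_vector]
print(count_vector)
print(frequencies)
```

Because the cumulative sums are built once and each draw uses binary search, a draw costs on the order of $\log n$ comparisons for $n$ values, which is one way to address the ten-thousand-value case. With only 100 draws, the normalized frequencies should be close to the chosen parameters but will generally not match them exactly.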

## Report 