# The Beta and Dirichlet Distributions

## Objectives

This assignment is designed to:

• provide practice with essential ideas for building discrete Bayesian models
• help you become more fluent with the terminology and the techniques

## Instructions

This is a mathematical homework assignment. Show your work. Be clear and concise. Type your assignment.

Work through these exercises as soon as possible for their instructional content. If you have general questions about the assignment, please post on the Google Group. Finish early, and earn the early bonus.

## Exercises

### Question 1:

[20 points]

Use your favorite tool (e.g., Wolfram Alpha, ggplot2, matplotlib, …) to plot the Beta distributions Beta(1,1), Beta(0.1,0.1), Beta(2,2), Beta(4,2), Beta(2,4), Beta(0.1,2), Beta(2,0.1). Include the plots in your report. Describe the patterns you see.
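As a starting point, the Beta density can be evaluated directly from its definition before handing the values to a plotting tool. The sketch below is one possible approach, not a required method; the helper name `beta_pdf` and the 99-point grid are illustrative assumptions.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1), computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

# Parameter pairs listed in Question 1.
PARAMS = [(1, 1), (0.1, 0.1), (2, 2), (4, 2), (2, 4), (0.1, 2), (2, 0.1)]

# Open-interval grid: densities with a < 1 or b < 1 diverge at the endpoints,
# so x = 0 and x = 1 are deliberately excluded.
GRID = [i / 100 for i in range(1, 100)]

# One list of density values per parameter pair, aligned with GRID.
curves = {(a, b): [beta_pdf(x, a, b) for x in GRID] for a, b in PARAMS}

for (a, b), ys in curves.items():
    # GRID[49] is x = 0.5, the midpoint of the interval.
    print(f"Beta({a},{b}): density at x=0.5 is {ys[49]:.4f}")
```

Each entry of `curves` can then be drawn against `GRID` with whatever plotting tool you prefer, for example `matplotlib.pyplot.plot(GRID, ys)`.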

### Question 2:

[25 points]

(a) Imagine you want to model a coin, but you haven't seen any data yet. You are told that the coin is unfair, but you're not told how. What would be a reasonable prior density to describe your uncertainty about the coin? Choose from the options listed in question #1, and justify your choice.

(b) Now assume that you observe 100 samples from the set {Heads, Tails}. The counts of the samples are (33, 67), respectively. We wish to model the source of this sequence, so we choose a binomial distribution. With this data and your chosen prior from part (a), estimate the parameters of the Bernoulli distribution for the coin in the following four ways. In each case, represent the answer symbolically and then compute the numerical answer. Mark each response clearly.
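For reference, a Beta prior is conjugate to Bernoulli observations, so the posterior is again a Beta distribution whose parameters are the prior parameters plus the observed counts. The sketch below illustrates only that update. The prior Beta(3, 3) is a hypothetical placeholder that is deliberately not one of the Question 1 options, and the function name `beta_posterior` is an assumption made for illustration.

```python
from fractions import Fraction

def beta_posterior(a, b, heads, tails):
    """Conjugate update: Beta(a, b) prior plus Bernoulli counts gives Beta(a + heads, b + tails)."""
    return a + heads, b + tails

# Hypothetical placeholder prior, NOT one of the Question 1 options.
prior_a, prior_b = 3, 3

# Counts from part (b): (Heads, Tails) = (33, 67).
heads, tails = 33, 67

post_a, post_b = beta_posterior(prior_a, prior_b, heads, tails)

# Mean of a Beta(a, b) distribution is a / (a + b); kept exact with Fraction.
posterior_mean = Fraction(post_a, post_a + post_b)

print(f"posterior: Beta({post_a}, {post_b})")
print(f"posterior mean of P(Heads): {posterior_mean} = {float(posterior_mean):.4f}")
```

Substitute your own prior from part (a) for the placeholder values; the symbolic form of the update does not change.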

### Question 3:

[20 points]

Use your favorite tool (e.g., ggplot2, matplotlib, …) to plot the Dirichlet distributions Dirichlet(1,1,1), Dirichlet(0.1,0.1,0.1), Dirichlet(2,2,2), Dirichlet(4,2,2), Dirichlet(0.1,0.1,2). We provide an example here of a couple of ways to plot a 3D Dirichlet. Include the plots in your report. Describe the patterns you see.
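Whichever plotting style you choose, the quantity being plotted is the Dirichlet density over the 2-simplex. The sketch below evaluates that density at individual points using only the standard library; the helper name `dirichlet_pdf` and the two probe points are illustrative assumptions, and the code does not draw the plot itself.

```python
import math

def dirichlet_pdf(theta, alpha):
    """Density of Dirichlet(alpha) at a point theta in the interior of the simplex."""
    if abs(sum(theta) - 1.0) > 1e-9 or min(theta) <= 0:
        raise ValueError("theta must be strictly positive and sum to 1")
    # Log of the normalizing constant: Gamma(sum(alpha)) / prod(Gamma(alpha_i)).
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    # Log of the kernel: prod(theta_i ** (alpha_i - 1)).
    log_kernel = sum((a - 1) * math.log(t) for a, t in zip(alpha, theta))
    return math.exp(log_norm + log_kernel)

# Parameter vectors listed in Question 3.
PARAMS = [(1, 1, 1), (0.1, 0.1, 0.1), (2, 2, 2), (4, 2, 2), (0.1, 0.1, 2)]

# Two probe points: the center of the simplex and a point near the first corner.
center = (1 / 3, 1 / 3, 1 / 3)
near_corner = (0.98, 0.01, 0.01)

for alpha in PARAMS:
    print(
        f"Dirichlet{alpha}: center={dirichlet_pdf(center, alpha):.4f}, "
        f"near corner={dirichlet_pdf(near_corner, alpha):.4f}"
    )
```

Evaluating `dirichlet_pdf` over a grid of simplex points gives the values needed for a heat map or surface plot.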

### Question 4:

[25 points]

(a) Imagine you want to model a spinner with three color fields, but you haven't seen any data yet. You are told that the spinner is imbalanced, but you're not told how. What would be a reasonable prior density to describe your uncertainty about the spinner? Choose from the options listed in question #3, and justify your choice.

(b) Now assume that you observe 100 samples from the set {Red, Green, Blue}. The counts of the samples are (40, 25, 35), respectively. We wish to model the source of this sequence, so we choose a multinomial distribution. With this data and your chosen Dirichlet prior from part (a), estimate the parameters of the categorical distribution for the spinner in the following four ways. In each case, represent the answer symbolically and then compute the numerical answer. Mark each response clearly.
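The Dirichlet prior is conjugate to categorical observations: the posterior is a Dirichlet whose parameter vector is the prior vector plus the count vector. The sketch below shows only that vector update. The prior Dirichlet(3, 3, 3) is a hypothetical placeholder that is deliberately not one of the listed options, and `dirichlet_posterior` is an illustrative name.

```python
from fractions import Fraction

def dirichlet_posterior(alpha, counts):
    """Conjugate update: add each observed count to the matching prior parameter."""
    if len(alpha) != len(counts):
        raise ValueError("alpha and counts must have the same length")
    return [a + n for a, n in zip(alpha, counts)]

# Hypothetical placeholder prior, NOT one of the listed Dirichlet options.
prior_alpha = [3, 3, 3]

# Counts from part (b): (Red, Green, Blue) = (40, 25, 35).
counts = [40, 25, 35]

post_alpha = dirichlet_posterior(prior_alpha, counts)

# Mean of Dirichlet(alpha) is alpha_i / sum(alpha); kept exact with Fraction.
total = sum(post_alpha)
posterior_mean = [Fraction(a, total) for a in post_alpha]

print(f"posterior: Dirichlet{tuple(post_alpha)}")
for color, mean in zip(["Red", "Green", "Blue"], posterior_mean):
    print(f"posterior mean of P({color}): {mean} = {float(mean):.4f}")
```

Replace `prior_alpha` with your own choice from part (a) before computing your answers.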

### Question 5:

[10 points]

Explain how your answers to question #4, part (b) are relevant to your earlier work on classifying with the Naive Bayes model.

## Submission

Submit a .pdf document through Learning Suite. 