A Bayes estimator combines information from a prior parameter estimate P(i) and a likelihood parameter estimate P(R | i) to arrive at a posterior parameter estimate P(i | R). In the Bayes parameter estimation formula below, R stands for "results" and i stands for "parameter":
P(i | R) = P(R | i) P(i) / P(R)
In the specific case of a simple binary survey, the sample results can be expressed as the number of success events k divided by the total number of events n:
R = k/n
The Bayes parameter estimation formula for poll data looks like this:
P(i | k/n) = P(k/n | i) * P(i) / P(k/n)
Recall that the denominator term P(k/n) plays a relatively insignificant normalizing role, so you can ignore it for the purposes of understanding how to compute the posterior distribution (here ~ means "is proportional to"):
P(i | k/n) ~ P(k/n | i) * P(i)
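To see why the normalizing term can be set aside, here is a minimal Python sketch that approximates the posterior on a grid of candidate values for i. The survey counts (k = 7, n = 10), the flat prior, and the 101-point grid are all assumptions chosen for illustration; none of them come from the text above.

```python
from math import comb

# Hypothetical survey result (an assumption for illustration):
# k = 7 successes out of n = 10 responses.
k, n = 7, 10

# Candidate values for the parameter i: 0.00, 0.01, ..., 1.00.
grid = [j / 100 for j in range(101)]

def likelihood(i):
    """P(k/n | i), using the binomial formula."""
    return comb(n, k) * i**k * (1 - i)**(n - k)

# A flat prior is assumed here only to keep the sketch simple.
prior = [1.0 for _ in grid]

# Numerator of the Bayes formula: P(k/n | i) * P(i).
unnormalized = [likelihood(i) * p for i, p in zip(grid, prior)]

# On a grid, the sum of the numerator plays the role of P(k/n).
evidence = sum(unnormalized)
posterior = [u / evidence for u in unnormalized]

# Dividing by a constant rescales the curve but does not move its peak.
best_unnormalized = grid[unnormalized.index(max(unnormalized))]
best_posterior = grid[posterior.index(max(posterior))]
```

Dividing by the evidence only rescales the values so that they sum to 1. The shape of the curve, and the value of i at which it peaks, are the same before and after normalizing.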
In the last few sections, I have shown you how the likelihood term P(k/n | i) in the above formula can be computed using maximum likelihood techniques -- in particular, the binomial formula for computing the probability of the observed results for various values of i (where p is replaced by i, the generic term denoting a parameter):
P(k/n | i) = nCk * i^k * (1 - i)^(n - k)
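As a rough illustration, the binomial likelihood can be evaluated in Python for several candidate values of i. The function name binomial_likelihood, the counts k = 7 and n = 10, and the list of candidate values are hypothetical choices made for this sketch.

```python
from math import comb

def binomial_likelihood(i, k, n):
    """P(k/n | i): probability of k successes in n trials
    when each trial succeeds with probability i."""
    return comb(n, k) * i**k * (1 - i)**(n - k)

# Hypothetical survey result (an assumption for illustration).
k, n = 7, 10

# A handful of candidate parameter values to compare.
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
likelihoods = {i: binomial_likelihood(i, k, n) for i in candidates}

for i, value in likelihoods.items():
    print(f"i = {i:.1f}  P(k/n | i) = {value:.4f}")

# The candidate with the highest likelihood.
most_likely = max(likelihoods, key=likelihoods.get)
```

Among these candidates, the likelihood is largest at i = 0.7, which matches the sample proportion k/n.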
Now that you know how to compute the likelihood term in the Bayes equation, how can you compute the prior term P(i)?
The key to computing P(i) is to first recognize that i represents the probability of a success event (like a 1-coded response) and, as such, can only take on values in the 0 to 1 range. Each value of i in this range has its own probability of occurrence associated with it. The parameter i can assume an infinite number of values between 0 and 1, which means that you need to represent it with a continuous probability distribution (like the normal distribution) as opposed to a discrete probability distribution (like the binomial distribution).
In the case of a simple binary survey, the beta distribution is the appropriate continuous distribution to use to represent P(i) because:
- The domain of your probability distribution function is between 0 and 1, and
- The outcomes of your survey arise from a Bernoulli process.
A Bernoulli process:
consists of a series of independent, dichotomous trials where the possible events occurring on each trial are labeled "success" and "failure", p is the probability of success on a given trial, and p remains unchanged from trial to trial. -- Winkler and Hayes, Statistics: Probability, Inference and Decision, p 204.
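The definition above can be simulated in a few lines of Python. This is only a sketch: the function name simulate_bernoulli_process, the success probability of 0.6, the number of trials, and the seed are all assumptions made for illustration.

```python
import random

def simulate_bernoulli_process(p, n, seed=0):
    """Run n independent success/failure trials, each succeeding with
    the same probability p. Returns a list of 1s (success) and 0s (failure)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [1 if rng.random() < p else 0 for _ in range(n)]

# Hypothetical survey: 1000 respondents, each answering "1"
# with probability 0.6 (both numbers are assumptions).
responses = simulate_bernoulli_process(p=0.6, n=1000, seed=42)

k = len([r for r in responses if r == 1])  # number of success events
n = len(responses)                         # total number of events
estimate = k / n                           # sample result R = k/n
```

Because p stays fixed from trial to trial and the trials are independent, the sample result k/n tends to land near the assumed value of p when n is large.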
The process that generates the observed response distribution for a particular binary question in the survey can legitimately be viewed as a Bernoulli process as Winkler and Hayes define it. A process that can be modeled as a Bernoulli process gives rise to a beta distribution for the parameter p (estimated using k/n). I'm ready now to discuss the beta distribution and the critical role it plays in computing the posterior parameter estimate P(i | R).