Implement Bayesian inference using PHP: Part 2

By Paul Meagher - 2004-05-19

Bayes estimators

A Bayes estimator combines information from a prior parameter estimate P(θ_i) and a likelihood parameter estimate P(R | θ_i) to arrive at a posterior parameter estimate P(θ_i | R). In the Bayes parameter estimation formula below, R stands for "results" and θ stands for "parameter":

P(θ_i | R) = P(R | θ_i) * P(θ_i) / P(R)

In the specific case of a simple binary survey, the sample results can be expressed as the number of success events k divided by the total number of events n:

R = k/n

The Bayes parameter estimation formula for poll data looks like this:

P(θ_i | k/n) = P(k/n | θ_i) * P(θ_i) / P(k/n)

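Recall that the denominator term P(k/n) plays only a normalizing role: it rescales the result so that the posterior probabilities sum to 1 but does not change their relative sizes. You can therefore ignore it for the purposes of understanding how to compute the posterior distribution, which is proportional to the product of the likelihood and the prior: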

P(θ_i | k/n) ∝ P(k/n | θ_i) * P(θ_i)
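
To make the normalizing role concrete, here is a minimal PHP sketch that multiplies likelihood and prior values over a small grid of candidate parameter values and then rescales the products so they sum to 1. The function name normalizePosterior, the three-point grid, and all of the numbers are illustrative assumptions rather than code or data from this series; the grid is a discrete stand-in for what is really a continuous parameter.

```php
<?php
// Hypothetical helper (not from the article): turn likelihood * prior
// products into posterior probabilities that sum to 1 over a discrete
// grid of candidate parameter values.
function normalizePosterior(array $likelihoods, array $priors): array
{
    $products = [];
    foreach ($likelihoods as $i => $likelihood) {
        // Numerator of the Bayes formula for candidate i.
        $products[$i] = $likelihood * $priors[$i];
    }

    // The sum of the numerators plays the role of the denominator P(k/n).
    $evidence = array_sum($products);
    if ($evidence <= 0) {
        throw new InvalidArgumentException('Likelihood * prior must have positive total mass.');
    }

    // Dividing by the same constant keeps the relative sizes unchanged.
    return array_map(fn($p) => $p / $evidence, $products);
}

// Made-up illustrative values for three candidate parameter values.
$likelihoods = [0.10, 0.30, 0.20];
$priors      = [0.25, 0.50, 0.25];

$posterior = normalizePosterior($likelihoods, $priors);
print_r($posterior);
```

Because every product is divided by the same constant, the ranking of the candidate values is the same before and after normalization, which is why the denominator can be set aside when reasoning about the shape of the posterior.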

In the last few sections, I have shown you how the likelihood term P(k/n | θ_i) in the above formula can be computed using maximum likelihood techniques -- in particular, the binomial formula for computing the probability of the observed results under various values of θ_i (where p is replaced by θ, the generic symbol denoting a parameter):

P(k/n | θ_i) = nCk * θ_i^k * (1 - θ_i)^(n - k)
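
As a rough illustration of the binomial formula, the following PHP sketch evaluates the likelihood of a fixed survey result at several candidate parameter values. The function names binomialCoefficient and binomialLikelihood, the candidate values, and the sample result (7 successes in 10 trials) are assumptions chosen for the example, not code or data from this series.

```php
<?php
// Hypothetical helper: number of ways to choose k successes from n trials.
function binomialCoefficient(int $n, int $k): float
{
    if ($k < 0 || $k > $n) {
        return 0.0;
    }
    $k = min($k, $n - $k); // use symmetry to shorten the loop
    $result = 1.0;
    for ($i = 1; $i <= $k; $i++) {
        $result *= ($n - $k + $i) / $i;
    }
    return $result;
}

// Binomial likelihood of k successes in n trials for a candidate theta.
function binomialLikelihood(int $k, int $n, float $theta): float
{
    return binomialCoefficient($n, $k)
        * pow($theta, $k)
        * pow(1 - $theta, $n - $k);
}

// Illustrative data only: 7 successes in 10 trials.
$k = 7;
$n = 10;

foreach ([0.1, 0.3, 0.5, 0.7, 0.9] as $theta) {
    printf("theta = %.1f  likelihood = %.4f\n", $theta, binomialLikelihood($k, $n, $theta));
}
```

Among the candidate values listed, the likelihood is largest at 0.7, which matches the sample proportion k/n.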

Now that you know how to compute the likelihood term in the Bayes equation, how can you compute the prior term P(θ_i)?

The key to computing P(θ_i) is to first recognize that θ_i represents the probability of a success event (like a 1-coded response) and, as such, can only take on values in the 0 to 1 range. Each value of θ_i in this range will have a different probability of occurrence associated with it. The parameter θ_i can assume an infinite number of values between 0 and 1, which means that you need to represent it with a continuous probability distribution (like the normal distribution) as opposed to a discrete probability distribution (like the binomial distribution).

In the case of a simple binary survey, the beta distribution is the appropriate continuous distribution to use to represent P(θ_i) because:

The domain of your probability distribution function is between 0 and 1, and
The outcomes of your survey arise from a Bernoulli process.

A Bernoulli process:

consists of a series of independent, dichotomous trials where the possible events occurring on each trial are labeled "success" and "failure", p is the probability of success on a given trial, and p remains unchanged from trial to trial. -- Winkler and Hayes, Statistics: Probability, Inference and Decision, p 204.
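
The definition above can be simulated in a few lines of PHP. This is only a sketch under stated assumptions: the function name simulateBernoulliProcess, the seed, and the "true" success probability of 0.3 are invented for illustration and do not come from this series.

```php
<?php
// Hypothetical helper: simulate n independent dichotomous trials with a
// success probability p that stays fixed from trial to trial, and return
// the number of successes k.
function simulateBernoulliProcess(int $n, float $p): int
{
    $k = 0;
    for ($trial = 0; $trial < $n; $trial++) {
        // Uniform draw in [0, 1); the trial is a success when it falls below p.
        $u = mt_rand() / (mt_getrandmax() + 1);
        if ($u < $p) {
            $k++;
        }
    }
    return $k;
}

mt_srand(42); // fixed seed so repeated runs use the same random sequence

$n = 1000;
$p = 0.3; // assumed success probability, for illustration only
$k = simulateBernoulliProcess($n, $p);

printf("k = %d, n = %d, k/n = %.3f\n", $k, $n, $k / $n);
```

With a large number of trials, the sample proportion k/n printed by the sketch should land near the assumed p, which is the sense in which k/n serves as an estimate of the parameter.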

The process that generates the observed response distribution for a particular binary question in the survey can legitimately be viewed as a Bernoulli process as Winkler and Hayes define it. A process that can be modeled as a Bernoulli process gives rise to a beta distribution for the parameter p (estimated using k/n). I'm now ready to discuss the beta distribution and the critical role it plays in computing the posterior parameter estimate P(θ_i | R).


First published by IBM developerWorks