Implement Bayesian inference using PHP: Part 2

By Paul Meagher - 2004-05-19

Algebraic cleverness

When estimating p using maximum likelihood analysis, you use the binomial formula to compute the likelihood of p:

P( k/n | p_i) = _nC_k * (p_i)^k * (1 - p_i)^{(n - k)}

The computation involved keeping the k and n parameters fixed while varying p_i and then seeing which value of p_i maximized the likelihood of the results k/n. If you examine the above equation, you should note that the value of _nC_k will remain constant as you vary p_i. This implies that you can drop this term from the equation without affecting the shape of the likelihood distribution or the MLE. To confirm this, you can modify the likelihood graphing code by replacing this line:

$likelihoods[$i] = binomial($n, $k, $p);

with this line:

$likelihoods[$i] = pow($p, $k) * pow(1-$p, $n-$k);

This is simply the binomial formula without the combinations term. When you do this, you get the following graph:

Figure 2. The likelihood distribution graph (reduced formula)

Note that the likelihood values are smaller than before, but 0.20 is still the MLE of p. Because the likelihood distribution is not a probability distribution, the reduced values are immaterial: only the shape and the location of the maximum matter. From now on, you can use this reduced formula to compute the MLE of p:

P( k/n | p_i) = (p_i)^k * (1 - p_i)^{(n - k)}
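To check this claim numerically rather than by eye, here is a minimal sketch that evaluates both the full and the reduced formula over a grid of p values and reports where each peaks. The values of $n and $k are assumptions chosen so that k/n = 0.20, and the helpers combinations() and argmax_p() are hypothetical names introduced for this illustration; they are not the graphing code used in this article.

```php
<?php
// Hypothetical helper (not from the article): number of combinations nCk.
function combinations(int $n, int $k): float
{
    $result = 1.0;
    for ($i = 1; $i <= $k; $i++) {
        $result *= ($n - $k + $i) / $i;
    }
    return $result;
}

// Hypothetical helper: returns the grid value of p with the largest likelihood.
function argmax_p(array $grid, callable $likelihood): float
{
    $best_p = $grid[0];
    $best   = $likelihood($best_p);
    foreach ($grid as $p) {
        $value = $likelihood($p);
        if ($value > $best) {
            $best   = $value;
            $best_p = $p;
        }
    }
    return $best_p;
}

// Assumed example data: k = 4 successes in n = 20 trials (k/n = 0.20).
$n = 20;
$k = 4;

// Grid of candidate p values: 0.00, 0.05, ..., 1.00.
$grid = [];
for ($i = 0; $i <= 20; $i++) {
    $grid[] = $i / 20.0;
}

// Full binomial formula, including the constant combinations term.
$full = function (float $p) use ($n, $k): float {
    return combinations($n, $k) * pow($p, $k) * pow(1 - $p, $n - $k);
};

// Reduced formula with the combinations term dropped.
$reduced = function (float $p) use ($n, $k): float {
    return pow($p, $k) * pow(1 - $p, $n - $k);
};

$mle_full    = argmax_p($grid, $full);
$mle_reduced = argmax_p($grid, $reduced);

printf("MLE (full): %.2f, MLE (reduced): %.2f\n", $mle_full, $mle_reduced);
```

Under these assumptions both formulas should peak at the same grid point, because multiplying every likelihood by the same positive constant cannot change which value of p is largest.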

Another bit of algebraic cleverness involves eliminating the exponents and multiplications by taking the logarithm of both sides of the formula. When you do so, you obtain the log likelihood formula (commonly denoted with a capital L):

L( k/n | p_i) = log( P(k/n | p_i) )
L( k/n | p_i) = log( (p_i)^k * (1 - p_i)^{(n - k)} )
L( k/n | p_i) = k * log(p_i) + (n - k) * log(1 - p_i)

Note that taking the log of a formula with exponents in it changes the exponents into multipliers. Also, terms that were multiplied are now added. To find the MLE, you look for the value of p at which the derivative of the likelihood equals 0, and this is often easier to do with the log version of the formula.
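As a worked illustration of that last point, the following derivation sketches how the log likelihood formula leads to the MLE. It assumes the natural logarithm and 0 < p_i < 1; a different log base only rescales the derivative by a constant and leaves the solution unchanged.

```latex
\begin{aligned}
L(k/n \mid p_i) &= k \log(p_i) + (n - k)\log(1 - p_i) \\
\frac{dL}{dp_i} &= \frac{k}{p_i} - \frac{n - k}{1 - p_i} = 0 \\
k\,(1 - p_i) &= (n - k)\,p_i \\
k &= n\,p_i \\
p_i &= \frac{k}{n}
\end{aligned}
```

The stationary point is the observed proportion k/n, which is consistent with the MLE of 0.20 read off the graphs.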

To convince yourself that the log likelihood formula can be used to derive the MLE of p, you can modify the likelihood graphing code by replacing this line:

$likelihoods[$i] = pow($p, $k) * pow(1-$p, $n-$k);

with this line:

$log_likelihoods[$i] = $k * log($p) + ($n - $k) * log(1 - $p);

which results in this graph:

Figure 3. The likelihood distribution graph (log likelihood formula)

As you can see, the shape has changed somewhat, but the MLE of p is still 0.20. Note also that the first graphed point does not start at 0 because the log of 0 is negative infinity. The simple solution I adopted was to start plotting with a p value of 0.05 instead of 0.
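The same check can be made without a graph. The sketch below computes the log likelihood over a grid that starts at 0.05 and picks out the largest value. The data ($n = 20, $k = 4) and the function name log_likelihood() are assumptions made for this illustration. Stopping the grid at 0.95 is also an assumption: at p = 1 the term log(1 - $p) is the log of 0, so that endpoint has the same problem as p = 0.

```php
<?php
// Log likelihood of p for k successes in n trials (combinations term dropped).
// Hypothetical helper name, introduced for this sketch.
function log_likelihood(float $p, int $n, int $k): float
{
    return $k * log($p) + ($n - $k) * log(1 - $p);
}

// Assumed example data: k = 4 successes in n = 20 trials (k/n = 0.20).
$n = 20;
$k = 4;

// Evaluate p = 0.05, 0.10, ..., 0.95 so that log(0) is never computed.
$log_likelihoods = [];
for ($i = 1; $i <= 19; $i++) {
    $p = $i / 20.0;
    $log_likelihoods[$i] = log_likelihood($p, $n, $k);
}

// The index of the largest log likelihood identifies the MLE on this grid.
$best_index = array_keys($log_likelihoods, max($log_likelihoods))[0];
$mle = $best_index / 20.0;

printf("MLE of p: %.2f\n", $mle);
```

Every log likelihood here is negative, because it is the log of a value between 0 and 1; only the position of the maximum matters.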

It was necessary to show you these alternate formulas for computing the likelihood of p because statisticians often switch between them, depending on which is mathematically most convenient in a given context. The log likelihood version is especially important because logarithms are used extensively in logistic regression, which is often used to analyze multivariate survey and experimental data that have a binary dependent measure you want to predict and explain. Logistic regression uses maximum likelihood techniques to estimate θ on the basis of several explanatory variables (here, n explanatory variables):

θ = P( Y=1 | x_i1, ..., x_in )

Logistic regression is used for estimation, prediction, and modeling purposes and is an important technique you should learn if you want to design and analyze multivariate binary surveys.


First published by IBM developerWorks