Implement Bayesian Inference Using PHP: Part 2 Computing the MLE

By Paul Meagher - 2004-05-19

Computing the MLE

In the context of surveys with questions having only binary response options, each response can take only one of two values, so you can model the number of 1-coded responses to a question as a binomial random variable. Given this probability distribution model, one of the parameters you want to estimate from your survey data is the probability of success for a given question, denoted p: the probability that a participant will give a 1-coded response. The letter q denotes the probability of "failure" (a 0-coded response) and is equal to 1 - p.

To see how to compute an MLE of p, imagine that you have the following survey data to base your estimate of p on (the probability of responding "yes"):

Table 3. Survey data basis of estimate of p

participant	q1
1	0
2	1
3	0
4	0
5	0

To estimate p for question 1 using maximum likelihood techniques, you need to try out various values of p to see which one maximizes the conditional probability of the observed results R:

P(R | p_i)

The results R can be summarized as the proportion k/n, where k is the number of successes observed among n sample items. From the table above, one in five (or 20 percent) of the participants gave a 1-coded response. Therefore, to compute the MLE of p, you try out various values of p and see which one maximizes the conditional probability of k/n:

MLE = max( P(k/n | p_i) )
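
As a minimal sketch of how k and n might be derived from the raw responses, the following snippet counts them for question 1 of Table 3. The array name $q1 and its layout are assumptions made for illustration; they do not appear in the original listings.

```php
<?php
// Responses to question 1 from Table 3, keyed by participant number.
// The variable name $q1 is illustrative only.
$q1 = array(1 => 0, 2 => 1, 3 => 0, 4 => 0, 5 => 0);

$n = count($q1);      // number of participants
$k = array_sum($q1);  // number of 1-coded responses

printf("k = %d, n = %d, k/n = %.2f\n", $k, $n, $k / $n);
```

Because the responses are coded 0 and 1, summing them is the same as counting the successes.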

Given that the distribution of question scores can be modeled as a binomial random variable, you can use the binomial distribution function to calculate the probability of the observed results. The binomial distribution function returns the likelihood of an event occurring $k times in $n attempts, where the probability of success on a single attempt is $p:

$likelihood = binomial($n, $k, $p);

The equation for computing the binomial probability of a particular result k/n given a particular value of p is:

P(k/n | p_i) = _nC_k * (p)^k * (1 - p) ^{(n - k)}

A binomial probability is a product of three terms:

The number of ways of selecting k items from n items: _nC_k.
The probability of success raised to an exponent equal to the number of success events involved in the outcome: p^k.
The probability of failure raised to an exponent equal to the number of failure events involved in the outcome: (1 - p)^{(n - k)}.
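
As a worked illustration of the three terms, here is the calculation for the Table 3 result (k = 1, n = 5) at one candidate value, p = 0.2. The candidate value is chosen here only as an example:

```latex
P(1/5 \mid p = 0.2) = {}_5C_1 \cdot (0.2)^{1} \cdot (1 - 0.2)^{4}
                    = 5 \cdot 0.2 \cdot 0.4096
                    = 0.4096
```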

The code for computing a binomial probability looks like this:

Listing 1. Computing a binomial probability



<?php

# Probability functions ported from Mastering Algorithms 
# With Perl by Macdonald, Orwant, and Hietaniemi 

# choose($n, $k) is the number of ways to choose $k 
# elements from a set of $n elements, when the order 
# of selection is irrelevant. 

function choose($n, $k) { 
  $j=1; $result=1; 
  if ( ($k > $n) || ($k < 0) ) { 
    return 0; 
  } 
  while ($j<=$k) { 
    $result *= $n--; 
    $result /= $j++; 
  } 
  return $result; 
} 

function binomial($n, $k, $p) { 
  if ($p == 0) { 
    if ($k == 0) 
      return 1;
    else 
      return 0;
  } 
  if ($p == 1) { 
    if ($k == $n)
      return 1;
    else
      return 0;
  } 
  return choose($n, $k) * pow($p, $k) * pow(1-$p, $n-$k);
}

?>
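
One way to sanity-check these functions is to evaluate the Table 3 result at a single value of $p and to confirm that the probabilities of all possible outcomes add up to 1. The sketch below restates choose() and binomial() in compact form only so that it runs on its own; the edge-case branches for $p equal to 0 and 1 are left out for brevity, and the value 0.2 is an arbitrary test value.

```php
<?php
// Compact restatement of choose() and binomial() from Listing 1,
// repeated here only so this sketch is self-contained.
function choose($n, $k) {
  if ($k > $n || $k < 0) {
    return 0;
  }
  $result = 1;
  for ($j = 1; $j <= $k; $j++) {
    $result *= $n--;
    $result /= $j;
  }
  return $result;
}

function binomial($n, $k, $p) {
  return choose($n, $k) * pow($p, $k) * pow(1 - $p, $n - $k);
}

// Likelihood of the Table 3 result (1 success in 5) at p = 0.2
$likelihood = binomial(5, 1, 0.2);

// The probabilities of every possible k (0 through 5) should sum to 1
$total = 0;
for ($k = 0; $k <= 5; $k++) {
  $total += binomial(5, $k, 0.2);
}

printf("P(1/5 | 0.2) = %.4f\n", $likelihood);  // prints 0.4096
printf("Sum over k   = %.4f\n", $total);       // prints 1.0000
```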

So to determine the value of $p that maximizes the likelihood of the observed results, you can simply construct a loop in which you keep the values of $n and $k fixed on each iteration but vary the probability of success $p (as in the following listing):

Listing 2. Constructing a loop to determine the value of $p


<?php

include "binomial.php";

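$n = 5;
$k = 1;

$likelihoods = array();
$parameters  = array();

// Step through an integer index and derive $p from it, so that
// accumulated floating-point error cannot distort the grid of $p
// values or drop the final value of 1.00
for ($i = 0; $i <= 20; $i++) {
  $p = $i * 0.05;
  $likelihoods[$i] = binomial($n, $k, $p);
  $parameters[$i]  = $p;
}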

// Maximum likelihood value
$mle = max($likelihoods);

// The p value that results in the maximum likelihood value
$p   = $parameters[array_search($mle, $likelihoods)];

?>
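
For a binomial likelihood, the maximizing value of p is known to have the closed form k/n, which gives a convenient check on the grid search. The sketch below compares the two. The function grid_mle() is a hypothetical helper introduced for illustration, not part of the original listings, and it evaluates only the part of the likelihood that depends on p, since the term _nC_k is the same for every candidate value and does not change where the maximum falls.

```php
<?php
// grid_mle() is a hypothetical helper, not part of the original listings.
// It returns the grid value of p with the largest likelihood kernel
// p^k * (1 - p)^(n - k). The constant nCk is omitted because it does
// not affect which p gives the maximum.
function grid_mle($n, $k, $steps) {
  $best_p = 0.0;
  $best_l = -1.0;
  for ($i = 0; $i <= $steps; $i++) {
    $p = $i / $steps;
    $l = pow($p, $k) * pow(1 - $p, $n - $k);
    if ($l > $best_l) {
      $best_l = $l;
      $best_p = $p;
    }
  }
  return $best_p;
}

$n = 5;
$k = 1;

$grid   = grid_mle($n, $k, 20);  // same 0.05 spacing as Listing 2
$closed = $k / $n;               // closed-form estimate k/n

printf("grid: %.2f, closed form: %.2f\n", $grid, $closed);
```

The two values agree here because k/n = 0.20 happens to lie on the 0.05 grid. When k/n falls between grid points, the grid search can only return the best of the candidate values it tried, so a finer grid would be needed to get closer.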

It will be instructive to see a graph of these results.

First published by IBM developerWorks