Apply probability models to Web data using PHP
By Paul Meagher - 2004-04-14

Probability distribution superclass

If you examine the JSci probability distribution superclass (ProbabilityDistribution.java), you will notice obvious similarities to the PHP version of this class shown in Listing 3. One major difference, however, is that the PHP version is designed to coexist with other PEAR classes.

PEAR is short for PHP Extension and Application Repository and is the official structured library of open source code for PHP users. The PEAR Group also advocates a standard style for code written in PHP. The recommended PEAR coding style and good Java programming style have many similarities. These similarities mean that it is relatively easy to turn good Java code into PEAR-conformant code.

Three main issues arise and can cause some difficulties in porting code from Java to PHP4:

  • Lack of native support for namespaces in PHP means that my class names are longer than you might see in Java (such as PHPMath_ProbabilityDistribution_General).
  • Lack of native support for polymorphous constructors in PHP means that you cannot declare several constructors that differ in the number or type of their arguments. Instead, you try to achieve the same effect by setting argument defaults and doing different things depending on whether an argument is supplied and what type it is. Often I cannot achieve the same effect in PHP (except through ugly workarounds), so I might just implement the most commonly used constructor.

    Also, in PHP you do not need to statically define the type of your function arguments. When calling any PHP function for any given argument slot, you can pass in a single value or an array as your argument. Within such PHP functions, you can add logic that detects the argument type and uses this information to carry out different operations. In other words, you can use PHP's type-indifferent argument-passing protocol and type-detection code to achieve constructor polymorphism.

  • Lack of native support for advanced exception handling in PHP (which will change with PHP5) means that you have to rely upon the forward-compatible PEAR.php error handling class to flag and deal with errors.
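To make the type-detection workaround concrete, here is a minimal sketch. The class name Example_Interval and its properties are hypothetical and not part of the PHPMath package. The sketch uses the PHP5-and-later `__construct` name so that it runs on current PHP; under PHP4 the constructor would be named after the class instead.

```php
<?php
// Hypothetical class for illustration only; not part of PHPMath.
class Example_Interval
{
    public $lo;
    public $hi;

    // One constructor, three call styles:
    //   new Example_Interval()            -> default bounds 0.0 and 1.0
    //   new Example_Interval(2, 5)        -> two scalar bounds
    //   new Example_Interval(array(2, 5)) -> both bounds packed in an array
    public function __construct($lo = 0.0, $hi = 1.0)
    {
        if (is_array($lo)) {
            // Array form: unpack both bounds; any second argument is ignored.
            $hi = $lo[1];
            $lo = $lo[0];
        }
        $this->lo = (float) $lo;
        $this->hi = (float) $hi;
    }
}

$a = new Example_Interval();
$b = new Example_Interval(2, 5);
$c = new Example_Interval(array(2, 5));
```

The default values cover the "no arguments" case, and the is_array() test covers the "different argument type" case, so one constructor body stands in for what would be three overloaded constructors in Java.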

While support for these OO features in PHP would be desirable, they are arguably not necessary; workarounds can achieve similar effects. The tradeoff is that PHP is easier to learn and to use productively for solving Web scripting problems.

The probability distribution superclass (Listing 3) defines the methods that all probability distribution classes need to implement. It also defines methods and constants that child classes can use.

Listing 3. Probability distribution superclass, PHPMath_ProbabilityDistribution_General

<?php

/**
* @package PHPMath_ProbabilityDistribution
*/

define("PHPMATH_MAX_FLOAT", 3.40282346638528860e+305);

include_once 'PEAR.php';

/**
* The PHPMath_ProbabilityDistribution_General superclass 
* provides an object for encapsulating probability distributions.
* @version 1.0
* @author Jaco van Kooten
* @author Paul Meagher
* @author Jesus Castagnetto
*/

class PHPMath_ProbabilityDistribution_General {

  /**
  * Constructs a probability distribution.
  */
  function PHPMath_ProbabilityDistribution_General() {}

  /**
  * Probability density function.
  * @return the probability that a stochastic variable x 
  * has the value X, i.e. P(x=X).
  */
  function PDF($X) {}

  /**
  * Cumulative distribution function.
  * @return the probability that a stochastic 
  * variable x is less than X, i.e. P(x<X).
  */
  function CDF($X) {}

  /**
  * Inverse of the cumulative distribution function.
  * @return the value X for which P(x<X) equals $probability.
  */        
  function InverseCDF($probability) {}

  /**
  * Random number generator.
  * @return $num_vals random values drawn from the distribution.
  */        
  function RNG($num_vals) {}

  /**
  * Check if the range of the argument of the distribution 
  * method is between <code>lo</code> and <code>hi</code>.
  * @return PEAR_Error if the argument is out of range.
  */
  function checkRange($x, $lo=0.0, $hi=1.0) {
    if (($x < $lo) || ($x > $hi)) {
      return PEAR::raiseError("The argument of the distribution method "
           . "should be between $lo and $hi.");
    }
  }

  /**
  * Get the factorial of the argument
  * @return factorial of n.
  */ 
  function getFactorial($n) {
    return $n <= 1 ? 1 : $n * $this->getFactorial($n-1);
  }
    
  /**
  * This method approximates the value of X for which P(x<X)=<I>prob</I>.
  * It applies a combination of a Newton-Raphson procedure and bisection 
  * method with the value <I>guess</I> as a starting point. Furthermore, 
  * to ensure convergence and stability, one should supply an interval 
  * [<I>xLo</I>,<I>xHi</I>] in which the probability distribution reaches 
  * the value <I>prob</I>. The method does no checking, it will produce
  * bad results if wrong values for the parameters are supplied - use it 
  * with care.
  */    
  function findRoot($prob, $guess, $xLo, $xHi) {                    
    $accuracy     = 1.0e-10;
    $maxIteration = 150;
    $x     = $guess;
    $xNew  = $guess;
    $error = 0.0;
    $pdf   = 0.0; 
    $dx    = 1000.0;
    $i     = 0;    
    while ( (abs($dx) > $accuracy) && ($i++ < $maxIteration) ) {
      // Apply Newton-Raphson step
      $error = $this->CDF($x) - $prob;
      if($error < 0.0) {
        $xLo = $x;
      } else {
        $xHi = $x;
      }
      $pdf = $this->PDF($x);
      // Avoid division by zero      
      if ($pdf != 0.0) { 
        $dx   = $error / $pdf;
        $xNew = $x - $dx;
      }

      // If the NR fails to converge (which for example may be the
      // case if the initial guess is too rough) we apply a bisection
      // step to determine a more narrow interval around the root.
      if ( ($xNew < $xLo) || ($xNew > $xHi) || ($pdf==0.0) ) {
        $xNew = ($xLo + $xHi) / 2.0;
        $dx   = $xNew - $x;
      }
      $x = $xNew;
    }
    return $x;
  }  
    
}
?>
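
To see how findRoot() behaves, the following sketch repeats its loop as a standalone function and applies it to a case with a known answer. The function names example_cdf, example_pdf, and example_find_root are hypothetical, and the exponential distribution with rate 2 is chosen only because its median has the closed form ln(2)/2. The functions are passed by name so that the sketch runs without PEAR or a child class.

```php
<?php
// Hypothetical stand-ins for a child class's CDF() and PDF():
// an exponential distribution with rate 2.
function example_cdf($x) { return $x < 0 ? 0.0 : 1.0 - exp(-2.0 * $x); }
function example_pdf($x) { return $x < 0 ? 0.0 : 2.0 * exp(-2.0 * $x); }

// Same loop as findRoot() in Listing 3, with the CDF and PDF passed in by name.
function example_find_root($cdf, $pdf, $prob, $guess, $xLo, $xHi)
{
    $accuracy     = 1.0e-10;
    $maxIteration = 150;
    $x    = $guess;
    $xNew = $guess;
    $dx   = 1000.0;
    $i    = 0;
    while ((abs($dx) > $accuracy) && ($i++ < $maxIteration)) {
        // Newton-Raphson step; also shrink the bracket around the root.
        $error = $cdf($x) - $prob;
        if ($error < 0.0) {
            $xLo = $x;
        } else {
            $xHi = $x;
        }
        $density = $pdf($x);
        if ($density != 0.0) {
            $dx   = $error / $density;
            $xNew = $x - $dx;
        }
        // Fall back to bisection if the step left the bracket.
        if (($xNew < $xLo) || ($xNew > $xHi) || ($density == 0.0)) {
            $xNew = ($xLo + $xHi) / 2.0;
            $dx   = $xNew - $x;
        }
        $x = $xNew;
    }
    return $x;
}

// Find X such that P(x<X) = 0.5, starting from a deliberately rough guess.
$median = example_find_root('example_cdf', 'example_pdf', 0.5, 1.0, 0.0, 10.0);
$exact  = log(2.0) / 2.0;
printf("findRoot: %.8f  closed form: %.8f\n", $median, $exact);
```

With the starting guess of 1.0, the first Newton-Raphson step lands below xLo, so the bisection fallback takes over once before the Newton-Raphson steps converge.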

Note that I modified the JSci API to use common textbook abbreviations for the core distribution functions: PDF(), CDF(), InverseCDF(), and RNG(). I also enforce the idea that all classes should implement a random number generating (RNG) method. RNG methods are generally not as easy to implement as the other methods, which may be one reason they were not included in the initial implementation of the probability distribution superclass.
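
As a rough illustration of the four-method interface, here is a sketch of a continuous uniform distribution on [lo, hi]. The class Example_Uniform is hypothetical and is written as a standalone class so that it runs without PEAR; in the package itself such a class would extend PHPMath_ProbabilityDistribution_General. Returning an array from RNG() is an assumption based on the $num_vals parameter name.

```php
<?php
// Hypothetical standalone class for illustration; not part of PHPMath.
class Example_Uniform
{
    public $lo;
    public $hi;

    public function __construct($lo = 0.0, $hi = 1.0)
    {
        $this->lo = (float) $lo;
        $this->hi = (float) $hi;
    }

    // Density: constant inside [lo, hi], zero outside.
    public function PDF($X)
    {
        return ($X < $this->lo || $X > $this->hi)
            ? 0.0
            : 1.0 / ($this->hi - $this->lo);
    }

    // Cumulative probability that x is less than X.
    public function CDF($X)
    {
        if ($X <= $this->lo) {
            return 0.0;
        }
        if ($X >= $this->hi) {
            return 1.0;
        }
        return ($X - $this->lo) / ($this->hi - $this->lo);
    }

    // Value X whose cumulative probability equals $probability.
    public function InverseCDF($probability)
    {
        return $this->lo + $probability * ($this->hi - $this->lo);
    }

    // Array of $num_vals random values between lo and hi (assumed return shape).
    public function RNG($num_vals = 1)
    {
        $vals = array();
        for ($i = 0; $i < $num_vals; $i++) {
            $u = mt_rand() / mt_getrandmax(); // uniform value between 0 and 1
            $vals[] = $this->lo + $u * ($this->hi - $this->lo);
        }
        return $vals;
    }
}

$dist   = new Example_Uniform(2, 6);
$sample = $dist->RNG(5);
```

The uniform case is simple enough that RNG() is a one-line rescaling of mt_rand(); for most other distributions the RNG method takes considerably more work.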

I have also provisionally added a PHPMATH_MAX_FLOAT constant and added a getFactorial utility method. A more mature PHP Math library might include these generally useful constants and methods in separate files so that they could be included in a wider range of math classes.




First published by IBM developerWorks


Copyright 2004-2017 GrindingGears.com. All rights reserved.
Article copyright and all rights retained by the author.