Simple linear regression with PHP: Part 1

By Paul Meagher - 2004-05-12

Guiding principles

I used six general principles to guide the development of the SimpleLinearRegression class.

Establish one class per analytical model.
Employ backward chaining to develop the class.
Expect an abundance of getters.
Store intermediate results.
Develop a preference for a verbose API.
Perfection is not the goal.

Let's look at each of these guidelines in more detail.

Establish one class per analytical model

Each major type of analytical test or procedure should have a PHP class by the same name that contains input functions, functions that compute intermediate and summary values, and output functions (functions that dump the intermediate and summary values to the screen in textual or graphic form).
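As a rough illustration of this layout, the sketch below wraps a deliberately trivial procedure (a sample mean) in a single class with one input function, two computing functions, and one output function. The class name SampleMeanSketch and all of its method names are hypothetical and are not taken from the SimpleLinearRegression class; the code also assumes PHP 7.4 or later for typed properties.

```php
<?php
// Hypothetical sketch: one class for one analytical procedure (a sample mean).
// Names are illustrative only; they are not from the SimpleLinearRegression class.
class SampleMeanSketch
{
    private array $values = [];

    // Input function: accept the raw observations.
    public function setValues(array $values): void
    {
        if (count($values) === 0) {
            throw new InvalidArgumentException('At least one value is required.');
        }
        $this->values = array_values($values);
    }

    // Computing function (intermediate value): number of observations.
    public function getN(): int
    {
        return count($this->values);
    }

    // Computing function (summary value): arithmetic mean.
    public function getMean(): float
    {
        if ($this->getN() === 0) {
            throw new LogicException('Call setValues() before getMean().');
        }
        return array_sum($this->values) / $this->getN();
    }

    // Output function: format the intermediate and summary values as text.
    public function formatSummary(): string
    {
        return sprintf("n = %d, mean = %.2f\n", $this->getN(), $this->getMean());
    }
}

$model = new SampleMeanSketch();
$model->setValues([3, 5, 10]);
echo $model->formatSummary(); // prints "n = 3, mean = 6.00"
```

A graphical output function would sit alongside formatSummary() in the same class, so that everything belonging to the procedure stays in one place.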

Employ backward chaining to develop the class

In mathematical programming, the goal of coding is often to produce the standard output values that an analytical procedure (such as MultipleRegression, TimeSeries, or ChiSquared) is expected to generate. From a problem-solving point of view, this means that you can employ backward chaining to develop the methods of a mathematical class: start from the required outputs and work backward to the values needed to compute them.

For example, the summary output screen displays one or more summary statistics. These summary statistics depend upon computing intermediate statistics, which may in turn involve further intermediate statistics, and so on. This backward-chaining-based development methodology leads to the next principle.
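The sketch below shows one way this backward chain might look in code, using the least-squares slope as the target output. The function for the slope is written first; each function it calls is then written in turn until only raw data is needed. The function names (all prefixed with sketch) are hypothetical, and the formula used is the standard slope estimate, the sum of cross-products of deviations divided by the sum of squared deviations of X.

```php
<?php
// Goal, written first: slope = sum of cross-products / sum of squares of X.
function sketchSlope(array $x, array $y): float
{
    return sketchSumOfProducts($x, $y) / sketchSumOfSquares($x);
}

// Needed by the slope: sum of (x - meanX) * (y - meanY).
function sketchSumOfProducts(array $x, array $y): float
{
    $meanX = sketchMean($x);
    $meanY = sketchMean($y);
    $sum = 0.0;
    foreach ($x as $i => $xi) {
        $sum += ($xi - $meanX) * ($y[$i] - $meanY);
    }
    return $sum;
}

// Needed by the slope: sum of (x - meanX)^2.
function sketchSumOfSquares(array $x): float
{
    $meanX = sketchMean($x);
    $sum = 0.0;
    foreach ($x as $xi) {
        $sum += ($xi - $meanX) ** 2;
    }
    return $sum;
}

// Needed by both sums: the arithmetic mean. The chain ends here, at raw data.
function sketchMean(array $values): float
{
    return array_sum($values) / count($values);
}

// For points that lie exactly on y = 2x, the slope is 2.
$slope = sketchSlope([1, 2, 3, 4], [2, 4, 6, 8]);
```

The sketch assumes two arrays of equal length with at least two distinct X values; it does no input checking.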

Expect an abundance of getters

The majority of the class-development work for a mathematical class involves computing intermediate and summary values. Practically, this means that you should not be surprised if your class contains many getter methods that compute intermediate and summary values.

Store intermediate results

Store the results of intermediate calculations inside a result object so you can use them as input to subsequent calculations. This principle is enforced in the design of the S language. In the present context, it is enforced by selecting instance variables to represent computed intermediate and summary results.
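A minimal sketch of the idea, assuming PHP 7.4 or later: each computed value is kept in an instance variable, and later steps read the stored value instead of recomputing it. The class VarianceSketch and its property names are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical result object: every intermediate and summary value
// is stored in an instance variable so later steps can reuse it.
class VarianceSketch
{
    public int $n;
    public float $mean;
    public float $sumSquaredDeviations;
    public float $variance;

    public function __construct(array $values)
    {
        if (count($values) < 2) {
            throw new InvalidArgumentException('At least two values are required.');
        }

        $this->n = count($values);
        $this->mean = array_sum($values) / $this->n;

        // Reuses the stored mean rather than recomputing it.
        $this->sumSquaredDeviations = 0.0;
        foreach ($values as $value) {
            $this->sumSquaredDeviations += ($value - $this->mean) ** 2;
        }

        // Reuses the stored sum of squared deviations (sample variance, n - 1).
        $this->variance = $this->sumSquaredDeviations / ($this->n - 1);
    }
}

$result = new VarianceSketch([2, 4, 4, 4, 5, 5, 7, 9]);
// $result->mean is 5, $result->sumSquaredDeviations is 32,
// and $result->variance is 32 / 7.
```

Because the intermediate values remain on the object, a further calculation (a standard deviation, for example) can start from $result->variance without touching the raw data again.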

Develop a preference for a verbose API

When developing the naming scheme for member functions and instance variables in my SimpleLinearRegression class, I found it easier to keep track of what my functions did and what my variables stood for when I used longer names to describe them (names like getSumSquaredError instead of getYY2).

I did not totally give up on abbreviating the names; however, when I did abbreviate, I tried to provide comments to fully elaborate the meaning of the name. In my opinion, highly abbreviated naming schemes are too common in mathematical programming; they make it more difficult than necessary to understand and verify that a mathematical routine works as it should.
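To make the contrast concrete, here is a small sketch of a verbosely named function next to an abbreviated variable that carries an explanatory comment. The function name echoes the getSumSquaredError example above, but the body is only an assumption about what such a getter might compute (the sum of squared differences between observed and predicted values); it is not the implementation from the SimpleLinearRegression class.

```php
<?php
// Verbose names: the purpose is readable without consulting a formula sheet.
function getSumSquaredErrorSketch(array $observedY, array $predictedY): float
{
    $sumSquaredError = 0.0;
    foreach ($observedY as $index => $observed) {
        $residual = $observed - $predictedY[$index];
        $sumSquaredError += $residual ** 2;
    }
    return $sumSquaredError;
}

// Abbreviated name, so a comment spells out the meaning:
// SSE = sum of squared errors, the sum of (observed - predicted)^2.
$SSE = getSumSquaredErrorSketch([2.0, 4.0, 7.0], [2.5, 4.0, 6.0]);
// Residuals are -0.5, 0 and 1, so $SSE is 0.25 + 0 + 1 = 1.25.
```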

Perfection is not the goal

The goal of this coding exercise is not necessarily to develop a highly optimized and rigorous math engine for PHP. In these early stages, the learning and challenge aspects of implementing significant analytical tests should be emphasized.


First published by IBM developerWorks