Regression Rules

Publication
Quinlan/92a: Learning with Continuous Classes
Quinlan/93a: C4.5: Programs for Machine Learning
Quinlan/93b: Combining Instance-based and Model-based Learning
Torgo/95a: Data Fitting with Rule-based Regression
Torgo/Gama/96a: Regression by Classification
Weiss/Indurkhya/93a: Rule-based Regression
Weiss/Indurkhya/95a: Rule-based Machine Learning Methods for Functional Prediction

Name
Regression Rules

Description
Weiss and Indurkhya (1993) developed a system (SWAP1R) that learns regression rules in a propositional language. The conditional part of each rule is a conjunction of tests on the input attributes, while the conclusion part contains the associated prediction of the target variable value. Originally these predictions were the average Y value of the cases satisfying the conditional part, but later the possibility of using k-nearest neighbours was added (Weiss & Indurkhya, 1995). A distinctive feature of this system is that it handles regression by transforming it into a classification problem: the target values of the training cases are pre-processed by grouping them into a set of user-defined bins, and these bins act as class labels in the subsequent learning phase. The learning algorithm therefore induces a classification model of the data, where the classes are numbers representing the bins, and these numbers can then be used to make numeric predictions for unseen cases. This idea of transforming a regression problem into a classification task was taken further in the work of Torgo and Gama (1996), whose system RECLA acts as a generic pre-processing tool that allows virtually any classification system to be used on a regression task.
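As an illustration of this regression-by-classification strategy, the following minimal sketch (in Python, with a scikit-learn decision tree standing in for an arbitrary classification system) discretises the target into equal-frequency bins, trains a classifier on the bin labels, and maps each predicted label back to the mean target value of its bin. The bin count, the equal-frequency binning and all names are our own illustrative assumptions, not details of SWAP1R or RECLA.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def fit_regression_by_classification(X, y, n_bins=5):
        # Equal-frequency bin edges computed from the training targets.
        edges = np.quantile(y, np.linspace(0, 1, n_bins + 1)[1:-1])
        labels = np.digitize(y, edges)          # bin index for each case
        # Representative value per bin: the average Y of the cases it holds.
        bin_values = np.array([y[labels == k].mean() for k in range(n_bins)])
        clf = DecisionTreeClassifier().fit(X, labels)
        return clf, bin_values

    def predict_numeric(clf, bin_values, X):
        # The classifier predicts a bin label; the bin's mean target value
        # serves as the numeric prediction for the unseen case.
        return bin_values[clf.predict(X)]

Replacing DecisionTreeClassifier with any other classifier reproduces the generic wrapping role that RECLA plays.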
Torgo (1995) developed a propositional regression rule learner (R2) that also uses an IF-THEN rule format. This algorithm can use several different functional models in the conclusion of the rules. R2 uses a covering algorithm that builds new models while there are uncovered cases to fit. Each model is chosen from a lattice that includes all possible regression models for the conclusion part of the rules. The result of this first step is a rule with an empty conditional part and the built model as conclusion. This rule is then specialised by adding conditions that restrict the model's domain of applicability, with the goal of improving its fit. The specialisation is guided by an evaluation function that weighs both the fit and the degree of coverage of the rule, so the system searches for rules that fit well while covering as many cases as possible.
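The covering strategy can be pictured with the following rough sketch. It simplifies R2 considerably: rule conclusions are plain averages rather than models drawn from a lattice, candidate conditions are single-attribute thresholds, and the evaluation function is an ad hoc weighted trade-off between squared error and coverage. All names and the weight alpha are our own assumptions, not Torgo's.

    import numpy as np

    def rule_score(y_covered, n_total, alpha=0.9):
        # Higher is better: rewards low error on the covered cases and,
        # with weight (1 - alpha), a large degree of coverage.
        if len(y_covered) == 0:
            return -np.inf
        err = np.mean((y_covered - y_covered.mean()) ** 2)
        return -alpha * err + (1 - alpha) * len(y_covered) / n_total

    def specialise(X, y, max_conds=3):
        # Start from a rule with an empty conditional part, then greedily
        # add threshold tests while they improve the evaluation function.
        mask = np.ones(len(y), dtype=bool)
        conds, best = [], rule_score(y, len(y))
        for _ in range(max_conds):
            chosen = None
            for j in range(X.shape[1]):
                for t in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
                    for op in (np.less_equal, np.greater):
                        m = mask & op(X[:, j], t)
                        s = rule_score(y[m], len(y))
                        if s > best:
                            best, chosen = s, (j, op, t, m)
            if chosen is None:
                break
            j, op, t, mask = chosen
            conds.append((j, op, t))
        return conds, y[mask].mean(), mask

    def cover(X, y, min_cases=10):
        # Build rules while there are enough uncovered cases left to fit.
        rules, remaining = [], np.ones(len(y), dtype=bool)
        while remaining.sum() >= min_cases:
            conds, value, covered = specialise(X[remaining], y[remaining])
            rules.append((conds, value))
            idx = np.flatnonzero(remaining)
            remaining[idx[covered]] = False
        return rules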
CUBIST is a recent commercial system developed by Ross Quinlan. It learns a set of unordered IF-THEN rules with linear models in the conclusion part. There is as yet no published work on CUBIST, but from our experience with the system it appears to be a rule-based version of M5 (Quinlan, 1992, 1993b), playing a role analogous to that of C4.5rules (Quinlan, 1993a) with respect to C4.5. The system handles both continuous and nominal variables and obtains a piecewise linear model of the data. CUBIST can also combine the predictions of this model with k-nearest neighbour predictions (as M5 does).
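The combination of rule-model and instance-based predictions can be sketched as follows. The blend shown, correcting each neighbour's stored target by the model's predicted difference between the query and that neighbour and then averaging, follows the composite scheme Quinlan describes for combining instance-based and model-based learning (1993b); whether CUBIST does exactly this is our conjecture, and all names here are illustrative.

    import numpy as np

    def combined_predict(x, X_train, y_train, model, k=5):
        # `model` is any callable returning a numeric prediction for a case,
        # e.g. the linear model in the conclusion of the rule covering x.
        dists = np.linalg.norm(X_train - x, axis=1)
        nn = np.argsort(dists)[:k]
        # Adjust each neighbour's target by the model's estimate of how the
        # query differs from that neighbour, then average the k estimates.
        adjusted = [y_train[i] + model(x) - model(X_train[i]) for i in nn]
        return float(np.mean(adjusted))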

Method Type
Method