Top-Down Induction of Regression Trees

Name Top-Down Induction of Regression Trees
Description

Top-down induction of regression trees addresses the task of function approximation. Nevertheless, the method is quite similar to the top-down induction of decision trees.

Given:

  • a set E of instances of the example language in attribute-value representation
  • for each instance x ∈ E its function value f(x)

The method:

A regression tree is constructed top-down.

  • In each step a test for the current node is chosen (starting with the root node). The search for good tests is guided by a quality measure, which may be defined in different ways but generally rewards tests that induce partitions grouping together instances with similar function values.
  • As in decision trees, a test in a regression tree is a mapping from the set of all instances of the underlying example language to a finite set of possible results. There is one successor of the current node for each possible result of the test.
  • The given set of examples is split with respect to the chosen test.
  • For each successor that does not meet a given acceptance criterion, the procedure is called recursively.
  • Each leaf of a regression tree is labeled with a real value, chosen so that the quality measure for the leaf is maximized; for variance-based measures this is the mean of the function values of the examples reaching the leaf. Alternatively, a linear function may be attached to leaf nodes if the attributes describing the examples are numerical as well.
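The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: it assumes a variance-based quality measure, threshold tests on numerical attributes, and a simple size/purity acceptance criterion, all of which the entry leaves open. The function and parameter names (`build_tree`, `min_size`, etc.) are hypothetical.

```python
import statistics

def variance(ys):
    # Assumed quality measure: lower variance within a partition means
    # the partition subsumes instances with more similar function values.
    return statistics.pvariance(ys) if len(ys) > 1 else 0.0

def best_split(examples):
    # examples: list of (attribute_dict, f_value) pairs.
    # Search all attribute/threshold tests for the split that minimizes
    # the weighted variance of the induced partitions.
    best = None
    for attr in examples[0][0]:
        values = sorted({x[attr] for x, _ in examples})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [(x, y) for x, y in examples if x[attr] <= thr]
            right = [(x, y) for x, y in examples if x[attr] > thr]
            score = (len(left) * variance([y for _, y in left]) +
                     len(right) * variance([y for _, y in right])) / len(examples)
            if best is None or score < best[0]:
                best = (score, attr, thr, left, right)
    return best

def build_tree(examples, min_size=3):
    ys = [y for _, y in examples]
    # Assumed acceptance criterion: stop on small or pure nodes.
    if len(examples) < min_size or variance(ys) == 0.0:
        return statistics.mean(ys)   # leaf value maximizing the measure
    split = best_split(examples)
    if split is None:                # no attribute admits a split
        return statistics.mean(ys)
    _, attr, thr, left, right = split
    # Recurse for each successor of the current node.
    return (attr, thr, build_tree(left, min_size), build_tree(right, min_size))

def predict(tree, x):
    # Follow the tests from the root to a leaf.
    while isinstance(tree, tuple):
        attr, thr, left, right = tree
        tree = left if x[attr] <= thr else right
    return tree
```

For example, training on a step function such as f(x) = 0 for a < 5 and f(x) = 10 otherwise yields a tree with a single split near a = 4.5 and two constant leaves.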

When regression trees are applied, attribute values are often numerical, and tests then typically compare the value of an attribute to an appropriate constant.
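For numerical attributes, such a test can be made concrete as follows. The snippet assumes the common convention of taking candidate constants as midpoints between consecutive observed values; the entry itself only speaks of "appropriate constants", so this choice and the helper names are illustrative.

```python
def candidate_constants(examples, attr):
    # Candidate test constants for a numerical attribute: midpoints
    # between consecutive distinct observed values (assumed convention).
    values = sorted({x[attr] for x in examples})
    return [(a + b) / 2 for a, b in zip(values, values[1:])]

def apply_test(x, attr, c):
    # The test "attr <= c" maps every instance to one of two possible
    # results, one per successor node.
    return "left" if x[attr] <= c else "right"

examples = [{"size": v} for v in (1.0, 3.0, 3.0, 7.0)]
print(candidate_constants(examples, "size"))   # midpoints between 1, 3, 7
print(apply_test({"size": 2.5}, "size", 2.0))
```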

The aim of regression tree learning is usually to generalize well from the training data, i.e. to predict the function values of yet unseen instances accurately. Another possible aim is to analyze which variables are well suited to predicting the target value.
Specialization CART
Dm Step Function Approximation
Method Type Method