Top-down induction of decision trees is a method for the task of
concept learning (supervised learning).
Given:
A set of classified examples in attribute-value representation,
where the set of classes is finite.
The Method:
A decision tree is constructed top-down.
At each step a test is chosen for the current node (starting with the root node)
that best separates the given examples by class.
The quality of a test is measured by the impurity / variance of the example
sets it produces. The most common measure is information gain.
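As a minimal sketch of the information gain measure (the function names and the tiny label list are illustrative, not from the source): gain is the entropy of the whole example set minus the weighted entropy of the parts produced by a test.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels (an impurity measure)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(labels, partitions):
    """Entropy of the whole set minus the weighted entropy of the split parts."""
    total = len(labels)
    remainder = sum(len(part) / total * entropy(part) for part in partitions)
    return entropy(labels) - remainder

# A test that splits a balanced binary set into two pure parts
# gains one full bit of information:
labels = ["yes", "yes", "no", "no"]
gain = information_gain(labels, [["yes", "yes"], ["no", "no"]])
```

A pure split of a balanced two-class set yields a gain of 1.0 bit; a split that leaves both parts as mixed as the original yields a gain of 0.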
A test is a mapping from the set of all instances of the underlying
example language to a finite set of possible results.
In the decision tree, the current node has one successor for each possible
result of the test.
The given set of examples is split with respect to the chosen test.
For each successor that does not meet a given acceptance criterion,
e.g. that all examples are of the same class, the procedure is applied
recursively.
Often the set of possible tests is limited to splitting the examples
according to the value of a certain attribute, so the method is restricted
to choosing a good attribute for each inner node.
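The attribute-restricted method can be sketched roughly as follows (an ID3-style sketch; the helper names, the example representation as dictionaries, and the acceptance criterion shown are assumptions, not from the source):

```python
from collections import Counter
from math import log2

def entropy(rows, target):
    """Impurity of a set of examples with respect to the target attribute."""
    total = len(rows)
    return -sum((n / total) * log2(n / total)
                for n in Counter(r[target] for r in rows).values())

def best_attribute(rows, attributes, target):
    """Choose the attribute whose split maximizes information gain."""
    def gain(attr):
        parts = {}
        for r in rows:
            parts.setdefault(r[attr], []).append(r)
        remainder = sum(len(p) / len(rows) * entropy(p, target)
                        for p in parts.values())
        return entropy(rows, target) - remainder
    return max(attributes, key=gain)

def build_tree(rows, attributes, target):
    """Top-down induction: test, split, and recurse on impure successors."""
    classes = {r[target] for r in rows}
    if len(classes) == 1:          # acceptance criterion: node is pure
        return classes.pop()
    if not attributes:             # no tests left: fall back to majority class
        return Counter(r[target] for r in rows).most_common(1)[0][0]
    attr = best_attribute(rows, attributes, target)
    branches = {}
    for r in rows:
        branches.setdefault(r[attr], []).append(r)
    rest = [a for a in attributes if a != attr]
    return (attr, {value: build_tree(part, rest, target)
                   for value, part in branches.items()})

def classify(tree, row):
    """Follow the test results down to a leaf."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree

rows = [
    {"outlook": "sunny",    "windy": "no",  "play": "no"},
    {"outlook": "sunny",    "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no",  "play": "yes"},
    {"outlook": "rain",     "windy": "no",  "play": "yes"},
    {"outlook": "rain",     "windy": "yes", "play": "no"},
]
tree = build_tree(rows, ["outlook", "windy"], "play")
```

Here each inner node is a tuple of the chosen attribute and a branch per attribute value; a leaf is simply a class label.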
If numerical attribute values are incorporated, tests may compare the value
of certain attributes to appropriate constants.
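For numerical attributes, such a threshold test can be sketched as follows (the helper names and the example data are illustrative assumptions; midpoints between consecutive distinct values are a common choice of candidate constants):

```python
def numeric_split(rows, attr, threshold):
    """A numeric test maps each example to one of two results:
    value <= threshold, or value > threshold."""
    low = [r for r in rows if r[attr] <= threshold]
    high = [r for r in rows if r[attr] > threshold]
    return low, high

def candidate_thresholds(rows, attr):
    """Midpoints between consecutive distinct attribute values."""
    values = sorted({r[attr] for r in rows})
    return [(a + b) / 2 for a, b in zip(values, values[1:])]

rows = [
    {"temp": 18, "play": "no"},
    {"temp": 24, "play": "yes"},
    {"temp": 30, "play": "yes"},
]
thresholds = candidate_thresholds(rows, "temp")  # [21.0, 27.0]
```

Each candidate threshold defines one binary test, which can then be scored with the same impurity measure as any other test.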
The aim of top-down induction of decision trees is to find a decision
tree that is as small as possible and fits the data, because small trees
are assumed to generalize better from the training data.