Description |
The algorithm FOIL addresses the task of concept learning
by using the method of top-down induction of horn clauses.
More specific the task of FOIL is to find a description of a target
predicate, given a set of examples and some background
knowledge, both in the form of function-free ground facts.
Examples differ from background knowledge by containing the target
predicate.
FOIL makes the closed world assumption, so it assumes that any
function-free ground fact not declared true is false.
The hypothesis language of FOIL is a set of sets of
rules of the following form:
T(x1, ..., xn) ← L1, ...,
Lk,
where T is the target predicate, all xi are variables and
all Li are positive or negative constant-free and
function-free literals.
The general method of FOIL is to search the hypothesis space by
adding one rule by another, and by constructing each rule
literal by literal, guided by a heuristic measure, similar
to information gain. The search for a hypothesis fitting the data is
reduced to parts of the search space with highest information gain,
which does not necessarily lead to optimal results.
The Algorithm:
As long as there are positive examples of the target predicate,
which are not covered by any of the rules, repeat:
- Add another rule. Initialize the new rule with the target
predicate as the conclusion and without any preconditions.
- As long as the new rule covers a negative example of the
target predicate, repeat:
- Search the list of possible extensions of the rule. This list consists of ...
- ... all positive or negative literals, containing a predicate found in the background
knowledge, not containing constants or functions but at least one of the variables
already part of the formula. By allowing the target predicate at this point, FOIL
is able to learn recursive definitions, but has to be kept from infinite recursion.
- ... tests for equality and for unequality of variables, already occuring in the rule.
- Add a conjuntion of the possible extension to the preconditions, that yields highest
Gain, where Gain is FOIL's variant of the information gain measure.
The result is a set of rules, defining the target predicate such, that every positive, but
none of the negative examples is covered by at least one rule. |