Description |
The Star method is suitable for the task of concept learning
as well as for conceptual clustering.
It can deal with different kinds of attribute-value representations
of examples.
The idea is to find all most general descriptions, that separate an
example set from other example sets. Then, according to a certain
preference criterion, the best of all descriptions is chosen.
The concepts or clusters generated that way may be specialized
afterwards, to make them disjoint.
Let P denote a set of instances to be covered and N a
set of instances that should not be covered.
A partial star is a conjunction of conditions testing
attribute-values.
Such a partial star is called consistent with P and N, if
it covers every p ∈ P, but no n ∈ N.
STAR(P|N), the star of P against N, is defined as the
disjunction of all maximally general partial stars, consistent with P
and N.
As a quality measure for stars and partial stars, a so called
"Lexical Evaluation Function" (LEF) is used.
For efficiency reasons a restriction of STAR(P|N), the bounded star
STAR(P|N,s)
may be of interest:
If |STAR(P|N)| > s, then STAR(P|N,s) contains just the s best partial
stars of STAR(P|N), according to the LEF.
Otherwise STAR(P|N,s) = STAR(P|N).
For algorithms based on the star-method, it is often sufficient to
build stars of single instances of the example language against a
set of examples.
Let
- p be an instance to be covered,
- N be a set of instances that should not be covered,
- s be a parameter for bounding complexity.
How to generate STAR({p}|N,s):
- Generate a list of partial stars with exactly one
condition, taken from p. These partial stars may still
cover an n ∈ N, and therefore be inconsistent.
- If there are more than s partial stars in the list,
remove all, except from the best s ones, according to the
LEF.
- As long as there are inconsistent partial stars, that
can be extended by adding a conjunction, such that the
result has not been generated before, do:
- Extend all inconsistent partial stars by one
condition and add them to the list.
- If the list contains more than s elements, just
keep the best s ones.
- Remove all inconsistent partial stars from the list.
- Output the list, ordered by the LEF.
|