Star

Name Star
Description

The Star method is suitable for the task of concept learning as well as for conceptual clustering. It can deal with different kinds of attribute-value representations of examples. The idea is to find all most general descriptions, that separate an example set from other example sets. Then, according to a certain preference criterion, the best of all descriptions is chosen. The concepts or clusters generated that way may be specialized afterwards, to make them disjoint.

Let P denote a set of instances to be covered and N a set of instances that should not be covered.
A partial star is a conjunction of conditions testing attribute-values.
Such a partial star is called consistent with P and N, if it covers every p ∈ P, but no n ∈ N.

STAR(P|N), the star of P against N, is defined as the disjunction of all maximally general partial stars, consistent with P and N.

As a quality measure for stars and partial stars, a so called "Lexical Evaluation Function" (LEF) is used.

For efficiency reasons a restriction of STAR(P|N), the bounded star STAR(P|N,s) may be of interest:
If |STAR(P|N)| > s, then STAR(P|N,s) contains just the s best partial stars of STAR(P|N), according to the LEF.
Otherwise STAR(P|N,s) = STAR(P|N).

For algorithms based on the star-method, it is often sufficient to build stars of single instances of the example language against a set of examples.

Let

  • p be an instance to be covered,
  • N be a set of instances that should not be covered,
  • s be a parameter for bounding complexity.

How to generate STAR({p}|N,s):

  • Generate a list of partial stars with exactly one condition, taken from p. These partial stars may still cover an n ∈ N, and therefore be inconsistent.
  • If there are more than s partial stars in the list, remove all, except from the best s ones, according to the LEF.
  • As long as there are inconsistent partial stars, that can be extended by adding a conjunction, such that the result has not been generated before, do:
    • Extend all inconsistent partial stars by one condition and add them to the list.
    • If the list contains more than s elements, just keep the best s ones.
  • Remove all inconsistent partial stars from the list.
  • Output the list, ordered by the LEF.

Specialization AQn
Example Languages Attribute-Value
Dm Step Clustering (Unsupervised Learning)
Concept Learning
Method Type Method