|
Concept Learning
Name |
Concept Learning |
Description |
The task of concept learning is to acquire general concepts from specific training examples. Training examples
are instances that either belong to a particular concept, and are therefore positive examples, or do not belong
to the concept, and are therefore negative examples.
Given
- a description language LE for examples,
- a hypothesis language LH for possible hypotheses (i.e. possible learning results),
- a set of positive examples (instances of a certain concept),
- a set of negative examples (instances that do not belong to the concept),
- a predicate "covers", which indicates whether a given concept/hypothesis covers a given example,
- an acceptance criterion (measure) which evaluates hypotheses,
Find hypotheses in the language LH fulfilling the acceptance criterion.
The partial order of LH can be used to prune the search.
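A minimal sketch of this generic setting is given below, under the assumption that examples in LE are attribute-value tuples and hypotheses in LH are conjunctions in which "?" matches any value; the attribute names and toy data are illustrative, not part of the task definition.

# Minimal sketch of the generic concept-learning setting, assuming LE consists
# of attribute-value tuples and LH of conjunctions where '?' matches any value
# (both language choices and the toy data are illustrative assumptions).

def covers(hypothesis, example):
    """The 'covers' predicate: does the hypothesis cover the example?"""
    return all(h == '?' or h == e for h, e in zip(hypothesis, example))

def more_general_or_equal(h1, h2):
    """Partial order on LH used to prune the search:
    h1 is at least as general as h2."""
    return all(a == '?' or a == b for a, b in zip(h1, h2))

def acceptable(hypothesis, positives, negatives):
    """A simple acceptance criterion: cover all positive and no negative examples."""
    return (all(covers(hypothesis, e) for e in positives)
            and not any(covers(hypothesis, e) for e in negatives))

# Toy examples: (sky, temperature, humidity)
positives = [('sunny', 'warm', 'normal'), ('sunny', 'warm', 'high')]
negatives = [('rainy', 'cold', 'high')]

h = ('sunny', 'warm', '?')
print(covers(h, positives[0]))                         # True
print(acceptable(h, positives, negatives))             # True
print(more_general_or_equal(('sunny', '?', '?'), h))   # True

Here the more-general-than relation is the partial order on LH that a learner can exploit to prune the search, for example when searching version spaces or proceeding general-to-specific.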
|
Definition |
One can distinguish two more specific concept learning tasks, each related to a different theory.
Function Approximation may be regarded as a third specialization.
1. The concept learning task in the predictive setting of
Inductive Logic Programming is: Given
- positive (E+) and negative (E-) examples E in an example language
LE,
- background knowledge B in a language LB,
with B ∪ E ⊭ ⊥ (B and the examples are consistent) and B ⊭ E+ (B alone does not already imply the positive examples).
The task is to find a hypothesis H in LH,
- which is consistent with B and E: B, H, E ⊭ ⊥
- that together with B implies all of the positive examples in E:
B, H ⊨ E+
- that together with B does not imply any of the negative examples in E:
∀e ∈ E- : B, H ⊭ e
Usually LB and LE are chosen as sets of ground facts, LH is a restricted first-order logic, and
logical implication is modeled by an inference operator.
For selecting from a set of correct and complete hypotheses, there is often a special preference operator.
(A small sketch of this setting is given after this definition.)
2. The scenario of Concept Learning in the PAC-paradigm: A
learning algorithm
- for concepts c of a representation class C
- reads a set of examples X
- from an example language LE
- drawn with respect to an unknown, "correct" probability distribution D over LE.
- The given examples are classified,
that is, for every example e in X it is stated whether e belongs to the concept c or not.
Additionally, the learning algorithm receives two parameters, δ and ε, which weaken the
requirements on possible learning results.
- ε defines the maximum error up to which a hypothesis is regarded as classifying a given
concept approximately correctly.
The error of a hypothesis h with respect to a concept c is the probability that an example drawn from LE,
according to the given probability distribution D, is classified incorrectly by h.
- δ is an upper bound on the probability that the algorithm fails to learn the given concept
approximately correctly.
A concept class C is probably approximately correct (PAC-) learnable by a hypothesis class H from
a set of (positively and negatively classified) examples X if there is an algorithm L(δ, ε) that
can output a hypothesis h from H with an error of at most ε,
- for every fixed probability distribution D over LE,
- for all concepts c in C,
- in time bounded by a polynomial in 1/ε, 1/δ and |c|,
- with a probability of at least 1 - δ,
where
- |c| is the (variously defined) representation length, for example
based on the length of the most efficient representation of c, and
- 0 < δ < 1/2 and 0 < ε < 1/2.
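As announced under 1., the following is a small sketch of the ILP conditions for the common special case in which B and E are sets of ground facts and the hypothesis is a single function-free Horn clause; the predicates and the toy family facts are illustrative assumptions.

# Illustrative sketch of the ILP conditions for the special case where B and E
# are ground facts and the hypothesis is a single function-free Horn clause.
# The predicates and the toy family data are assumptions, not part of the task.

B = {('parent', 'ann', 'bob'), ('parent', 'bob', 'carl'),
     ('female', 'ann'), ('male', 'bob')}

# Hypothesis H: grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
def derives(background, example):
    """Does B together with H imply the ground atom 'example'?"""
    pred, x, z = example
    if pred != 'grandparent':
        return False
    intermediates = {f[2] for f in background if f[0] == 'parent' and f[1] == x}
    return any(('parent', y, z) in background for y in intermediates)

E_pos = [('grandparent', 'ann', 'carl')]
E_neg = [('grandparent', 'bob', 'ann')]

complete   = all(derives(B, e) for e in E_pos)       # B, H |= E+
consistent = not any(derives(B, e) for e in E_neg)   # B, H does not imply any e in E-
print(complete, consistent)                          # True True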
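The following sketch illustrates the PAC scenario for one classical, provably PAC-learnable class, monotone conjunctions over n Boolean attributes. It uses the standard sample-size bound m ≥ (1/ε)(ln|H| + ln(1/δ)) for a consistent learner over a finite hypothesis class; the chosen target concept and the uniform distribution D are illustrative assumptions.

# Sketch of PAC learning for monotone conjunctions over n Boolean attributes,
# using the standard sample-size bound for a consistent learner over a finite
# hypothesis class; the target concept and the distribution D are assumptions.

import math, random

def learn_conjunction(examples):
    """Elimination algorithm: start with all attributes and drop every
    attribute that is 0 in some positive example."""
    n = len(examples[0][0])
    kept = set(range(n))
    for x, label in examples:
        if label:
            kept -= {i for i in kept if x[i] == 0}
    return kept  # hypothesis: conjunction of x_i for i in kept

def sample_size(n, eps, delta):
    # |H| = 2^n monotone conjunctions; m >= (1/eps) * (ln|H| + ln(1/delta)).
    return math.ceil((1.0 / eps) * (n * math.log(2) + math.log(1.0 / delta)))

# Unknown target concept c: x1 AND x3 (indices 0 and 2); an illustrative assumption.
target = {0, 2}
n, eps, delta = 10, 0.1, 0.05
m = sample_size(n, eps, delta)

# Draw m classified examples from a fixed distribution D (here: uniform).
def draw():
    x = tuple(random.randint(0, 1) for _ in range(n))
    return x, all(x[i] == 1 for i in target)

examples = [draw() for _ in range(m)]
h = learn_conjunction(examples)
print(m, h)  # with probability >= 1 - delta, the error of h is at most eps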
|
Methods |
Boosting
Bottom-Up Induction of Horn Clauses
GOLEM
Incremental decision tree learning
Lazy Learning
Logistic Regression
Naive Bayes
PROGOL
Searching Version Spaces
Star
Support Vector Machine (SVM)
Top-Down Induction of Decision Trees (TDIDT)
Top-Down Induction of Horn Clauses
|
Theories |
Bayesian Learning
Inductive Logic Programming (ILP)
Learning as Search
Probably Approximately Correct (PAC-) Learning
Statistical Learning
|