Probably Approximately Correct (PAC-) Learning

Publication Blumer/etal/90a: Learnability and the Vapnik–Chervonenkis Dimension
Haussler/99a: Convolution Kernels on Discrete Structures
Mitchell/97b: Machine Learning
Natarajan/91a: Machine Learning. A Theoretical Approach
Name Probably Approximately Correct (PAC-) Learning
Description

PAC theory studies the learnability of

  • concept classes C
  • over instance spaces LE
  • by certain hypothesis spaces H,
  • under an unknown but fixed probability distribution D over LE, which models the probability of each instance in LE occurring.

All concepts in C and all hypotheses in H are subsets of LE.
An example e ∈ LE is classified according to a concept c by stating whether e ∈ c or not.
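This set-based view of classification can be sketched in a few lines. The instance space, the concept, and the function name below are illustrative assumptions, not part of the formal definition:

```python
# Hypothetical toy setup: the instance space LE is a small finite set,
# and a concept c is simply a subset of LE.
LE = set(range(10))        # instance space (illustrative)
c = {0, 2, 4, 6, 8}        # target concept, here "even numbers"

def classify(e, concept):
    """Classify an example e according to a concept: True iff e ∈ concept."""
    return e in concept

# A classified example is a pair (e, label), as read by a learning algorithm.
labeled_examples = [(e, classify(e, c)) for e in sorted(LE)]
```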

Probably Approximately Correct learnability means that there is an algorithm A that is able to
  • select an ε-good hypothesis from H
  • with a probability of at least 1 - δ
  • for each δ, ε with 0 < δ < 1/2 and 0 < ε < 1/2.
To find such a hypothesis, the algorithm is allowed to read a set X of classified examples, drawn with respect to D.
A hypothesis h is called ε-good if the probability errorD(h, c) that an instance e ∈ LE is drawn and classified wrongly by h is at most ε:
errorD(h, c) := D((c \ h) ∪ (h \ c)) ≤ ε
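For a finite instance space, the error of a hypothesis is just the probability mass of the symmetric difference between the target concept and the hypothesis. A minimal sketch, where LE, D, c, and h are illustrative choices:

```python
# error_D(h, c): probability that an instance drawn from the fixed
# distribution D is classified wrongly by h, i.e. the mass of
# (c \ h) ∪ (h \ c).
LE = [0, 1, 2, 3]
D = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}   # fixed distribution over LE
c = {0, 1}                              # target concept
h = {0, 2}                              # hypothesis

def error(D, h, c):
    """Sum the probability of all instances in the symmetric difference."""
    return sum(p for x, p in D.items() if x in c.symmetric_difference(h))
```

Here h disagrees with c exactly on the instances 1 and 2, so error(D, h, c) = 0.3 + 0.2 = 0.5, and h is ε-good only for ε ≥ 0.5.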

Two main questions can be distinguished in PAC-theory:
  1. How many examples are needed to learn Probably Approximately Correct?
    Is there a polynomial p for a specific concept class C such that there is an algorithm A which can find an ε-good hypothesis for each concept c ∈ C with a probability of at least 1 - δ, for each 0 < δ < 1/2 and 0 < ε < 1/2, after having read at most p(1/δ, 1/ε) classified examples from LE, drawn with respect to D?
  2. How complex is the learning task?
    If point 1 is true, is there an algorithm A′ and a polynomial p′ such that A′ can find an ε-good hypothesis in time p′(1/δ, 1/ε, |c|), where |c| denotes the representation length of the actual concept c to be learned?
If question 2 can be answered yes, so can question 1, because an algorithm running in time p′(1/δ, 1/ε, |c|) can read at most that many examples, so the number of examples read is bounded by a polynomial in 1/δ, 1/ε, and |c|.
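A concrete answer to question 1 exists for finite hypothesis spaces: a consistent learner needs at most (1/ε)(ln|H| + ln(1/δ)) examples (see Mitchell, Machine Learning, ch. 7), which is polynomial in 1/ε and 1/δ. A short sketch of this bound; the function name and the example values are illustrative:

```python
import math

# Sample-size bound for a consistent learner over a finite hypothesis
# space H: m >= (1/eps) * (ln|H| + ln(1/delta)) classified examples
# suffice to find an eps-good hypothesis with probability >= 1 - delta.
def sample_bound(h_size, eps, delta):
    """Smallest integer m satisfying the finite-|H| PAC bound."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / eps)

# e.g. |H| = 2**10 hypotheses, eps = delta = 0.05:
m = sample_bound(2 ** 10, 0.05, 0.05)
```

Note how the bound grows only logarithmically in |H| but linearly in 1/ε, matching the polynomial p(1/δ, 1/ε) asked for in question 1.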
Dm Step Concept Learning
Function Approximation