Statistical Approaches used in Machine Learning (English)

Description: September 24, 2004

Machine learning represents a new deal for statistical inference now that powerful computational tools are available to scientists. The objects we want to infer are no longer simple parameters but entire functions. The data we process are not simple independent observations of a phenomenon; rather, they represent complex links between the different variables characterizing it. The success of the inference depends heavily on how astutely the algorithms process these data in relation to the inner structure we want to discover. This tutorial provides a statistical framework for perceiving, discussing and solving the key inference problems in which a large family of machine learning instances is rooted.

The paradigmatic context is a string of data (possibly of infinite length) that we partition into a prefix we assume to be known at present (and therefore call a sample) and a suffix of unknown future data (which we call a population). All these data share the feature of being observations of the same phenomenon, which is exactly the object of our inference. The basic inference tool is a twisting between properties we establish on the sample and random properties we are wondering about in the population, such as the probability of matching a specific digit.

Starting from the elementary problem of estimating the parameter of a Bernoulli variable, we will revisit two basic inference tools: the computation of confidence intervals and the search for point estimators with nice properties. Then we will go on to learning problems: while the theoretical tools remain unchanged, the sample properties to be twisted onto the population must be wisely devised and smartly computed. For Boolean variables, we restate the foundations of PAC learning theory and address the usual related issues, such as: i) the curse of dimensionality, ii) corrupted examples, and iii) special learning devices such as Support Vector Machines. Finally, we will touch on a few general statistical statements that can be made about neural network learning algorithms.
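
As a rough illustration of the elementary problem mentioned above, the following sketch computes a point estimate and a confidence interval for the parameter of a Bernoulli variable from a sample. It uses the standard Wilson score interval, which is a common textbook choice swapped in here and is not the twisting-based construction the tutorial develops; the function name `wilson_interval` and its parameters are assumptions made for this example only.

```python
import math
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a Bernoulli parameter p (illustrative sketch).

    successes: number of ones observed in the sample
    n: sample size
    confidence: nominal coverage level, e.g. 0.95
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")

    # Two-sided standard normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)

    p_hat = successes / n  # point estimate: the sample frequency
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom

    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    low, high = wilson_interval(successes=50, n=100)
    print(f"point estimate: {50 / 100:.3f}")
    print(f"95% interval:   [{low:.4f}, {high:.4f}]")
```

The interval narrows as the sample grows, which mirrors the idea that properties established on the sample constrain what can be said about the population.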
Lecturers: Apolloni, B.; Malchiodi, D.
Language: English
URL:
Material: T6.pdf (2525 KB)
Date: 2004
Address: ECML/PKDD2004