|
Symbolic Data Analysis (English)
Description: |
September 20, 2004
In conjunction with Mining Complex Data Structures workshop.
There is a growing need to extend standard exploratory, statistical and graphical data analysis methods to more complex data that go beyond the classical framework, in order to describe complex units or concepts, to obtain more accurate information and to summarise extensive data sets contained in huge databases. This is the case for data concerning more or less homogeneous classes or groups of individuals - second-order objects or macro-data - instead of single individuals - first-order objects or micro-data. The extension of classical data analysis techniques to the analysis of second-order objects is one of the main goals of a novel research field named "Symbolic Data Analysis".
Symbolic Data Analysis makes it possible to define concepts by a query on a database, to aggregate the initial data in order to describe these concepts (as symbolic data), and then to apply analysis methods that extract knowledge from the set of modelled concepts. Symbolic data extend the classical tabular model by allowing multiple, possibly weighted, values for each descriptive attribute, which makes it possible to represent the variability and/or uncertainty present in the data. Symbolic Data Analysis methods for analysing symbolic data tables include univariate descriptive methods, visualisation methods, clustering, decision trees, discrimination, regression, factorial analysis techniques and conceptual lattices.
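The aggregation step described above can be illustrated with a minimal Python sketch. It is not taken from the SODAS software; the records, the variable names and the `aggregate` helper are all hypothetical and chosen only for illustration. Each concept (here, a town) is described by an interval for a numeric variable and by weighted categories for a categorical variable.

```python
from collections import Counter, defaultdict

# Hypothetical micro-data: one record per individual (first-order objects).
individuals = [
    {"town": "A", "age": 23, "job": "clerk"},
    {"town": "A", "age": 41, "job": "teacher"},
    {"town": "A", "age": 35, "job": "clerk"},
    {"town": "B", "age": 52, "job": "farmer"},
    {"town": "B", "age": 60, "job": "farmer"},
]


def aggregate(records, concept_key, numeric_vars, categorical_vars):
    """Build a symbolic data table: one symbolic description per concept.

    Hypothetical helper, not part of any real package.
    """
    # Group the individuals by the concept they belong to (the "query" step).
    groups = defaultdict(list)
    for record in records:
        groups[record[concept_key]].append(record)

    table = {}
    for concept, members in groups.items():
        description = {}
        # Numeric variables become interval values (min, max).
        for var in numeric_vars:
            values = [m[var] for m in members]
            description[var] = (min(values), max(values))
        # Categorical variables become weighted multi-valued descriptions:
        # each observed category with its relative frequency in the group.
        for var in categorical_vars:
            counts = Counter(m[var] for m in members)
            description[var] = {cat: n / len(members) for cat, n in counts.items()}
        table[concept] = description
    return table


symbolic_table = aggregate(individuals, "town", ["age"], ["job"])
for concept, description in symbolic_table.items():
    print(concept, description)
```

Each row of `symbolic_table` is a second-order object: the interval keeps the variability of ages within the town, and the weights keep the distribution of jobs, while no individual record is exposed.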
Symbolic data occur in many situations, for instance in summarising huge sets of data or in describing the underlying concepts of a database - a town, a socio-demographic group, a scenario of accidents. Symbolic Data Analysis also finds an important application field in official statistics: since National Statistical Institutes (NSIs) are prohibited by law from releasing individual responses to any other government agency or to any individual or business, data are aggregated for reasons of privacy before being distributed to external agencies and institutes. Symbolic Data Analysis provides useful tools to analyse such aggregated data.
Symbolic Data Analysis helps to solve problems that arise in data analysis, in particular:
- Large Database Treatment
- Confidentiality
- Missing Data
- Metadata Modelling
- Quality Control on Statistical Production
- Accurate Data Interpretation
- Use of Confidence Intervals
- Joining of Independent Surveys
- Exploitation of Survey Databases
Symbolic Data Analysis advanced considerably through the European projects Symbolic Official Data Analysis System (SODAS) and Analysis System for Symbolic Official Data (ASSO). As a result of these projects, the SODAS software package was developed. |
Lecturer: |
Diday, Edwin
Marcelo, Carlos
|
Language: |
English |
URL: |
|
Material: |
T2.pdf (15954 KB) |
Date: |
2004
|
|
|