KDD process

Description:

The non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data - Fayyad, Piatetsky-Shapiro, Smyth (1996)

  • non-trivial process (a multi-step process)
  • valid (Justified patterns/models)
  • novel (Previously unknown)
  • useful (Can be used)
  • understandable (by human and machine)
KDD is inherently interactive and iterative. Data mining is a step in the KDD process consisting of methods that produce useful patterns or models from the data, under some acceptable computational efficiency limitations.
  1. Understand the domain and define problems
  2. Collect and preprocess data
  3. Data mining: extract patterns/models
  4. Interpret and evaluate discovered knowledge
  5. Put the results to practical use
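
The steps above can be sketched as a small loop. This is only an illustrative sketch, not a reference implementation: the toy market-basket data, the function names (collect, preprocess, mine, evaluate, run_kdd), and the thresholds are all assumptions made for the example. The mining stage simply counts co-occurring item pairs, and the loop relaxes the support threshold until the evaluation stage accepts a result, which is one way to picture the iterative nature of KDD.

```python
from collections import Counter
from itertools import combinations

# Problem definition (assumed for this sketch): which item pairs are bought together?

def collect():
    # Collect: hypothetical raw market-basket records, deliberately messy
    return [
        ["Bread", "milk"],
        ["bread", "Milk", "eggs"],
        [],
        ["milk", "eggs"],
        ["bread", "milk"],
    ]

def preprocess(raw):
    # Preprocess: drop empty records, normalize case, deduplicate items
    return [frozenset(item.lower() for item in rec) for rec in raw if rec]

def mine(records, min_support):
    # Data mining: count co-occurring item pairs and keep the frequent ones
    counts = Counter()
    for rec in records:
        counts.update(combinations(sorted(rec), 2))
    n = len(records)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

def evaluate(patterns, max_patterns):
    # Interpret and evaluate: accept a non-empty result small enough to read
    return 0 < len(patterns) <= max_patterns

def run_kdd(min_support=0.9, step=0.2, max_patterns=2):
    records = preprocess(collect())
    # Iterate: relax the support threshold until evaluation accepts the patterns
    while min_support > 0:
        patterns = mine(records, min_support)
        if evaluate(patterns, max_patterns):
            return min_support, patterns  # ready to be put to practical use
        min_support = round(min_support - step, 2)
    return None, {}

if __name__ == "__main__":
    support, patterns = run_kdd()
    print(support, patterns)
```

In this sketch the first pass (support 0.9) finds nothing, so the loop returns to the mining stage with a lower threshold. In a real project the feedback could just as well lead back to preprocessing or to the problem definition.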

Publications: Chapman/etal/2000a: CRISP-DM 1.0