Within the data mining process, considerable time is spent on preprocessing the data. Practical experience has shown that preprocessing can take from 50% to 80% of the entire data mining process when traditional attribute-value learners are used. That is why preprocessing is the key issue in data analysis.
Experienced users can apply any learning system successfully to any application, since they prepare the data well. The representation of examples and the choice of a sample determine the applicability of learning methods. A chain of data transformations (learning steps or manual preprocessing) delivers the desired result. Experienced users remember prototypical successful transformation/learning chains.
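The idea of a reusable chain of data transformations can be illustrated with a minimal sketch. It is not MiningMart code: the step functions, the attribute names `age` and `income`, and the threshold of 40 are all hypothetical and chosen only for illustration. Each step takes a list of examples and returns a transformed list, so a chain is simply an ordered list of steps that can be stored and applied again to new data.

```python
from typing import Callable, Dict, List, Optional

# An example is a mapping from attribute name to value (None = missing).
Row = Dict[str, Optional[float]]
Step = Callable[[List[Row]], List[Row]]


def drop_missing(rows: List[Row]) -> List[Row]:
    """Keep only examples in which every attribute has a value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def add_age_group(rows: List[Row]) -> List[Row]:
    """Discretize the hypothetical 'age' attribute: 0.0 = under 40, 1.0 = 40 or older."""
    return [{**r, "age_group": 0.0 if r["age"] < 40 else 1.0} for r in rows]


def run_chain(rows: List[Row], chain: List[Step]) -> List[Row]:
    """Apply each transformation step in order, feeding its output to the next."""
    for step in chain:
        rows = step(rows)
    return rows


# A stored chain: order matters, since discretization assumes no missing ages.
CHAIN: List[Step] = [drop_missing, add_age_group]

raw = [
    {"age": 25.0, "income": 30000.0},
    {"age": None, "income": 52000.0},
    {"age": 61.0, "income": 41000.0},
]
prepared = run_chain(raw, CHAIN)
```

Because the chain is kept as data rather than as ad hoc manual steps, the same sequence can be rerun on a different sample, which is the sense in which a successful chain can be remembered and reused.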
Euler/2006a | Timm Euler. Data Mining mit MiningMart. In Programmieren unter Linux, No. 1, pages 56--60, 2006. |
Euler/2006b | Timm Euler. Modeling Preparation for Data Mining Processes. In Journal of Telecommunications and Information Technology, No. 4, pages 81--87, 2006. |
Euler/2005a | Timm Euler. Publishing Operational Models of Data Mining Case Studies. In Proceedings of the Workshop on Data Mining Case Studies at the 5th IEEE International Conference on Data Mining (ICDM), pages 99--106, Houston, Texas, USA, 2005. |
Euler/2005b | Timm Euler. An Adaptable Software Product Evaluation Metric. In Proceedings of the 9th IASTED International Conference on Software Engineering and Applications (SEA), Phoenix, Arizona, USA, 2005. |
Euler/2005c | Timm Euler. Churn Prediction in Telecommunications Using MiningMart. In Proceedings of the Workshop on Data Mining and Business (DMBiz) at the 9th European Conference on Principles and Practice in Knowledge Discovery in Databases (PKDD), Porto, Portugal, 2005. |
Euler/2005d | Timm Euler. Modelling Data Mining Processes on a Conceptual Level. In Proceedings of the 5th International Conference on Decision Support for Telecommunications and Information Society, Warsaw, Poland, 2005. |
Euler/Scholz/2004a | Euler, Timm and Scholz, Martin. Using Ontologies in a KDD Workbench. In Buitelaar, P. and Franke, J. and Grobelnik, M. and Paaß, G. and Svatek, V. (editors), Workshop on Knowledge Discovery and Ontologies at ECML/PKDD '04, pages 103--108, Pisa, Italy, 2004. |
Morik/Koepcke/2004a | Morik, Katharina and Köpcke, Hanna. Analysing Customer Churn in Insurance Data - A Case Study. In Jean-Francois Boulicaut and Floriana Esposito and Fosca Giannotti and Dino Pedreschi (editors), PKDD '04: Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, Vol. 3202, pages 325--336, New York, NY, USA, Springer, 2004. |
Morik/Scholz/2004a | Morik, Katharina and Scholz, Martin. The MiningMart Approach to Knowledge Discovery in Databases. In Ning Zhong and Jiming Liu (editors), Intelligent Technologies for Information Analysis, pages 47--65, Springer, 2004. |
Chudzian/etal/2003a | Cezary Chudzian and Janusz Granat and Wieslaw Traczyk. Call Center Case. No. D17.2b, IST Project MiningMart, IST-11993, 2003. |
Euler/etal/2003a | Euler, Timm and Morik, Katharina and Scholz, Martin. MiningMart: Sharing Successful KDD Processes. In Hotho, Andreas and Stumme, Gerd (editors), LLWA 2003 -- Tagungsband der GI-Workshop-Woche Lehren -- Lernen -- Wissen -- Adaptivität, pages 121--122, 2003. |
Granat/etal/2003a | Janusz Granat and Wieslaw Traczyk and Cezary Chudzian. Evaluation report by NIT. No. D17.3b, IST Project MiningMart, IST-11993, 2003. |
Morik/etal/2003a | Morik, Katharina and Scholz, Martin and Euler, Timm. MiningMart Final Report. No. D20.4, IST Project MiningMart, IST-11993, 2003. |
Morik/etal/2003b | Morik, Katharina and Scholz, Martin and Euler, Timm. Ext-MM Final Report. No. D20.5, IST Project MiningMart, IST-11993, 2003. |
Richeldi/Perucci/2003a | Marco Richeldi and Alessandro Perrucci. Mining data with the MiningMart system -- Evaluation Report. No. D17.0, IST Project MiningMart, IST-11993, 2003. |
Scholz/2003a | Martin Scholz. One Day Seminar -- Data Mining In Practice. No. D11.1, IST Project MiningMart, IST-11993, 2003. |
Berka/2002a | Petr Berka. Discretization and Grouping operators. No. D16.1, IST Project MiningMart, IST-11993, 2002. |
Berka/etal/2002a | Berka, Petr and Jirousek, Radim and Pudil, Pavel. Feature Selection Operators based on Information Theoretical Measures. No. D14.4, IST Project MiningMart, IST-11993, 2002. |
Bredeche/etal/2002a | Bredeche, N. and Saitta, L. and Zucker, J.D.. A Wrapper Approach for Robot Visual Perception. In ICML Workshop on Machine Learning in Computer Vision, pages 22--30, Sydney, Australia, 2002. |
Euler/2002b | Euler, Timm. Feature Selection with Support Vector Machines. No. D14.3, IST Project MiningMart, IST-11993, 2002. |
Euler/2002c | Timm Euler. Operator Specifications. No. TR12-02, IST Project MiningMart, IST-11993, 2002. |
Euler/2002d | Timm Euler. How to implement M4 operators. No. TR12-04, IST Project MiningMart, IST-11993, 2002. |
Haustein/2002a | Stefan Haustein. Internet Presentation of MiningMart Cases. No. D9, IST Project MiningMart, IST-11993, 2002. |
Kietz/2002a | Kietz, Jorg-Uwe. On the Learnability of Description Logic. In Proc of the 12th Int. Conf. on Inductive Logic Programming, 2002. |
Kietz/2002b | Jorg-Uwe Kietz. On the Learnability of Description Logic. kdlabs AG, 2002. |
Kietz/2002c | Jorg-Uwe Kietz. On the Learnability of Description Logic Programs. No. D13, IST Project MiningMart, IST-11993, 2002. |
Laverman/Rem/2002a | Bert Laverman and Olaf Rem. Description of the M4 Interface used by the HCI of WP12. No. D12.2, IST Project MiningMart, IST-11993, 2002. |
May/Geppert/2002a | Michael May and Detlef Geppert. Description of the HCI for Pre-Processing Chains. No. D12.3, IST Project MiningMart, IST-11993, 2002. |
Portinale/Saitta/2002a | Portinale, Luigi and Saitta, Lorenza. Feature Selection. No. D14.1, IST Project MiningMart, IST-11993, 2002. |
Rem/2002a | Olaf Rem. Case Base of Preprocessing. No. D10, IST Project MiningMart, IST-11993, 2002. |
Rem/Darwinkel/2002a | Olaf Rem and Erik Darwinkel. The Concept Editor. No. D12.4, IST Project MiningMart, IST-11993, 2002. |
Rem/Trautwein/2002a | Olaf Rem and Marten Trautwein. Best practices report. No. D11.3, IST Project MiningMart, IST-11993, 2002. |
Richeldi/Perrucci/2002a | Marco Richeldi and Alessandro Perrucci. Mining Mart Evaluation Report. No. D17.3, IST Project MiningMart, IST-11993, 2002. |
Richeldi/Perrucci/2002b | Marco Richeldi and Alessandro Perrucci. Churn Analysis Case Study. No. D17.2, IST Project MiningMart, IST-11993, 2002. |
Scholz/2002b | Martin Scholz. Representing Constraints, Conditions and Assertions in M4. No. TR18-01, IST Project MiningMart, IST-11993, 2002. |
Scholz/2002c | Martin Scholz. Using Constraints, Conditions and Assertions. No. TR18-02, IST Project MiningMart, IST-11993, 2002. |
Scholz/etal/2002a | Martin Scholz and Timm Euler and Lorenza Saitta. Applicability Constraints on Learning Operators. No. D18, IST Project MiningMart, IST-11993, 2002. |
Scholz/Euler/2002a | Martin Scholz and Timm Euler. Documentation of the MiningMart Meta Model (M4). No. TR12-05, IST Project MiningMart, IST-11993, 2002. |
Bathoorn/etal/2001a | Ronnie Bathoorn and Nico Brandt and Marc de Haas and Olaf Rem. Problem Modeling. No. D19, IST Project MiningMart, IST-11993, 2001. |
Brockhausen/etal/2001a | Peter Brockhausen and Marc de Haas and Jorg-Uwe Kietz and Arno Knobbe and Olaf Rem and Regina Zucker and Nico Brandt. Mining Multi-Relational Data. No. D15, IST Project MiningMart, IST-11993, 2001. |
Kietz/etal/2001a | Kietz, Jorg-Uwe and Vaduva, Anca and Zucker, Regina. MiningMart: Metadata-Driven Preprocessing. In Proceedings of the ECML/PKDD Workshop on Database Support for KDD, 2001. |
Knobbe/etal/2001a | Arno J. Knobbe and Marc de Haas and Arno Siebes. Propositionalisation and Aggregates. In Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 277--288, London, UK, Springer, 2001. |
Morik/etal/2001a | Anca Vaduva and Jorg-Uwe Kietz and Regina Zucker and Klaus R. Dittrich. M4 -- The MiningMart Meta Model. No. ifi-2001.02, Institute for Computer Science, Univ. Zurich, 2001. |
Morik/etal/2001b | Morik, Katharina and Botta, Marco and Dittrich, Klaus R. and Kietz, Jorg-Uwe and Portinale, Luigi and Vaduva, Anca and Zucker, Regina. M4 -- The MiningMart Meta Model. No. D8/9, IST Project MiningMart, IST-11993, 2001. |
Vaduva/etal/2001a | Anca Vaduva and Jorg-Uwe Kietz and Regina Zucker and Klaus R. Dittrich. M4 -- The MiningMart Meta Model. No. ifi-2001.02, Institute for Computer Science, Univ. Zurich, 2001. |
Vaduva/etal/2001b | Anca Vaduva and Jorg-Uwe Kietz and Regina Zucker and Klaus R. Dittrich. M4 a Metamodel for Data Preprocessing. In Proc. of the ACM Fourth International Workshop on Data Warehousing and OLAP (DOLAP 2001), 2001. |
Zuecker/2001a | Regina Zucker. Description of the Metadata-Compiler using the M4-Relational Metadata-Schema. No. D7b, IST Project MiningMart, IST-11993, 2001. |
Zuecker/etal/2001c | Regina Zucker. Description of the M4-Relational Metadata-Schema within the Database. No. D7a, IST Project MiningMart, IST-11993, 2001. |
Kietz/etal/2000a | Kietz, Jorg-Uwe and Vaduva, Anca and Zucker, Regina. Mining Mart: Combining Case-Based-Reasoning and Multi-Strategy Learning into a Framework to reuse KDD-Application. In R.S. Michalski and P. Brazdil (editors), Proceedings of the Fifth International Workshop on Multistrategy Learning (MSL2000), Guimarães, Portugal, 2000. |
Kietz/etal/2000b | Kietz, Jorg-Uwe and Fiammengo, Anna and Beccari, Giuseppe and Zucker, Regina. Data Sets, Meta-data and Preprocessing Operators at Swiss Life and CSELT. No. D6.2, IST Project MiningMart, IST-11993, 2000. |
Knobbe/etal/2000b | Arno Knobbe and Adriaan Schipper and Peter Brockhausen. Domain Knowledge and Data Mining Process Decisions. No. D5, IST Project MiningMart, IST-11993, 2000. |
Morik/2000a | Morik, Katharina. The Representation Race - Preprocessing for Handling Time Phenomena. In Ramon López de Mántaras and Enric Plaza (editors), Proceedings of the 11th European Conference on Machine Learning (ECML), Vol. 1810, pages 4--19, Berlin, Heidelberg, New York, Springer, 2000. |
Morik/Liedtke/2000a | Morik, Katharina and Liedtke, Harald. Learning about Time. No. D3, IST Project MiningMart, IST-11993, 2000. |
Saitta/etal/2000a | Saitta, Lorenza and Kietz, Joerg-Uwe and Beccari, Giuseppe. Specification of Pre-Processing Operators Requirements. No. D1, IST Project MiningMart, IST-11993, 2000. |
Saitta/etal/2000b | Saitta, Lorenza and Botta, Marco and Beccari, Giuseppe and Klinkenberg, Ralf. Studies in Parameter Setting. No. D4.2, IST Project MiningMart, IST-11993, 2000. |
Saitta/etal/2000c | Lorenza Saitta, Giuseppe Beccari and Alessandro Serra. Informed Parameter Setting. No. D4.1, IST Project MiningMart, IST-11993, 2000. |
Vetterli/etal/2000a | Thomas Vetterli and Anca Vaduva and Martin Staudt. Metadata Standards for Data Warehousing: Open Information Model vs. Common Warehouse Metamodel. In ACM SigMod Record, Vol. 29, No. 3, 2000. |
Wettschereck/Mueller/2000a | Wettschereck, Dietrich and Mueller, Stefan. MiningMart Deliverable D2.1. No. D2.1, IST Project MiningMart, IST-11993, 2000. |
Zuecker/Kietz/2000a | Zucker, Regina and Kietz, Jorg-Uwe. How to preprocess large databases. In Data Mining, Decision Support, Meta-learning and ILP: Forum for Practical Problem Presentation and Prospective Solutions, Lyon, France, 2000. |