News from the Artificial Intelligence Group

The chair of artificial intelligence works across the wide field of machine learning. In particular, the chair concentrates on the development and implementation of learning algorithms that solve challenging problems.

LS8 secretariat closed over the turn of the year

The LS8 secretariat will not be staffed from 24.12.2022 to 08.01.2023 inclusive. During this time, the TU Dortmund University will be completely closed.

With best wishes for a peaceful Christmas season and a confident, healthy start to 2023! The LS8 team

Prof. Schubert is among the Stanford top 100,000 researchers for 2021

The Stanford statistician John Ioannidis publishes a list of the 100,000 most influential scientists (science-wide).

In the single-year ranking of the current version (based on the data for the year 2021, published November 2022), Prof. Schubert ranks 92735.

TU Dortmund has 17 members in the top 100,000, led by Erman Tekkaya (mechanical engineering, #25235) and Oliver Kayser (biochemistry, #38430). Our rector Manfred Bayer (physics, #87172) is included, as is Boris Otto (industrial information management, #78427), another professor co-opted in Computer Science. Günter Rudolph (#105622) narrowly misses the top 100,000.

The ranking is based on Elsevier's Scopus data and the composite citation index (c-score) developed by the Stanford statistician John Ioannidis. The index combines scaled citation numbers (without self-citations), h-index and hm-index, but also uses the author order. Nevertheless, any such ranking is based on design choices and data that may be biased; for example, the Elsevier Scopus data tend to be journal-oriented and do not value computer science conferences as much.

The similar "career-long" ranking contains 19 members of the TU Dortmund, including the three computer scientists Günter Rudolph (#43603), Ingo Wegener (#67497) and Bernhard Steffen (#75346).

Amal Saadallah defends her dissertation at LS8

Amal Saadallah next to Katharina Morik, Thomas Liebig and Barbara Hammer

Amal Saadallah has defended her dissertation Explainable Adaptation of Time Series Forecasting with great praise (magna cum laude). In her work, she focuses on the online management of many models for time series forecasting, the combination of Machine Learning methods and process simulation systems, and explainable model-based quality prediction in Industry 4.0.

The members of the PhD committee were Prof. Dr. Katharina Morik (supervisor and first reviewer), Prof. Dr. Barbara Hammer (second reviewer, Bielefeld University), Prof. Dr. Petra Wiederkehr (chair) and Jun.-Prof. Dr. Thomas Liebig (faculty representative). Amal Saadallah is a research associate at LS8 and a member of the Collaborative Research Center 876 (project B3).

Lukas Pfahler defends his dissertation at LS8

Lukas Pfahler next to Katharina Morik and Andreas Hotho

Lukas Pfahler has defended his dissertation Some Representation Learning Tasks and the Inspection of Their Models with distinction (summa cum laude). In his work, he focuses on representation learning with unsupervised methods. For instance, he has investigated the use of embedding learning with graph convolutional neural networks for the search and retrieval of related mathematical expressions. Furthermore, he has worked on novel methods for model inspection to increase trust in decisions.

The members of the PhD committee were Prof. Dr. Katharina Morik (supervisor and first reviewer), Prof. Dr. Andreas Hotho (second reviewer, University of Würzburg), Prof. Dr. Jakob Rehof (chair) and Priv.-Doz. Dr. habil Frank Weichert (faculty representative). Lukas Pfahler is a research associate at LS8 and member of the Collaborative Research Center 876 (project A1).

Mirko Bunse defends his dissertation at LS8

Mirko Bunse next to Katharina Morik, Thomas Liebig and Johannes Fischer

Mirko Bunse has defended his dissertation Machine Learning for Acquiring Knowledge in Astro-Particle Physics with great praise (magna cum laude). In his work, he studied the manifold applications of Machine Learning algorithms in Astroparticle Physics. In particular, he focused on the smart and resource-aware control of simulations through active class selection and the domain-specific aggregation of predictions in terms of quantification and unfolding.

The members of the PhD committee were Prof. Dr. Katharina Morik (supervisor and first reviewer), Dr. Fabrizio Sebastiani (second reviewer, Consiglio Nazionale delle Ricerche, Pisa), Prof. Dr. Johannes Fischer (chair) and Jun.-Prof. Dr. Thomas Liebig (faculty representative). Mirko Bunse is a research associate at LS8, a member of the Collaborative Research Center 876 (project C3) and coordinator of the application field astroparticle physics at the Lamarr Institute for Machine Learning and Artificial Intelligence (formerly the Competence Center ML2R).

Katharina Morik organizes "Smart City" session at the third trilateral symposium on Artificial Intelligence of Japan, Germany and France in Tokyo

The 3rd trilateral AI symposium of Japan, Germany and France took place on October 27, 2022 in Tokyo. Katharina Morik organized the session “Smart Cities”, in which two speakers from each country presented their work. She presented the EU projects INSIGHT and VAVEL (coordinator: Dimitrios Gunopoulos), in which she participated together with Thomas Liebig. The concluding discussion stressed the acquisition of mobility data and the interaction of all stakeholders as most important. How should mobility companies, governmental institutions, IT companies and the users cooperate?

Best Student Paper Award: Faster Silhouette Clustering

At the SISAP 2022 conference at the University of Bologna, Lars Lenssen won the "best student paper" award for the contribution "Lars Lenssen, Erich Schubert. Clustering by Direct Optimization of the Medoid Silhouette. In: Similarity Search and Applications. SISAP 2022. https://doi.org/10.1007/978-3-031-17849-8_15".

The publisher Springer donates a monetary prize for the awards, and the best contributions are invited to submit an extended version to a special issue of the A* journal "Information Systems".

In this paper, we introduce a new clustering method that directly optimizes the Medoid Silhouette, a variant of the popular Silhouette measure of clustering quality. As the new variant is O(k²) times faster than previous approaches, we can cluster data sets larger by orders of magnitude, where large values of k are desirable. The implementation is available in the Rust "kmedoids" crate and the Python module "kmedoids"; the code is open source on GitHub.
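
To make the objective concrete, here is a minimal sketch of how a Medoid Silhouette score could be evaluated for a fixed set of medoids. It assumes the formulation in which each point contributes 1 - d1/d2, where d1 and d2 are its distances to the nearest and second-nearest medoid. The function name `medoid_silhouette`, the convention of scoring a point as 1 when d2 is zero, and the toy data are illustrative assumptions. The sketch only evaluates the measure; it does not reproduce the optimization algorithm of the paper or the API of the "kmedoids" packages.

```python
def medoid_silhouette(dist, medoids):
    """Average of 1 - d1/d2 over all points.

    dist    -- full pairwise dissimilarity matrix (list of lists)
    medoids -- indices of the chosen medoids (at least two)
    """
    if len(medoids) < 2:
        raise ValueError("need at least two medoids")
    total = 0.0
    for row in dist:
        # Distances from this point to all medoids, smallest first.
        nearest = sorted(row[m] for m in medoids)
        d1, d2 = nearest[0], nearest[1]
        # Assumed convention: if even the second-nearest medoid is at
        # distance 0, the point is scored as 1.
        total += 1.0 if d2 == 0 else 1.0 - d1 / d2
    return total / len(dist)


# Toy data: four points on a line forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]

good = medoid_silhouette(dist, [0, 2])  # one medoid per group
poor = medoid_silhouette(dist, [0, 1])  # both medoids in the same group
print(good, poor)
```

Higher values indicate that points are much closer to their own medoid than to the next one, so the choice with one medoid per group scores higher here.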

The group is successful for the second time: In 2020, Erik Thordsen won the award with the contribution "Erik Thordsen, Erich Schubert. ABID: Angle Based Intrinsic Dimensionality. In: Similarity Search and Applications. SISAP 2020. https://doi.org/10.1007/978-3-030-60936-8_17".

This paper introduced a new angle-based estimator of the intrinsic dimensionality – a measure of local data complexity – traditionally estimated solely from distances.
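
The intuition behind estimating dimensionality from angles can be sketched as follows. For directions drawn uniformly at random in d dimensions, the expected squared cosine of the angle between two of them is 1/d, so the reciprocal of the mean squared cosine among the vectors pointing from a point to its neighbors yields a dimensionality estimate. This is a simplified illustration and not necessarily the exact ABID estimator from the paper; the function names and the choice to average over all ordered pairs (including each vector with itself) are assumptions.

```python
def squared_cosine(u, v):
    """Squared cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u)
    norm_v = sum(b * b for b in v)
    return dot * dot / (norm_u * norm_v)


def angle_based_id(center, neighbors):
    """Reciprocal of the mean squared cosine over all ordered pairs
    of difference vectors (assumption: self-pairs are included)."""
    vecs = [[n - c for n, c in zip(p, center)] for p in neighbors]
    # Drop neighbors that coincide with the center (zero vectors).
    vecs = [v for v in vecs if any(x != 0 for x in v)]
    if not vecs:
        raise ValueError("need at least one neighbor distinct from center")
    total = sum(squared_cosine(u, v) for u in vecs for v in vecs)
    return len(vecs) ** 2 / total


# Neighbors spread along both axes of a plane.
plane = angle_based_id((0, 0), [(1, 0), (-1, 0), (0, 1), (0, -1)])
# Neighbors on a single line embedded in 3D space.
line = angle_based_id((0, 0, 0), [(1, 0, 0), (2, 0, 0), (-3, 0, 0)])
print(plane, line)
```

Note that the estimate for the collinear neighbors depends only on the directions, not on how far away the neighbors are, which is what distinguishes an angle-based estimate from purely distance-based ones.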

Sebastian Buschjäger defends his dissertation at LS8

Sebastian Buschjäger next to Katharina Morik, Jens Teubner and Jian-Jia Chen

Sebastian Buschjäger has defended his dissertation Ensemble Learning with Discrete Classifiers on Small Devices at the Chair of Artificial Intelligence with distinction (summa cum laude). He conducted research on the topic of resource-aware machine learning in the context of the Collaborative Research Center 876, Project A1. He researched ensemble methods in the context of embedded systems. This included training as well as deploying decision forests on small devices.

The members of the PhD committee were Prof. Dr. Katharina Morik (supervisor and first reviewer), Prof. Johannes Fürnkranz (second reviewer, University of Linz), Prof. Dr. Jian-Jia Chen (chair) and Prof. Dr. Jens Teubner (faculty representative). Sebastian Buschjäger is a research associate at LS8 and a member of the Collaborative Research Center 876 (Project A1).

Vordenker Forum 2022

The technology pioneer for autonomous driving, Prof. Sebastian Thrun, was honored as "Vordenker 2022" at Goethe University Frankfurt on September 15. The former Google vice-director and Stanford professor founded the online learning platform Udacity and is now dedicated to autonomous flying. In his speech, he recognized Katharina Morik as a leading pioneer in the field of artificial learning. Prof. Thrun said she "was already the goddess of artificial learning back then. The very first one who did it in Germany and is still quite a leader today." In the panel, Prof. Morik explained how artificial intelligence can help make work more productive, safe and environmentally friendly, creating capacity for social good. The director of the Lamarr Institute for Machine Learning and Artificial Intelligence also presented Lamarr's research on intelligible communication of artificial intelligence in the form of so-called care labels.

A recording of the event is available online.

Student assistant (SHK / WHF) wanted in the field of Federated Learning - THE POSITION HAS BEEN FILLED

The Chair VIII of the Faculty of Computer Science has an immediate vacancy for a student assistant (SHK / WHF) in the field of Federated Learning. The number of hours can be discussed individually. The offer is aimed at students of computer science who have completed their studies with very good results.

You can find more information about the positions and your application here

Student assistants (SHK / WHF) wanted - THE POSITION HAS BEEN FILLED

The Chair VIII of the Faculty of Computer Science has immediate vacancies for student assistants (SHK / WHF). The number of hours can be discussed individually. The offer is aimed at students of computer science who have completed their studies with very good results.  

You can find more information about the positions and your application here

Best Paper Award at the ICDM PhD Forum

We are pleased to announce that Pierre Haritz, Helena Kotthaus, Thomas Liebig and Lukas Pfahler have received the "Best Paper Award" for the paper "Self-Supervised Source Code Annotation from Related Research Papers" at the IEEE ICDM PhD Forum 2021.

To increase the understanding and reusability of third-party source code, the paper proposes a prototype tool based on BERT models. The underlying neural network learns common structures between scientific publications and their implementations based on variables occurring in the text and source code, and will be used to annotate scientific code with information from the respective publication.

6GEM Project

The 6GEM consortium combines scientific excellence and mobile communications expertise at network, material, component/microchip, and module-level in North Rhine-Westphalia. A holistic approach is pursued, from production to logistics to people with their needs for self-determination, privacy, and security in times of climate change.

Based on previous contributions in the SFB 876, the LS8 project team will explore novel, real-time capable 6G network technologies and innovative 6G application fields. Among other things, the results will flow into the standardization of open 6G networks, open-source projects for software-defined networks, and patents.

Prof. Christian Wietfeld from the Department of Communication Networks is the spokesperson for the TU Dortmund in the 6GEM project. Also involved from the Faculty of Electrical Engineering and Information Technology are Embedded Systems, High-Frequency Technology, and Energy Efficiency. From the Faculty of Computer Science, the areas of Design Automation for Embedded Systems and Smart City Science are also involved. From the Faculty of Mechanical Engineering, the area of Materials Handling and Warehousing is involved.

Report on reflective AI by the Volkswagen Stiftung has been published

The report of the project on reflective AI, funded by the Volkswagen Stiftung, has been published. It is about the user’s awareness of the implications of AI systems.

Ensuring a safe and responsible use of AI cannot be solved alone through technological innovation and regulation, in spite of their importance. Many of the problems encountered in the use of AI systems stem from the lack of personal and societal experience with AI. They mirror not only the biases and inequalities reflected in the data and AI algorithms but also those from the organizational and societal contexts in which AI is used and designed. 

View report.

Mirko Bunse and Lukas Heppe among the winners of the Ariel Machine Learning Data Challenge

Scientists Mirko Bunse (Collaborative Research Center/SFB 876) and Lukas Heppe (ML2R) took second place in the Ariel Machine Learning Data Challenge at the ECML PKDD 2021 conference. Together, they developed a multi-level Deep Learning method for analyzing noisy time series data. Using data preprocessing, they bundled information from the data set, including noise properties. This bundling of information allowed for a training of neural networks that is efficient enough to create an ensemble of 45 individual networks. The developed approach achieved an average prediction error of only three percent.

More information on the Ariel Machine Learning Data Challenge

Stefanie Jegelka of MIT gives a talk on "Learning in Graph Neural Networks"

Event date: July 15 2021 16:15

Learning in Graph Neural Networks

Abstract - Graph Neural Networks (GNNs) have become a popular tool for learning representations of graph-structured inputs, with applications in computational chemistry, recommendation, pharmacy, reasoning, and many other areas. In this talk, I will show some recent results on learning with message-passing GNNs. In particular, GNNs possess important invariances and inductive biases that affect learning and generalization. We relate these properties and the choice of the “aggregation function” to predictions within and outside the training distribution.

This talk is based on joint work with Keyulu Xu, Jingling Li, Mozhi Zhang, Simon S. Du, Ken-ichi Kawarabayashi, Vikas Garg and Tommi Jaakkola.


Short bio - Stefanie Jegelka is an Associate Professor in the Department of EECS at MIT. She is a member of the Computer Science and AI Lab (CSAIL), the Center for Statistics, and an affiliate of IDSS and the ORC. Before joining MIT, she was a postdoctoral researcher at UC Berkeley, and obtained her PhD from ETH Zurich and the Max Planck Institute for Intelligent Systems. Stefanie has received a Sloan Research Fellowship, an NSF CAREER Award, a DARPA Young Faculty Award, a Google research award, a Two Sigma faculty research award, the German Pattern Recognition Award and a Best Paper Award at the International Conference for Machine Learning (ICML). Her research interests span the theory and practice of algorithmic machine learning.


Digitaltag 2021: Competence Center ML2R hosts hands-on workshop on Machine Learning

As part of the Digitaltag (Digital Day) 2021, the Competence Center Machine Learning Rhine-Ruhr (ML2R) is hosting a joint virtual hands-on workshop with the software manufacturer RapidMiner. The German-language event under the motto "Machine Learning: An Introduction to the Key Technology of Artificial Intelligence" on 18 June offers participants of all backgrounds exciting insights into the basics of Machine Learning (ML) as well as illustrative application examples. Using the graphical software RapidMiner, participants will also learn about a typical ML workflow. Based on a concrete application task, the steps of data preparation, model building, training, prediction, and validation will be explained and exemplarily executed in the software RapidMiner Studio under the guidance of an AI trainer.

Registration for the event is free of charge. To register, please write an email to ann-kathrin.oster@tu-dortmund.de. You will then receive the access data for the event as well as instructions on the free installation of the "RapidMiner Studio" software, which is essential for the practical part of the workshop.

Research and Science Officer (m/f/d) wanted - Ref. No. 083/21e

This position in Collaborative Research Center 876 (SFB 876) at the Faculty of Computer Science is to be filled as soon as possible and runs until December 31, 2022. The salary is based on pay group E13 TV-L in accordance with the public-sector pay regulations.

For details: https://karriere.tu-dortmund.de/job/view/810/research-and-science-officer-m-f-d-ref-no-083-21e?page_lang=en

Student assistants (SHK / WHF) wanted - THE POSITION HAS BEEN FILLED

The Chair VIII of the Faculty of Computer Science has an immediate vacancy for a student assistant (SHK / WHF). The number of hours can be discussed individually. The offer is aimed at students of computer science who have completed their studies with very good results. Specifically, the job is about the implementation and further development of existing procedures as well as their evaluation on small devices.

You can find more information about the position and your application here

"She transforms IT" empowers women in digitalization

The initiative "She transforms IT" is dedicated to empowering women and girls in IT and aims to increase their participation in digitalization. To this end, the initiative works, among other things, to promote digital competence among girls, to make women in IT more visible, and to offer them diverse research and teaching opportunities as well as extensive networking opportunities.

Prof. Dr. Katharina Morik, Professor of Artificial Intelligence and spokesperson for the Competence Center ML2R and Collaborative Research Center 876, was among the first 50 signatories of the initiative, which was presented at the Digital Summit 2020.

Competence Center ML2R launches blog on Machine Learning and Artificial Intelligence

The Competence Center Machine Learning Rhine-Ruhr (ML2R) has launched its new blog: https://machinelearning-blog.de. In the categories Application, Research and Foundations, researchers of the Competence Center and renowned guest authors provide exciting insights into scientific results, interdisciplinary projects and industry-related findings surrounding Machine Learning (ML) and Artificial Intelligence (AI). The Competence Center ML2R brings forward-looking technologies and research results to companies and society.

Seven articles already await readers: a four-part series on ML-Basics as well as one article each within the sections Application, Research and Foundations. The authors illustrate why AI must be explainable, describe how obscured satellite images can be recovered using Machine Learning, and show methods for the automated assignment of keywords for short texts.

Student assistants (SHK / WHF) wanted - the position has been filled

At the TU Dortmund, Faculty of Computer Science, Chair VIII, there are vacancies for assistants (SHK / WHF) to be filled immediately. The number of hours can be discussed individually. The offer is aimed at students of computer science who have completed their studies with very good results so far. We offer you the opportunity to work in research and development within national and international projects.

to the job Computer Science

Student assistants (SHK / WHF) wanted - the position has been filled

At the Faculty of Computer Science, Chair VIII, there are vacancies for assistants (SHK / WHF) to be filled immediately. The number of hours can be discussed individually. The offer is aimed at students of the TU Dortmund and FH Dortmund who have experience in researching facts and also have web programming skills. The call for applications is explicitly not aimed exclusively at students from the Department of Computer Science, but is also open to students with other study focuses who have the relevant qualifications.

More information about the position and your application can be found here

Digital Summit 2020: Living digitally more sustainably

How can AI be more sustainable? Prof. Dr. Katharina Morik discussed this question during a panel at the Digital Summit 2020. This year’s Digital Summit, which has been organized by the German Federal Ministry for Economic Affairs and Energy (BMWI) since 2016, addresses the central theme "Living digitally more sustainably".

In a panel discussion, Katharina Morik highlighted the resource-saving potential of Machine Learning. At the same time, she emphasized the high energy consumption of learning processes, for example when storing and training Deep Neural Networks. Prof. Morik heads the Collaborative Research Center 876 and the Competence Center ML2R, whose research is dedicated among other topics to Machine Learning under resource constraints.

Further information about the event: Digital Summit 2020

ML2R publishes activity report

The competence center ML2R has published a report on the implemented activities since its founding in 2018. The report, which gives readers an overview of research and transfer projects, opportunities for collaboration and cooperation partners as well as conducted events, is available online. Since its founding in 2018, scientists of the Competence Center ML2R have been conducting research on cutting-edge technologies in the fields of Artificial Intelligence (AI) and Machine Learning (ML), promoting the transfer of research to industry and generating national as well as international visibility for AI research in Germany.

Prof. Dr. Katharina Morik, Professor of Artificial Intelligence, leads the competence center together with Prof. Dr. Stefan Wrobel (Fraunhofer IAIS) in her role as spokeswoman for the ML2R. In its newly published activity report, the ML2R presents highlights from two years of work. Find out more about exciting research and flagship projects as well as cooperation offers, gain insights into the multi-faceted network of the competence center and relive prominent ML2R events.

The report is available online in German and English.

Student assistants (SHK / WHF) wanted

At the TU Dortmund, Faculty of Computer Science, Chair VIII, there are vacancies for assistants (SHK / WHF) to be filled immediately. The number of hours can be discussed individually. The offer is aimed at students of computer science who have completed their basic studies (semesters 1-4) with good results so far. We offer you the opportunity to work in research and development within national and international projects.

to the job Computer Science



ML2R presents concepts for trustworthy and sustainable Machine Learning

The latest event in the series “Grand Challenges: Answers from North Rhine-Westphalia” addressed the topic “AI made in NRW: Sustainability and Trust for Artificial Intelligence Technologies” and was co-organized by the Competence Center ML2R. The virtual event took place on October 29, 2020 and gave researchers from North Rhine-Westphalia (NRW) the opportunity to engage in a dialogue with EU representatives. 

Isabel Pfeiffer-Poensgen, Minister for Culture and Science NRW, opened the event, followed by a keynote speech by Eric Badiqué, consultant for Artificial Intelligence of the European Commission (Directorate General DG CONNECT). The speakers of ML2R helped to shape the program and contributed ideas within the framework of Germany’s Presidency of the EU Council: Prof. Dr. Stefan Wrobel moderated the event and Prof. Dr. Katharina Morik gave a lecture on “Trustworthy machine learning”. In her lecture, she highlighted the research foci of the competence center: research on trustworthy Machine Learning (ML) that meets sustainability standards and is designed in a way that is understandable for users.

More about the event “Grand Challenges: AI made in NRW”

The brochure accompanying the event is available online and provides an overview of the AI research landscape in NRW.

You missed the event? A recording of the livestream is available here.

European Big Data Value Forum (EBDVF) 2020

This year's EBDVF (3-5 November) brought together policy makers, experts from industry and researchers from all over Europe. Along the central theme of the event, they discussed the establishment of a European data and AI ecosystem.

In the Focus Track "Technology, Platforms and Trust", Prof. Dr. Katharina Morik, Head of the Chair of Artificial Intelligence, presented solutions and projects that aim to increase the confidence of users in AI technologies. The Competence Center Machine Learning Rhine-Ruhr (ML2R), headed by Katharina Morik, also addresses this research focus.

The European Big Data Value Forum (EBDVF) is the flagship event of the European Big Data and Data-Driven AI Research and Innovation community organized by the Big Data Value Association (BDVA) and the European Commission (DG CNECT).

Photo credit: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster ML/DFKI

Artificial intelligence from different perspectives: The 4th Dortmund Science Conference on 06.11.2020

We encounter artificial intelligence at every corner of our everyday lives - when shopping online, streaming video, exercising or looking for a partner, we let artificial intelligence give us recommendations or even leave our decisions to algorithms that know us better than we know ourselves. AI technologies also play an important role in the world of work - from intelligent factories to AI-supported recruitment. On November 6th, the 4th Dortmund Science Conference will deal with artificial intelligence from different perspectives - for the first time in digital format.

Prof. Dr. Bernhard Schölkopf, Founding Director of the Max Planck Institute for Intelligent Systems in Tübingen, an AI pioneer and award-winning researcher in the field of artificial intelligence, will give the opening lecture. In the session "AI Research made in Dortmund", Prof. Dr. Katharina Morik and other speakers from the TU Dortmund University, the Competence Center ML2R, the Fraunhofer IML and RapidMiner GmbH will discuss the following topics:

"Faszination Forschung" (Fascination of Research)
    Prof. Dr. Katharina Morik (TU Dortmund, ML2R)
    Prof. Dr. Christian Wietfeld (TU Dortmund)
    Dr. Jens Buß (TU Dortmund)
    Sebastian Buschjäger (TU Dortmund)
    Lukas Pfahler (TU Dortmund)

"KI in der Logistik" (AI in Logistics)
    Prof. Dr. Dr. h.c. Michael ten Hompel (Fraunhofer IML)
    Moritz Roidl (Fraunhofer IML)
    Anike Murrenhoff (Fraunhofer IML)

"KI Praxis" (AI in Practice)
    Prof. Dr. Katharina Morik (TU Dortmund, ML2R)
    Dr. Helena Kotthaus (TU Dortmund, ML2R)
    Philipp Schlunder (RapidMiner GmbH)

Further program details and the link for free registration can be found on the conference website: www.wissenschaftskonferenz.dortmund.de. A flyer can be downloaded here.

Scientific advisory board meeting provides impetus for future research and strategy of ML2R

Scientific excellence and transnational exchange characterized the meeting of the ML2R’s Steering Board, the scientific advisory board of the Competence Center ML2R. The Steering Board integrates the ML2R into a network of outstanding, world-renowned researchers in the fields of Machine Learning and Artificial Intelligence. During the three-day virtual event, which took place from September 21st to 23rd, the members of the Steering Board gained extensive insights into the work of ML2R, engaged in direct exchange with ML2R scientists and provided important impulses for the further work of ML2R in terms of strategy and visibility. In line with the motto “Fresh Off Your Desk – Share Your Thoughts”, the meeting of the ML2R’s team and the Steering Board offered the opportunity for topical and strategic dialogue and networking.

In a concluding feedback session, the international experts emphasized the excellence of the Competence Center and highlighted the contribution of ML2R scientists in highly relevant fields of research. For example, they pointed to the ML2R’s research efforts in quantum technologies, modular Machine Learning, ML methods for understanding and processing natural language as well as ML technologies under consideration of hardware limitations. In addition, the Steering Board members set important impulses in terms of strategy and visibility for the future direction of the Competence Center.

For more information on the event and the members of the ML2R’s Steering Board, please click here.

"Datenhelden" - Prof. Dr. Katharina Morik in conversation with QuinScape

In the latest issue of "Datenhelden" Prof. Dr. Katharina Morik talks about topics related to artificial intelligence - how it is already changing our societies today and in the future, current research topics and the bias effect in machine learning. In the online format "Datenhelden" by software developer QuinScape, renowned personalities from the field of data & analytics are interviewed. The interview with Prof. Morik can be accessed here.

Virtual summer school on Machine Learning inspires global research community

Sensorfloor hall at TU Dortmund

Over 830 registered participants from 64 countries: The Summer School “Resource-aware Machine Learning” was well received by the international research community. The multi-faceted program included lectures, interactive formats and opportunities for networking and getting to know one another. A hackathon on a current task from the field of logistics ran in parallel. As a conclusion and highlight of the Summer School, the finalists were able to remotely control logistics robots in a live broadcast event. For the first time, the Summer School took place as an online event and was jointly organized by the Competence Center Machine Learning Rhine-Ruhr (ML2R) and the Collaborative Research Center 876 “Providing Information by Resource-Constrained Data Analysis” at the chair of Artificial Intelligence at TU Dortmund University.

The Competence Center ML2R and the Collaborative Research Center 876 would like to thank all speakers, cooperation partners and its committed participants for a successful Summer School. For all those who could not attend the Summer School or who would like to experience some of the highlights again, selected lectures are available online.

International collaboration results in Best Paper Award

ECML 2020 Best Paper Award

At this year's ECML-PKDD, the publication "Resource-Constrained On-Device Learning By Dynamic Averaging" received a Best Paper Award at the "Workshop on Parallel, Distributed and Federated Learning". The cooperation between the TU Dortmund University, ML2R, CRC 876, the University of Bonn, Fraunhofer IAIS, and Monash University was initiated by Katharina Morik's research stay in Melbourne.

The paper demonstrated that distributed learning of probabilistic graphical models can be realized completely with integer arithmetic. This results in reduced bandwidth requirements and energy consumption, thus enabling the use of distributed models on resource-constrained hardware. Furthermore, the possible error of the approximation was theoretically analyzed and bounded.

Prof. Dr. Katharina Morik appointed to the advisory board of the ISI Foundation

Prof. Dr. Katharina Morik was appointed as a member of the ISI Foundation’s advisory board. The Italian-based ISI Foundation conducts international research in the field of complex systems science. Dr. Francesco Bonchi, scientific director of the ISI Foundation, supports the Competence Center Machine Learning Rhine-Ruhr (ML2R), led by Prof. Morik, as a member of the scientific Steering Board.

The ML2R is one of the six German competence centers for Artificial Intelligence. In its work, the competence center benefits from the network it spans. Two advisory boards as well as numerous supporters and cooperation partners hereby complement the innovative research and business environment at the competence center's locations in Dortmund, Bonn and Sankt Augustin.

Invitation to the International Summer School

Together with the Collaborative Research Center 876 "Providing Information by Resource-Constrained Data Analysis" and the Competence Center Machine Learning Rhine-Ruhr (ML2R), the Artificial Intelligence Group at TU Dortmund University will host an international Summer School. The free online event will take place between August 31 and September 4, 2020.

Sensorfloor hall at TU Dortmund

The Summer School brings together experts from the research fields of Data Analysis (Machine Learning, Data Mining, Statistics) and Embedded Systems (Cyber-Physical Systems). In their lectures, they will address the resource limitations of devices in the context of Machine Learning and data analysis. Participating doctoral and post-doctoral students will also have the opportunity to present their research in the dedicated Student Corner.

The Summer School will be accompanied by a hackathon in the form of a Kaggle Challenge, in which participants can test their knowledge of Machine Learning and Cyber-Physical Systems. In a warehouse scenario, participants will use sensor data to predict the positions of robots. The winners of the challenge will then have the opportunity to control the robots used to transport goods in a live session. Moreover, they will be invited to Dortmund for further research cooperation.

Further information about the Summer School and registration

Manifesto for a decentralized, privacy-respecting Corona app

An international group of scientists from KDD and ML has written a manifesto for an app that helps contain COVID-19.

Data-driven predictions for COVID-19 in Germany

Graph of a COVID-19 forecast showing March and April

We attempt to model the coronavirus infection numbers in Germany with a data-driven analysis approach.

Are our interventions helping to flatten the curve? Can we reopen university in a few weeks for examinations? And what effect do weekends and delayed reporting have on the accuracy of our data?

You are welcome to explore our interactive charts on COVID-19 predictions for Germany.


Research assistant wanted

With one of the four German competence centers for machine learning, ML Rhine-Ruhr, there is a unique opportunity to participate in shaping the future together with a team of top researchers. The position is part of the Competence Center ML2R at the Faculty of Computer Science and is initially limited until 31.12.2022. An extension is possible if the center is continued. 

to the job: Research Assistant_ML2R

Katharina Morik calls for additional professorships for AI competence centers

In the latest issue of the Handelsblatt Journal "Artificial Intelligence", Prof. Dr. Katharina Morik addresses the question: How do we achieve AI excellence in Germany? In a guest article, Morik calls for strengthening German AI research through additional professorships. According to Morik, this is the only way to sustainably strengthen research centers that are strong and internationally visible.

The full guest article (in German) by Professor Morik can be viewed here.

Plattform Lernende Systeme: Katharina Morik in the "3 Fragen an" format

Prof. Dr. Katharina Morik has emphasized the need for European cooperation in the research on Artificial Intelligence. In the format "3 Fragen an" (transl.: "3 Questions for") of the German AI platform "Lernende Systeme", she highlighted the strong international standing of the German AI research landscape. According to Morik, long-term cooperation between European countries is now needed. The cooperation with France through competence centers in both countries serves as a positive example.

The full interview with Katharina Morik can be read here. Professor Morik heads the working group "Technological Enablers and Data Science" of the platform "Lernende Systeme" and is speaker of the Competence Center Machine Learning Rhine-Ruhr (ML2R).

Katharina Morik gives keynote speech in Strasbourg

Prof. Dr. Katharina Morik called for long-term investment guarantees for ML-competence centres in the context of a keynote speech in Strasbourg. She spoke at the third meeting of the German-French working group "Disruptive Innovations and Artificial Intelligence" (AG DIKI) in the European Parliament. As a pioneer of Machine Learning, Katharina Morik emphasized the role that her cooperation with Yves Kodratoff from the Université Paris-Sud had played in the early years of the field. "Cooperation with France has built the European community of researchers in Machine Learning since 1986. The resulting ECML PKDD conference now covers all European countries and had 800 participants last year."

At a first meeting of the French and German Centres for AI and Machine Learning, organised by Katharina Morik and Bertrand Braunschweig last year, the thematic priorities of the centres were presented. A further meeting will take place this year to further the networking of German and French research. "What we are still lacking is the consolidation of the competence centres in Germany in order to enable long-term perspectives," said Professor Morik.



Federal funding of the ML2R will be increased by 8 million Euros

The Federal Ministry of Education and Research (BMBF) has granted an additional 8 million Euros in federal funding for the ML2R. The funds will be available for the remainder of the first project phase until the end of 2022. This will enable the competence center to extend its research profile and among other things create up to 25 new positions for researchers who will strengthen future scientific endeavors. “This is an important sign for the long-term consolidation of the ML2R”, says Prof. Dr. Katharina Morik, speaker of the ML2R and coordinator of the German centers for Machine Learning. “The German government is showing its willingness to increase funding for the research on Machine Learning (ML) as a key technology and to strengthen the Rhine-Ruhr area as a hot-spot for ML research.”

As a result of the budget increase, the existing research efforts will be intensified and the research profile of the competence center will be extended to include the fields of Machine Learning on quantum computers and trustworthy Machine Learning algorithms as core areas of research. In addition, the ML2R will install a virtual ML showroom which will provide resources concerning the field of Machine Learning at no charge.


Retrospective of the Global Forum on AI for Humanity

The Program Committee was delighted to welcome you at the Global Forum on AI for Humanity, which took place in Paris last October.

For three days, some 400 attendees debated AI’s human and scientific challenges: French President Emmanuel Macron delivered a speech on France’s position on AI research; the Minister of Higher Education, Research and Innovation, Frédérique Vidal, as well as the Secretary of State for the Digital Sector, Cédric O, spoke to the audience. Altogether, more than 150 international speakers shared their views in round-table discussions.

Relive the event with the video retrospective and the photos of the forum.
All the plenary sessions have been recorded, and are available on the dedicated playlist.

Katharina Morik is co-organizer of the Global Forum on AI for Humanity

Global Forum on AI for humanity 2019

As coordinator of the German competence centres for AI, Prof. Katharina Morik works closely with Prof. Bertrand Braunschweig of INRIA. Together with other renowned experts from Australia, Germany, the Netherlands, England, France, Japan, Canada and the USA, she prepared the Global Forum on AI for Humanity (GFAIH), which took place from 28 to 30 October 2019 in Paris. The forum served to prepare a Global Partnership for AI (GPAI), as decided at the last G7 summit. The meeting in Paris was addressed by President Emmanuel Macron; it served as a formal launch for GPAI and helped formulate the future agenda of the GPAI working groups. The GFAIH brought together experts from AI, social sciences, humanities and engineering, as well as innovators, economic actors, policy makers and representatives of civil society in order to:

  • achieve a common understanding of the perspectives and challenges created by AI and the methods and instruments to overcome them.
  • reflect on projects, studies and social experiments that can lead to a common body of knowledge and shape R&D agendas.
  • develop ideas for Global Partnership initiatives to harness the progress of AI for the benefit of humanity.

The GFAIH was organised under the auspices of the French government.

Click here to open the program.


Katharina Morik appointed GI Fellow


On 1 October 2019, the Gesellschaft für Informatik e.V. (German Informatics Society) awarded the GI Fellowship to computer scientists who have made outstanding contributions to computer science and the GI. With Prof. Oliver Günther, PhD, Prof. Dr. Guido Herrtwich, Prof. Dr. Katharina Morik and Prof. Dr. Günter Müller, the Gesellschaft für Informatik honors four outstanding personalities this year. The award ceremony took place at the GI annual conference INFORMATIK 2019 in Kassel.

Image: Uni-Kassel/Nicolas Wefers


ETMLP 2020: International Workshop on Explainability for Trustworthy ML Pipelines

Machine learning (ML) is a driving force for many successful applications in Artificial Intelligence. ML pipelines ensure guarantees on the entirety of the system (i.e., horizontal certification) as well as on each single component (i.e., vertical certification). The horizontal certification covers the full pipeline from data acquisition to data visualization. Moreover, it spans over user-centered, technical, financial, and regulatory aspects of the system. The vertical certification exploits the theory of ML to guarantee error bounds, sampling complexity, energy consumption, execution time, time-to-think, and memory and communication demands. The understandability of an ML pipeline in its entirety requires the collaboration of researchers from the database and the ML communities.

The ETMLP workshop will examine the aforementioned opportunities and their associated challenges. The main objective of this workshop is to create a forum where researchers from machine learning and data management, together with practitioners, engage with ideas around explainability and certified trustworthiness of ML pipelines, at the pipeline level as well as the component level.

The ultimate goal of the workshop is to discuss recommendations for further work in science, industry and society regarding explainable ML pipelines.

30 March 2020, Copenhagen, Denmark
Submission deadline: December 20, 2019

Prof. Dr. Katharina Morik is part of the committee.


Machine learning in Germany and France: First joint meeting of all German and French competence centres in Würzburg

Meeting between German and French ML competence centres in Würzburg

On the fringes of this year's ECML-PKDD, the leading European conference for machine learning and data mining, representatives of all German and French ML competence centers met in Würzburg on Monday, September 16, 2019. The common objective was to flesh out plans for a virtual German-French centre for cooperation between the competence centres of both countries, and to work out a Memorandum of Understanding (MoU) as the agreement governing this cooperation. The meeting was organised by Katharina Morik (TU Dortmund), the coordinator of the six German competence centres, and Bertrand Braunschweig (INRIA), who coordinates the four French competence centres.


Event: Day of Artificial Intelligence

On 29.10.2019, the Artificial Intelligence theme day will take place within the North Rhine-Westphalian Academy of Science and the Arts. The focus is on machine learning as the key to artificial intelligence.


Nico Piatkowski received Reviewer Award

Nico Piatkowski received the Reviewer Award of the ECML PKDD journal track. The award is given to 15 reviewers, who stood out in terms of reviewing load, quality and timely completion.


Participation in the Digital Strategy of North Rhine-Westphalia



The second event as part of the dialogue and participation measure for the Digital Strategy of North Rhine-Westphalia attracted more than 100 visitors to the Düsseldorf Media Harbour. The participants were able to exchange information on the subject of "DATA - the key to AI", with a particular focus on access and usability of data.


Event: DATA - the key to AI

North Rhine-Westphalia wants to become a leader in AI. Training artificial intelligence requires many high-quality data sets. How data can be made available and accessible will be discussed on 3 September 2019 in Düsseldorf. Among others, the following talk will be given:


Daten und was maschinelles Lernen daraus macht (Data and What Machine Learning Makes of It)
Prof. Dr. Katharina Morik, TU Dortmund University,
Chair of Artificial Intelligence


Further information here.

Federal Research Minister Karliczek gained insights into machine learning

Anja Karliczek at ML2R

Anja Karliczek, Federal Minister of Education and Research, visited the Competence Center Machine Learning Rhine-Ruhr (ML2R) together with journalists on 9 July. The Minister took the opportunity to experience practical applications of artificial intelligence and machine learning live and to try them out for herself: She met robots that make AI and ML comprehensible in a playful way, discovered AI systems that analyse spoken language, improve satellite images and make autonomous driving safer, and a swarm of drones buzzed over her. This gave the Minister impressions of outstanding projects funded by the Federal Ministry of Education and Research (BMBF) as part of the ML2R.


Amal Saadallah is among the finalists for the European Data Science and AI Award

DatSci 2019 finalists

Amal Saadallah has been selected as a finalist at The European DatSci & AI Awards - Celebrating & Connecting Data Science Talent, in the category "Best Data Science Student of the Year". Amal works for the Research Project B3 "Data Mining in Sensor Data of Automated Processes" within the Collaborative Research Center 876.

The Data Science Award 2019 competition is open to individuals and teams working in the Data Science Ecosystem across Europe and is a unique opportunity to showcase research and application of Data Science/AI.

Sibylle Hess defends her doctorate at LS8

Sibylle Hess next to Katharina Morik

Sibylle Hess has successfully defended her dissertation "A Mathematical Theory of Making Hard Decisions: Model Selection and Robustness of Matrix Factorization with Binary Constraints" at the Chair of Artificial Intelligence. She developed new methodologies for two branches of clustering: one concerns the derivation of nonconvex clusters, known as spectral clustering; the other addresses the identification of biclusters, a set of samples together with similarity-defining features, via Boolean matrix factorization.

The members of the doctoral committee were Prof. Dr. Katharina Morik (supervisor and first examiner), Prof. Dr. Arno Siebes (second examiner, University of Utrecht) and Prof. Dr. Erich Schubert (representative of the faculty). Sibylle Hess was a research assistant at LS8, a member of the Collaborative Research Center 876 (Project C1) and now works as a postdoctoral fellow at the TU Eindhoven.

AI made in Germany

Annual conference of the Plattform Lernende Systeme in Berlin

The annual conference of the Plattform Lernende Systeme on 3 July 2019 in Berlin was opened by Federal Minister Karliczek. Katharina Morik, head of working group 1 "Technological Enablers and Data Science" and coordinator of the competence centres for machine learning, was also present.

Over 100 experts meet to exchange ideas on the subject of machine learning

Meeting of experts from competence centres for machine learning at TU Dortmund

On Wednesday, 5 June, representatives of the four German competence centres for machine learning as well as experts from industry, business and science met for the first time at the TU Dortmund for a joint conference. This was organized by the Competence Center Machine Learning Rhine-Ruhr (ML2R).

Photo: Oliver Schaper


1000 Participants, 54 Speakers and a Day Full of Innovation

Katharina Morik gives the keynote AI and the sciences

The 5th Digital Future Science Match brought together AI experts from science, industry and politics to answer the question: What’s Next in Artificial Intelligence? Katharina Morik gave the keynote “AI and the sciences”.


Dr. Nico Piatkowski receives dissertation prize of TU Dortmund


Dr. Nico Piatkowski completed his doctorate on "Exponential Families on Resource-Constrained Systems" with distinction (summa cum laude). In addition, he received one of the dissertation prizes at the TU Dortmund University's academic anniversary celebration on January 23, 2019.

In his doctoral thesis Nico Piatkowski dealt with machine learning under limited resources. He investigated how mathematical methods of machine learning can be simplified so that they also work on devices with limited computing power, storage capacity or energy reserves. These include mobile devices such as smartphones or sensors. The scientist studied computer science and economics at the TU Dortmund University and now continues to work as a postdoc at ML2R.

Student assistants (SHK / WHF) wanted

At the TU Dortmund, Faculty of Computer Science, Chair VIII, there are several vacancies for assistants (SHK / WHF) to be filled immediately. The number of hours can be discussed individually. The offer is aimed at students of computer science who have achieved very good results in their studies so far. We offer you the opportunity to work in research and development within the framework of national and international projects.

to the job


Opening of the Competence Center Machine Learning Rhine-Ruhr (ML2R)

ML2R Logo

The Competence Center Machine Learning Rhine-Ruhr (ML2R) will open in Dortmund on 23 January 2019. ML2R is one of four nationwide centres for artificial intelligence and machine learning: it establishes cutting-edge research, promotes the next generation of scientists and strengthens technology transfer in companies.

The ML2R connects pioneer institutions of ML research in Germany: the Faculty of Computer Science at the Technical University of Dortmund, the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS in Sankt Augustin, the University of Bonn and the Fraunhofer Institute for Material Flow and Logistics IML in Dortmund. The close integration of basic and applied research forms the ideal basis for innovations.

Professor (W3) in "Interactive Data Science"

The Department of Computer Science at TU Dortmund University is seeking to fill the position of a Professor (W3) in "Interactive Data Science" commencing as soon as possible. The successful candidate will represent the field of "Interactive Data Science" in research and education.

With 6,200 employees in research, teaching, and administration and its unique profile, TU Dortmund University shapes prospects for the future. The interaction between engineering and natural science as well as social and cultural studies drives both technological innovations and progress in knowledge and methodology. It is not only the roughly 34,500 students who benefit from this. The Department of Computer Science at TU Dortmund University is one of the largest in Germany, with particular strengths in research. Among similar institutions it is distinguished by a combination of fundamental research on formal methods with the development of practical applications. Research topics are Data Science, Algorithmics, Cyber-Physical Systems as well as Software and Service Engineering.


The LS8 secretary's office will not be staffed between 19.12.2018 and 04.01.2019.

Digicon 2018: Learning today to shape the future


More than 400 top decision-makers from over 150 companies such as Allianz, BOSCH, Munich Airport, Generali, Google and many more are expected at Digicon 2018. International experts from business and science will present the latest trends, developments and results in the field of machine learning. End-users will talk about their success stories and analysts about the underlying methods. This year, Prof. Dr. Katharina Morik will give a presentation on "Machine Learning and Data Mining - From Theory to Practice".

Dortmund Data Science Center (DoDSc) officially opened


For many scientists at the TU Dortmund, handling large amounts of data is a fundamental part of their daily work. With the Collaborative Research Centre 876 ("Providing Information by Resource-Constrained Data Analysis", speaker: Prof. Dr. Morik) and the Collaborative Research Centre 823 ("Statistics of Nonlinear Dynamic Processes", speaker: Prof. Dr. Krämer), extensive research projects have already been set up at the TU Dortmund that focus on the analysis of large amounts of data. In addition, only recently, together with the University of Bonn and the Fraunhofer Institutes for Material Flow and Logistics (IML) and for Intelligent Analysis and Information Systems (IAIS), one of four competence centers for machine learning in Germany was acquired, which additionally underlines the profile area "data analysis, modelling and simulation" of the TU Dortmund.

This interdisciplinary expertise in data analysis will now be bundled in the Dortmund Data Science Center (DoDSc), to which the Faculties of Statistics, Computer Science, Mathematics and Physics belong. At the opening ceremony on 24 October, scientific lectures were held by Kevin Kröninger (Faculty of Physics, TU Dortmund) and Thomas Lengauer (Max Planck Institute for Informatics, Saarbrücken), who pointed out further research perspectives for data science. Florian Kruse of Point 8 was on the program for commercial applications. In addition, Trevor Hastie from Stanford University sent congratulations on the opening of the Dortmund Data Science Center.

News article of the TU Dortmund on the opening of the DoDSc (in German)


Photo: TU Dortmund/Martina Hengesbach

Data - who owns it, who stores it, who can access it?


Big data, data science, machine learning - terms that refer to data and its value for many applications. Who benefits from the data? Who may use it? How are scientific progress, success in economic competition and protection of privacy achieved simultaneously? This volume collects the papers presented at a conference of the North Rhine-Westphalian Academy of Sciences and Arts, with contributions from the fields of computer science, statistics, medicine, engineering, law and economics. In this way, the Academy participates in the urgently needed discussion on how to meet the challenges posed by today's possibilities of data collection and use.


Job posting: administrative clerk

The Chair of Artificial Intelligence is advertising a position as administrative clerk in the secretariat.


Competence Center for Machine Learning Rhine-Ruhr will be launched — we are hiring!

Logo Competence Centre

Machine learning is the basis of the digital transformation. Hence, internationally outstanding research and effective transfer into applications is of the utmost importance for a society. Germany and France aim at a collaboration in machine learning research. In this context, the Competence Centre for Machine Learning Rhine-Ruhr (ML2R), funded by the Federal Ministry of Education and Research (BMBF), is now being launched in Dortmund and Bonn/Sankt Augustin.


29.5.2018: Chancellor Merkel talks with experts about artificial intelligence

group photo with Chancellor Merkel

Katharina Morik was invited as an expert in machine learning, as co-chair of a working group in the German Platform for Learning Systems and as head of the Competence Center for Machine Learning Rhine-Ruhr.

On 28 May, Angela Merkel talked with experts on artificial intelligence at the Federal Chancellery. Artificial intelligence is one of the central technologies of the future and currently one of the biggest drivers of digitalization. The Chancellor discussed the potentials and challenges of artificial intelligence for Germany. The Federal Government intends to bundle all measures in this area and combine them into a national strategy to promote the use of artificial intelligence for the benefit of the economy and society.

Merkel invited experts from universities, research institutions, and companies. In addition to her,
the Federal Government was represented by the Head of the Federal Chancellery, the Federal Ministers
of Education and Research, Economics and Energy, Labour and Social Affairs, Transport and
Digital Infrastructure, and by the Federal Government Commissioner for Digitalization.

The conversation was not public.

Photo: Federal Government / Jochen Eckel

International conference: The next level of mobility: automation, multimodal services and the role of data

Das nächste Level der Mobilität Poster

The integrated acquisition and evaluation of data influences our day to day lives, particularly with respect to traffic and mobility. Digital technologies are used to control traffic, traffic infrastructures as well as overall traffic flow. As a result, increasingly specific products can be developed. However, data protection is always a major concern when dealing with data. These concerns will be addressed at the international conference in Dortmund on the 28th of May, 2018.

Conference agenda (in German)

Nico Piatkowski defends his doctorate at LS8

Nico Piatkowski

Nico Piatkowski has successfully defended his dissertation “Exponential Families on Resource-Constrained Systems” with an overall grade of summa cum laude. The committee members were Prof. Jens Teubner (chair, TU Dortmund), Prof. Katharina Morik (supervisor, TU Dortmund), Prof. Stefano Ermon (reviewer, Stanford University) and Prof. Jakob Rehof (TU Dortmund).

Assistant Professor (W1) in Smart City Science

With more than 6,200 employees in research, teaching and administration and its unique profile, TU Dortmund University shapes prospects for the future: The cooperation between engineering and natural sciences as well as social and cultural studies promotes both technological innovations and progress in knowledge and methodology. And it is not only the more than 34,600 students who benefit from that. The Faculty for Computer Science at TU Dortmund University, Germany, is looking for an Assistant Professor (W1) in Smart City Science who will specialize in research and teaching in the field of Smart City Science, with a methodological focus in computer science (e.g. machine learning and/or algorithm design) and applications in the area of Smart Cities (e.g. traffic prediction, intelligent routing, entertainment, e-government or privacy).

Applicant's profile:

  • An outstanding dissertation and excellent internationally recognized publications in the field of computer science methods for Smart Cities
  • Experience in raising third-party funding
  • The willingness to participate in research collaborations within and outside TU Dortmund University, such as CRC 876 "Providing Information by Resource-Constrained Data Analysis"
  • Language competence in German or English is required
  • Appropriate participation in teaching in the faculty's courses of study

The TU Dortmund University aims at increasing the percentage of women in academic positions in the Department of Computer Science and strongly encourages women to apply. Disabled candidates with equal qualifications will be given preference.


Lecture "Große Daten, Kleine Geräte" ("Big Data, Small Devices") in the Science Notes

Science Notes Poster

Intelligent fabrics, fitness wristbands, smartphones, cars, factories, and large scientific experiments are recording tremendous data streams. Machine Learning can harness these masses of data, but storing, communicating, and analysing them consumes a lot of energy. Therefore, small devices should send less, but more meaningful data to a central processor where additional analyses are performed.
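As a rough illustration of "send less, but more meaningful data", the following Python sketch lets a device compress each window of raw sensor readings into a small summary before transmission. The function names, the window size and the choice of summary statistics are assumptions made for this example and are not taken from the lecture.

```python
from statistics import mean
from typing import Dict, List


def summarize_window(readings: List[float]) -> Dict[str, float]:
    """Compress one window of raw readings into four numbers."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


def reduce_stream(readings: List[float], window_size: int) -> List[Dict[str, float]]:
    """Split the stream into consecutive windows and summarize each one.
    Only the summaries would be sent to the central processor."""
    return [
        summarize_window(readings[start:start + window_size])
        for start in range(0, len(readings), window_size)
    ]
```

With a window of 100 readings, the device would transmit four values instead of 100. The central processor can then run further analyses on the summaries, at the cost of losing detail inside each window.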


Katharina Morik among the leaders of Germany's "Platform Learning Systems"

WG leaders

Germany ranks among the pioneers in the field of learning systems and Artificial Intelligence. The aim of the Plattform Lernende Systeme initiated by the Federal Ministry of Education and Research is to promote the shaping of Learning Systems for the benefit of individuals, society and the economy. Learning Systems will improve people’s quality of life, strengthen good work performance, secure growth and prosperity and promote the sustainability of the economy, transport systems and energy supply.


Merry Christmas and a Happy New Year 2018

Christmas 2017

We wish you a Merry Christmas and a Happy New Year.

We used Yosinski's Deep Visualization Toolbox to create a nice picture of the visit of the Three Kings.

Best Paper Award of the International Conference on Spatial Information Theory (COSIT) 2017


The joint work "On Avoiding Traffic Jams with Dynamic Self-Organizing Trip Planning" by Thomas Liebig and Maurice Sotzny received the Best Paper Award of the International Conference on Spatial Information Theory (COSIT) 2017.

Open professorship (W2) at LS8

TU Dortmund University is seeking an outstanding scientist in the field of data mining of large datasets with a current research perspective and publications in high-ranked international venues. Applicants should complement the research activities of the Faculty of Computer Science and contribute to interdisciplinary collaborative research projects, especially the collaborative research centre CRC 876 “Providing Information by Resource-Constrained Data Analysis”.

Further information is given in the linked document

Introduction to machine learning for practitioners and the public

acatech, the National Academy of Science and Engineering, has presented an online course on machine learning at CeBIT: http://www.acatech.de/de/projekte/projekte/mooc-maschinelles-lernen.html

After an overview presented by Prof. Dr. Stefan Wrobel (Fraunhofer St. Augustin), Katharina Morik introduces two basic methods with application examples from her many years of practical experience: the support vector machine (SVM) and decision trees. Kristian Kersting presents probabilistic graphical models.

Klassifikation und Regression - Stützvektormethode (Classification and Regression - SVM)

Download 120 MB [mp4]
Source: acatech

Klassifikation und Regression - Entscheidungsbäume (Classification and Regression - Decision Trees)

Download 87 MB [mp4]
Source: acatech

Probabilistische Graphische Modelle (Probabilistic Graphical Models)

Download 86 MB [mp4]
Source: acatech

Lukas Pfahler receives Hans-Uhde Award as one of the top graduates of his year at TU Dortmund

Award Recipients

Four graduates of TU Dortmund received the Hans-Uhde Award for their outstanding theses. Niklas Haarmann (Faculty of Bio- and Chemical Engineering), Chris Kittle (Faculty of Electrical Engineering and Information Technology) and Lukas Pfahler (Faculty of Computer Science) achieved a master's degree and graduated as valedictorians. Christian Gehring (Faculty of Mechanical Engineering) received a grade of 1.0 for his bachelor's thesis. Additionally, three graduates of FH Dortmund and one employee of Uhde Inventa-Fischer GmbH were honored by the Hans-Uhde Foundation.

The graduates of TU Dortmund were awarded a golden coin, a certificate and a monetary price by Guido Baranowsky, chairman of the Hans-Uhde foundation. In his thesis, Lukas Pfahler explored the question how to enable computers to learn German grammar. The ceremony took place at thyssenkrupp Industion Solutions AG in Dortmund. The ceremonial address — "Precision Medicine and Foundational Research; Innovation with Potential" — was delivered by Prof. Daniel Rauh. The goal of the Hans-Uhde Foundation is to promote Science, Schooling and Education. This is why it annually decorates outstanding students as well as pupuils. The award ceremony was attended by both Hans and Roswitha Uhde until 2011, when Friederich Uhde passed. The widowed Roswitha Uhde continued to attend the ceremonies up until her passing in 2017.


Hans-Uhde Award 2017

Lukas Pfahler, M.Sc.
TU Dortmund, Faculty of Computer Science
Master's Thesis: Explicit and Implicit Feature Maps for Structured Output Prediction


Marco Stolpe Defends His Doctoral Thesis at LS8

Marco Stolpe's thesis defense

Marco Stolpe has successfully defended his dissertation “Distributed Analysis of Vertically Partitioned Sensor Measurements under Communication Constraints”. His thesis was supervised by Katharina Morik and can be summarized as follows:

The thesis focuses on the distributed analysis of large volumes of vertically partitioned sensor data under communication constraints. In this setting, the target value to be predicted depends on feature values stored at different nodes in the network. The scenario has many applications in the context of the Internet of Things and Industrie 4.0, such as predicting final product quality from process parameters recorded at different processing stations, predicting total power consumption from the previously recorded behavior of different consumers in the smart grid, or predicting traffic flows in smart cities. The scenario is particularly challenging when communication or energy is too limited to centralize all data, because data from different nodes must be combined even for prediction. Motivated by a case study on quality prediction in interlinked production processes in the steel industry, the dissertation develops communication-efficient algorithms for three different problems in distributed data analysis: (1) the local reduction of measurements directly where they are recorded (that is, before they are transmitted), (2) the reduction of measurements transmitted between local nodes and a central coordinator, and (3) the reduction of information about the target values to be predicted that is transmitted between nodes. Compared with transmitting all data in a network, each algorithm reduces the amount of transmitted data by about one order of magnitude while achieving similar prediction quality. Algorithm (3) is in turn based on a newly developed approach to the relatively new problem of learning from label proportions, whose solution opens up further applications in the context of Industrie 4.0.


Christian Pölitz Defends His Doctoral Thesis at LS8

Christian Pölitz's thesis defense

Christian Pölitz has successfully defended his dissertation “Automatic Methods to Extract Latent Meanings in Large Text Corpora”. His thesis was supervised by Katharina Morik and can be summarized as follows:

This thesis concentrates on data mining in corpus linguistics. We show the use of modern data mining by developing efficient and effective methods for research and teaching in corpus linguistics in the fields of lexicography and semantics. Modern language resources, as provided by the Common Language Resources and Technology Infrastructure (http://clarin.eu), offer a large number of heterogeneous information resources on written language. Besides large text corpora, additional information about the sources or publication dates of the documents in the corpora is available. Further, information about words from dictionaries or WordNets offers prior information about the word distributions. Starting with pre-studies in lexicography and semantics on large text corpora, we investigate the use of latent variable methods to extract hidden concepts in large text collections. We show that these hidden concepts correspond to meanings of words and subjects in text collections. This motivates an investigation of latent variable methods for large corpora to support linguistic research.


Merry Christmas and a Happy New Year

Christmas 2016

The secretary's office is not occupied between December 19th, 2016 and January 6th, 2017. We wish you a merry Christmas and a happy New Year!

BMVI Data-Run

On December 2nd and 3rd, the Federal Ministry of Transport and Digital Infrastructure (BMVI) hosted the second BMVI Data-Run, this time with the theme "Realtime Data in Traffic". Over the course of two days, attending teams worked on creating innovative mobility solutions based on the provided data.

Sebastian Peter and Philipp Honysz from LS8 participated with the idea of creating an app that would help commuters compensate for traffic problems. They implemented an Android app which analyses the user's commute and notifies them of impending problems, such as overloaded bicycle stations. Additionally, it uses a Google API to compute routes for common means of transportation.

Research Associates between News, Space and Science

This summer, three of our graduate students were between news, space and science. They were at Google, NASA, Stanford and the Wirtschaftswoche. While it was certainly not a walk in the park, it was definitely an experience and a great success. Congratulations!

Elena Erdmann received a Google News Lab Fellowship and worked for two months at the Wirtschaftswoche. She has developed both journalistic know-how and technical skills to drive innovation in digital and data journalism. Nico Piatkowski visited Stefano Ermon at Stanford University. Together they worked on techniques for scalable and exact inference in graphical models. He also made a detour to NASA. Last but not least, Martin Mladenov got an internship at Google. Some people say this is more difficult than getting admitted to Stanford or Harvard. Who knows? But this year they accepted about 2% of applicants (1,600 people). What did he work on? We do not know, but he visited Craig Boutilier, so very likely something related to making decisions under uncertainty.

Health: Smart Data & Data Analytics


The kick-off of the "Smart Data & Data Analytics" department of CPS.HUB took place on 23rd of November 2016 at the Leibniz Institute for Analytical Sciences (ISAS). This session focused on a variety of aspects of data and data analysis in the context of health and health economy.

After the introduction by Monika Gatzke, the topic of "health" with regard to Smart Data was further discussed:

  • Prof. Dr. Katharina Morik (Head of the Department) gave an overview and presented a detailed application in intensive care medicine.
  • Sven Löffler from T-Systems spoke about Smart Data potentials in the health care sector using the example of self-tracking data.
  • Dr. Wolfgang Thronicke of Atos C-LAB presented Big Dependable Systems. These are systems that consist of different interdependent subsystems and are the object of the project Medolution.
  • The founder of the Quantified Self Movement in Germany, Florian Schumacher, spoke about the potential for Big Data Analytics.
  • Philip Potratz from the Cluster InnovativeMedizin.NRW presented the project Smart.Health.Data.

relNet Opening Workshop

The project partners of LS8, Dortmund and CERES, Bochum hosted an opening workshop for their new joint project relNet on "Modelling Topics and Structures in Religious Online Communication" in Bochum on May 23-24. The goal of this project is to apply methods of data analytics, network analysis and text mining to analyse how digital communication has changed religious communities and the social roles within these communities.

Over two days, we presented the project, listened to talks by our invited guests and discussed the potential of joint research in computer science and the social sciences, in this case religious studies.

Springer Edited Volume 'Computational Sustainability' Published

Katharina Morik and Kristian Kersting, together with Jörg Lässig from the University of Applied Sciences Zittau/Görlitz, have published an edited volume on Computational Sustainability. Computational Sustainability is a broad field that attempts to optimize societal, economic, and environmental resources using methods from computer science, mathematics and related fields:

Jörg Lässig, Kristian Kersting, Katharina Morik, Computational Sustainability. Studies in Computational Intelligence, Volume 645, Springer, ISBN: 978-3-319-31856-1, 2016.

Best Paper Award of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE) 2015

The joint work "Predicting Purchase Decisions in Free To Play Mobile Games" of Kristian Kersting with colleagues from Wooga, goedle.io, Aalborg University, and the Fraunhofer IAIS received the Best Paper Award of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE) 2015.

LS8 is an international collaborator of CompSustNet

LS8 is an international collaborator of CompSustNet. CompSustNet is a research network sponsored by the National Science Foundation through an Expeditions in Computing award. Twelve U.S. academic institutions led by Cornell University, along with many national and international collaborators, are exploring new research directions in computational sustainability.

Morgan and Claypool Book on Statistical Relational AI

Together with colleagues from UBC, KU Leuven, and Indiana University, Kristian Kersting published a book on Statistical Relational AI. This is the study and design of intelligent agents that act in worlds composed of individuals (objects, things), where there can be complex relations among the individuals, where the agents can be uncertain about what properties individuals have, what relations are true, what individuals exist, whether different terms denote the same individual, and the dynamics of the world:

Luc De Raedt, Kristian Kersting, Sriraam Natarajan, David Poole, Statistical Relational Artificial Intelligence: Logic, Probability, and Computation. Morgan and Claypool Publishers, Synthesis Lectures on Artificial Intelligence and Machine Learning, ISBN: 9781627058414, 2016.

VaVeL Project

Urban environments are flooded with data gathered by fixed or mobile sensors. If these data were used successfully, European citizens could benefit in various areas like public transport or crime prevention. However, urban data is heterogeneous, noisy and unlabeled, so the usability of the data is low. The VaVeL project aims to use these data in applications that improve living conditions in urban areas. The goal of the project is to develop a general framework for managing and mining heterogeneous urban data streams.

As part of the project, the functionality of current stream frameworks will be adapted to data streams from urban sensors. Access to urban data streams is not an easy task; in this project, a set of black boxes will be implemented to give easier access to the data and analysis procedures. Big data companies will get access to the gathered knowledge so that actual problems of an urban environment can be tackled.

Call for Papers - Data Mining for Smart Cities

There is a Call for Papers for the journal Data Mining for Smart Cities. They are looking, for example, for the following topics:

  • Real-time nowcasting and prediction of events
  • Interactive exploration of city data
  • Feature extraction and deep learning from urban data

Submission is due July 4, 2016, 23:59.


Katharina Morik Admitted to acatech, the German Academy of Science and Engineering

Katharina Morik

The National Academy of Science and Engineering advises society and governments on all questions regarding the future of technology. acatech is one of the most important academies for research on novel technologies. Additionally, acatech provides a platform for transferring concepts into applications and enables dialogue between science and industry. Its members work together with external researchers in interdisciplinary projects to ensure the practicability of recent trends. Internationally oriented, acatech aims to provide solutions to global problems and new perspectives for technological value creation in Germany.

By appointing Katharina Morik as a member, acatech recognizes her research profile, her achievements as speaker of the collaborative research center SFB 876, her international reputation and her innovative research in machine learning.

Christian Bockermann Defends his Dissertation at LS8

Christian Bockermann's thesis defense

Christian Bockermann has successfully defended his dissertation with the title “Mining Big Data Streams for Multiple Concepts”. His thesis was supervised by Katharina Morik. Summary of the thesis:

Modelling streaming data applications in near real-time is motivated by today’s growing demand for in-time data analysis. The thesis reviews the Lambda architecture and state of the art frameworks for data streams and introduces a middle-layer easing the definition of streaming applications in a platform independent way. This enabling technique is demonstrated in two Big Data applications, namely the inline processing and analysis of data in Cherenkov astronomy and the near real-time extraction of viewership statistics in the context of an IP-TV platform.


Fabian Hadiji Defends His Doctoral Thesis at LS8

Fabian Hadiji successfully defended his dissertation under the title "Graphical Models Beyond Standard Settings: Lifted Decimation, Labeling, and Counting". His thesis was supervised by Professor Kristian Kersting.

He summarises his thesis in the following abstract:

With increasing complexity and growing problem sizes in AI and Machine Learning, inference and learning are still major issues in Probabilistic Graphical Models (PGMs). On the other hand, many problems are specified in such a way that symmetries arise from the underlying model structure. Exploiting these symmetries during inference, which is referred to as "lifted inference", has led to significant efficiency gains. This thesis provides several enhanced versions of known algorithms that prove to be liftable as well, and thereby applies lifting in "non-standard" settings. By doing so, the understanding of the applicability of lifted inference and lifting in general is extended. Among various other experiments, it is shown how lifted inference in combination with an innovative Web-based data harvesting pipeline is used to label author-paper pairs with geographic information in online bibliographies. This results in a large-scale transnational bibliography containing affiliation information over time for roughly one million authors. Analyzing this dataset reveals the importance of understanding count data. Although counting is done literally everywhere, mainstream PGMs have widely neglected count data. In the case where the ranges of the random variables are defined over the natural numbers, crude approximations to the true distribution are often made by discretization or a Gaussian assumption. To handle count data, Poisson Dependency Networks (PDNs) are introduced, which present a new class of non-standard PGMs that naturally handle count data.

If you are interested in Fabian's past and future work, also see his personal homepage http://hadiji.com/.

Second On the Record at Signal Iduna Park

Being at the famous stadium of the Dortmund football team BVB 09, we could not resist pretending to give a press conference. Actually, the conference that we visited was on economic journalism in the digital age. http://www.wipojo.de/ontherecord/

Best Paper Presentation Award of the "New Challenges in Neural Computation" Workshop 2015

The joint work "Archetypal Analysis as an Autoencoder" of Kristian Kersting with colleagues from the University of Bonn and the Twenty Billion Neurons GmbH received the Best Presentation Award of the "New Challenges in Neural Computation" (NC^2) Workshop of the GI-Fachgruppe Neuronale Netze and the German Neural Networks Society in connection with GCPR 2015, Aachen.

Successful Final Review of the EU Project INSIGHT in Luxembourg

The goal of the INSIGHT project was to radically advance our ability to cope with emergency situations in smart cities. INSIGHT stands for Intelligent Synthesis and Real-time Response using massive Streaming of Heterogeneous Data. The technologies developed for data stream mining put new capabilities in the hands of disaster planners and city personnel to improve emergency planning and response.

LS8 in Banff, Canada

From July 24th to July 26th, LS8 gave two talks at the workshop "Advances in interactive Knowledge Discovery and Data Mining in Complex and Big Data Sets" in Banff, Canada.

Professor Katharina Morik spoke about "Big Data and Small Devices": analyzing data on small devices confronts us with new challenges with regard to runtime, memory consumption and energy consumption. Her talk investigates the use of graphical models for data mining in resource-restricted environments and presents results from the research project SFB876.

Furthermore, Sibylle Hess presented results of her diploma thesis: "Investigation of Code Tables to Compress and Describe Underlying Characteristics of Binary Databases". She connects traditional methods of frequent pattern mining with the Minimum-Description-Length principle and matrix factorization, combining these techniques into new algorithms for frequent pattern mining based on numerical optimization.

2nd Workshop on Mining Urban Data held in conjunction with ICML

We co-organize this year's 2nd Workshop on Mining Urban Data. The workshop takes place on July 11th at ICML in Lille. Please see the proceedings at http://ceur-ws.org/Vol-1392/

This year we welcome three invited speakers:

  • Dr. Eleni Pratsini - "Using Big Mobile Data to Analyze Social Events in Cities"
  • Prof. Kristian Kersting - "Poisson Dependency Networks: Gradient Boosted Models for Multivariate Count Data"
  • Prof. Sharad Mehrotra - "Towards 'on the fly' data cleaning"

Student Assistant Positions Available Immediately

At TU Dortmund, Faculty of Computer Science, Chair VIII (LS8), positions for student assistants are available immediately.

Summer school

Summer School 2015

The next summer school will be hosted at the Faculty of Sciences of the University of Porto from 2nd to 5th of September and is co-located with ECML PKDD 2015. It will be organized by LIAAD-INESC TEC and TU Dortmund.

At the summer school, world-leading researchers in machine learning and data mining will give lectures on recent techniques, for example for dealing with huge amounts of data or spatio-temporal streaming data.

SHE - Sie hat's erfunden (She Invented It)

At the Unternehmenstage (business days), held from 26 January to 9 February 2015, women in professional life were one of the focal points of the event series. To foster innovation in small and medium-sized enterprises, the perception of women as inventors must be strengthened. Professor Katharina Morik took part in a panel discussion that addressed, among other things, the compatibility of family and career and the insufficient recognition of women as inventors.

Data Analysts Are in Demand in Industry

Data analysis is the sought-after skill in industry, and the courses of LS8 will again provide students with the knowledge for it in the summer semester 2015!

RapidMiner Academics - Free Access to RapidMiner Studio for Students

RapidMiner CEO and founder Ingo Mierswa presents RapidMiner Academia: A program that grants students free access to commercial versions of RapidMiner Studio.

The RapidMiner project began in 2001, at that point still called YALE, at the LS8 here at TU Dortmund. Today it is one of the most popular software environments for predictive Data Analysis and Data Mining.


Celebratory Colloquium at the Faculty for Computer Science

Katharina Morik

A special range of talks took place at the end of the year to mark Katharina Morik's 60th birthday. What the three main speakers had in common was that they are internationally highly renowned in machine learning and data mining and that they earned their doctorates under Katharina Morik at the (Technical) University of Dortmund. Their fields of activity, however, are completely different.

The common ground between the speakers and the honoree became clear in the short introduction, in which Katharina Morik summarized the research goals she pursues at TU Dortmund: situated systems that connect sensing, communication and action through the ability to learn. In the early 1990s, work on robotics emerged: patterns were discovered in real time in distributed, heterogeneous data streams and used for action planning. The SFB 876 (computer science), whose second phase has just been approved, can be seen, with its combination of data analysis and cyber-physical systems, as following the guiding principle of adaptive, situated systems.

The work on very large data sets that Katharina Morik carried out over 12 years in SFB 475 (statistics) together with Claus Weihs was honored in a short address by the speaker of that collaborative research center, Ursula Gather.

The keynote talks by the three outstanding scientists showed quite different approaches to successfully researching and applying machine learning.

Stefan Wrobel Thorsten Joachims Ingo Mierswa

Our students may be pleased to see some examples of what studying at TU Dortmund enables: research director, professor, CEO of a company. A lot can be achieved on the basis of outstanding research in machine learning, data mining and big data analytics!

Merry Christmas and a Happy New Year

Christmas 2014

From 19 December 2014 until 9 January 2015, the secretary's office will not be staffed.
We wish you a merry Christmas and a happy New Year!


LS8 publishes SpringerBrief on boosting statistical relational learners

This SpringerBrief addresses the challenges of analyzing multi-relational and noisy data by proposing several Statistical Relational Learning (SRL) methods. These methods combine the expressiveness of first-order logic and the ability of probability theory to handle uncertainty. It provides an overview of the methods and the key assumptions that allow for adaptation to different models and real world applications. The models are highly attractive due to their compactness and comprehensibility but learning their structure is computationally intensive. To combat this problem, the authors review the use of functional gradients for boosting the structure and the parameters of statistical relational models. The algorithms have been applied successfully in several SRL settings and have been adapted to several real problems, from information extraction in text to medical problems. Including both context and well-tested applications, "Boosted Statistical Relational Learners: From Benchmarks to Data-Driven Medicine" is designed for researchers and professionals in machine learning and data mining. Computer engineers or students interested in statistics, data management, or health informatics will also find this brief a valuable resource.

Student Assistant Positions from January 2015

At TU Dortmund, Faculty of Computer Science, Chair VIII (LS8), positions for student assistants are available from January 2015.

Lecture Natural Language Systems (Katharina Morik)

IBM Watson

Google, Facebook and Netflix need natural language processing for many of their services. Google, for example, has a large Natural Language Processing department: http://research.google.com/pubs/NaturalLanguageProcessing.html

In February 2011, the IBM program Watson answered natural-language questions in the quiz show Jeopardy better than two human quiz champions.

Ray Kurzweil (Google Director of Engineering) wants to go further: “So IBM’s Watson is a pretty weak reader on each page, but it read the 200m pages of Wikipedia. And basically what I'm doing at Google is to try to go beyond what Watson could do.” http://searchengineland.com/ray-kurzweils-job-google-beat-ibms-watson-natural-language-search-185149

There is a wealth of methods for analyzing very large amounts of text, serving an equally large number of applications: sentiment analysis, personalized advertising, recommendations, email routing, automatic text generation for short news items and reporting, automatic question answering, and information extraction from the WWW. In the lecture with exercises, you will learn the methods and tools for this. The new teaching concept includes inverted classroom sessions and independent work, so that you are well prepared for practice. http://www-ai.cs.uni-dortmund.de/LEHRE/VORLESUNGEN/NLS/WS1415/index.html

Lecture Probabilistic Graphical Models (Kristian Kersting)

How does one act under uncertainty, with missing or erroneous data? Probabilistic graphical models have proven themselves in recent years as a way of dealing with such uncertainties. They are part of the efforts of modern information technology to enable reasoning under uncertainty.

Tag cloud: probabilistic graphical models

Prominent application areas are robotics, bioinformatics, artificial intelligence and machine learning. For example, they are used in the evaluation of medical data, the analysis of gene expression data and the tracking of movements. The LS8 lecture "Probabilistic Graphical Models" covers fundamental questions and techniques of graphical models. http://www-ai.cs.uni-dortmund.de/LEHRE/VORLESUNGEN/PGM/WS1415/index.html

Lecture Large-Scale Optimization (Sangkyun Lee)

Quite generally, data are often cheaper to obtain than it is to extract and then model the knowledge of experts. But how can computers automatically estimate large models from data, such as those that arise in natural language processing, in the estimation of graphical models and in statistical machine learning?

At the core of most learning methods is an optimization task: the error is to be minimized or the probability of the correct result maximized. The lecture "Large-Scale Optimization", held in English, covers the theoretical foundations and methods.
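
As a small illustration of the idea that learning reduces to optimization, the following sketch fits a line to four points by gradient descent on the mean squared error. It is not taken from the lecture; the function name `fit_line`, the learning rate, the number of steps and the toy data are all assumptions chosen for illustration.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by minimizing the mean squared error.

    Hypothetical helper for illustration only; lr and steps are
    assumed values that happen to work for the toy data below.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 without noise (an assumption).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

On this noise-free toy data the estimates approach the slope 2 and intercept 1 that generated the points. Each step here sums over the full data set, which is the cost that large-scale methods aim to reduce, for example with stochastic gradients.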

Project Group Infoscreen (Kristian Kersting, Hendrik Blom)

The approaches from all the lectures can then be applied in the project group (PG) "Infoscreen". Infoscreens are digital display surfaces intended to attract particular attention in "low-stimulus" public spaces.

The aim is to provide information about current events at the Faculty of Computer Science at TU Dortmund.

KDD 2014 sold out

KDD 2014 is sold out. They had to close registrations. 2200 attendees will enjoy the conference next week in Times Square. Katharina Morik gives a keynote talk at the workshop BigMine’14.


The virtual steel works

Following the press conference, the LS8 project (Katharina Morik, Hendrik Blom, Tobias Beckers), carried out in collaboration with SMS Siemag and Dillinger Hütte, is outlined in two interviews, one with Dominik Schöne of Dillinger Hütte and one with Katharina Morik.

ViSTA-TV in a Nutshell

The European project ViSTA-TV had its successful final review meeting in Amsterdam on 1 July. LS8 contributed live stream analysis that separates ads from shows in internet television. Online recommendations of shows based on user behavior have been produced using Termset Clustering.

Mediaday of the SMS Group in Hilchenbach

Mediaday of the SMS Group in Hilchenbach on 3 July 2014

Data Mining/ Industrie 4.0

Summary talk by Katharina Morik about "Data Mining, Big Data and Prediction Models"


Talk at the TU Dortmund: What happens to our data? Between permanent harassment paranoia and post-privacy

Wednesday, 2 July 2014, 16:00 (s.t.) - 18:30, P1-05-309
  • Kristian Kersting (Chair for Artificial Intelligence)
  • Sarah Küsgen (Chair for Service and Technology Management)
  • Kai-Uwe Loser (Data security engineer at RUB)
  • Johannes Weyer & Robin D. Fink (Sociology of Technology group)

The most recent revelations by whistleblower Edward Snowden once again showed the appeal of collecting massive amounts of data in the age of social media.

This event investigates the question 'what happens to our data?' from technical, economic and sociological perspectives. The technical possibilities of modern data mining are diverse and allow conclusions down to the individual level. Data collected from social networks are especially attractive for marketing and product design. Against this background, the protection of privacy faces new challenges.

The contributors will each give a 10-15 minute talk and will afterwards take part in a discussion with the audience. The event will be moderated by Johannes Weyer.

Move to Otto-Hahn-Str. 12

The chair for Artificial Intelligence is moving to the new building in Otto-Hahn-Str. 12. Thus, between 06/30/14 and 07/04/14 we may not be available at all times.

Talk at VigLink: Resource-aware graphical models

Prof. Morik talks at VigLink

Machine learning can help to enhance small devices. For instance, keeping the energy consumption of smart phones low is one of the major concerns of users, as is well illustrated by the various “charge your mobile” stations in public places. While the operating systems of smart phones already offer heuristics and battery apps show consumption profiles, machine learning can do more. Predictions allow better optimizations of the operating system, prepare for particular app usages at certain points in time, or manage services such as GPS or WLAN in a context-aware and adaptive manner. This challenges learning algorithms to apply their models in real time. Moreover, it requires the models to run on the resource-restricted device without consuming more energy themselves than they save!

Talk at NASA: Data Analytics for Sustainability

Title: Data Analytics for Sustainability

  • Speaker: Katharina Morik, Technische Universität Dortmund
  • Date & Time: Wednesday, May 28, 2:00 pm - 3:00 pm
  • Location: Building N245 Auditorium


Sustainability has many facets and researchers from many disciplines are working on them. Knowledge discovery in particular has always considered sustainability an important topic (e.g., special issue on data mining for sustainability in Data Mining and Knowledge Discovery Journal, March 2012).

Host: Dr. Kamalika Das
NASA Ames Research Center
MS 269-1, PO Box 1, Moffett Field, CA 94035

Prof. Morik Continues Her Series of Talks at Google

On Tue 05/27/2014 Prof. Katharina Morik gives a talk about "Resource-aware graphical models and spatio-temporal predictions" at the Google Headquarters in Palo Alto, California, USA.

Machine learning can help to enhance small devices. For instance, keeping the energy consumption of smart phones low is one of the major concerns of users, as is well illustrated by the various “charge your mobile” stations in public places. While the operating systems of smart phones already offer heuristics and battery apps show consumption profiles, machine learning can do more. Predictions allow better optimizations of the operating system, prepare for particular app usages at certain points in time, or manage services such as GPS or WLAN in a context-aware and adaptive manner. This challenges learning algorithms to apply their models in real time. Moreover, it requires the models to run on the resource-restricted device without consuming more energy themselves than they save!
In the talk, graphical models are presented that face these challenges. Conditional Random Fields (CRF) predict the files that the user will fetch next on her smart phone, and the operating system can use these predictions for organizing the memory. Analyzing groups of apps running on the smart phone allows the energy consumption over time to be estimated.
A novel spatio-temporal random field (STRF) has been implemented, smoothing the temporal changes and distributing the optimization. This graphical model has been used to predict app usage over time. In another application, it has been combined with a trip planner, resulting in smart routing for smart cities. In order to run graphical models on very restricted devices, even those without floating point calculation, a variant that computes with integer values only has been developed. The integer approximation of graphical models shows good accuracy and speed-up and opens up novel applications on resource-restricted devices.




Prof. Morik gives a talk about 'Data Analytics for Sustainability' at the University of Maryland, Baltimore County on Thursday 22 May 2014.


Sustainability has many facets and researchers from many disciplines are working on them. Knowledge discovery in particular has always considered sustainability an important topic (e.g., special issue on data mining for sustainability in Data Mining and Knowledge Discovery Journal, March 2012).

  • Environmental tasks include risk analysis concerning floods, earthquakes, fires, and other disasters as well as the ability to react to them in order to guarantee resilience. The climate is certainly of influence and the debate on climate change received quite some attention.
  • Energy efficiency demands energy-aware algorithms, operating systems, green computing. System operations are to be adapted to a predicted user behavior such that the required processing is optimized with respect to minimal energy consumption.
  • Engineering tasks in manufacturing, assembly, material processing, and waste removal or recycling offer opportunities to save resources to a large degree. Adding the prediction precision of learning algorithms to the general knowledge of the engineers allows for surprisingly large savings.

Global reports on the millennium goals and open government data regarding sustainability are publicly available. For the investigation of influence factors, however, data analytics is necessary. Big data challenges the analysis to create data summaries. Moreover, the prediction of states is necessary in order to plan accordingly. In this talk, two case studies will be presented. The first, disaster management in case of a flood, combines diverse sensor data streams for better traffic administration. A novel spatio-temporal random field approach is used for smart routing based on traffic predictions. The other case study is in engineering and saves energy in steel production based on the multivariate prediction of the processing end-point by a regression support vector machine.

11:00am-12:30pm, Thursday 22 May 2014, ITE 456, UMBC


Call for Papers - MLDM 2015

MLDM 2015

11th International Conference on Machine Learning and Data Mining

July 11 - 24, 2015, Freie Hansestadt Hamburg, Germany

This congress will feature three events: the 11th International Conference on Machine Learning and Data Mining MLDM, the 15th Industrial Conference on Data Mining ICDM (www.data-mining-forum.de), and the 10th International Conference on Mass Data Analysis of Signals and Images MDA (www.mda-signals.de). Workshops and tutorials will also be offered.

  • Submission of papers: January 15, 2015
  • Notification of acceptance: February 28, 2015
  • Submission of camera-ready copy: April 5, 2015

Katharina Morik in Vienna

Prof. Dr. Dr. h. c. Monika Henzinger and Prof. Dr. Katharina Morik with some participants of the college, where Katharina Morik gives a course on “Data Analytics”.

More than a year after the Faculty of Computer Science at TU Dortmund conferred an honorary doctorate on Monika Henzinger, Professor at the University of Vienna, Katharina Morik is giving a course on "Data Analytics" as part of the interdisciplinary college in computer science at the University of Vienna. In a well-attended colloquium lecture, she also presented results of the SFB876: "Big Data Analytics and Astrophysics".

Workshop: Needles In a Stream of Hay (NISH2014)

Workshop collocated with INFORMATIK 2014, September 22-26, Stuttgart, Germany.

This workshop focuses on the area where two branches of data analysis research meet: data stream mining, and local exceptionality detection.

Local exceptionality detection is an umbrella term for data analysis methods that strive to find the needle in a haystack: outliers, frequent patterns, subgroups, et cetera. The common ground is that a subset of the data is sought where something exceptional is going on.

Data stream mining can be seen as a facet of Big Data analysis. Streaming data is not necessarily big in terms of volume per se; instead, it can be big in terms of its high throughput rate. Storing the data for later analysis is infeasible, so the relevant information in a data point has to be extracted when it arrives.
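
The combination of both branches can be illustrated with a small sketch. It is only an illustration under stated assumptions, not a method from the workshop: the names `RunningStats` and `flag_exceptional`, the z-score rule, and the `threshold` and `warmup` parameters are all hypothetical choices. Each data point is inspected once when it arrives, only a constant-size summary (Welford's one-pass mean and variance) is kept, and points far from the running mean are reported as exceptional.

```python
import math


class RunningStats:
    """One-pass mean and variance (Welford's algorithm); no past data points are stored."""

    def __init__(self):
        self.n = 0        # number of points seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Sample standard deviation; 0.0 until at least two points have arrived.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_exceptional(stream, threshold=3.0, warmup=5):
    """Yield (index, value) for points more than `threshold` standard deviations
    away from the running mean of the points seen before them (assumed rule)."""
    stats = RunningStats()
    for i, x in enumerate(stream):
        std = stats.std()
        if stats.n >= warmup and std > 0 and abs(x - stats.mean) > threshold * std:
            yield i, x
        # The summary is updated after the check, then the point is discarded.
        stats.update(x)


readings = [10, 11, 9, 10, 12, 10, 11, 50, 10, 9]
print(list(flag_exceptional(readings)))
```

Because the summary has constant size, memory use does not grow with the length of the stream; the price is that a single extreme value inflates the running variance and can mask later exceptions, which is one reason more robust stream summaries are studied.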


Submissions are possible as either a full paper or extended abstract. Full papers should present original studies that combine aspects of both the following branches of data analysis:

  • stream mining: extracting the relevant information from data that arrives at such a high throughput rate that analysis or even recording of the individual records is infeasible;
  • local exceptionality mining: finding subsets of the data where something exceptional is going on.

In addition, extended abstracts may present position statements or results of original studies concerning only one of the aforementioned branches.

Full papers can consist of a maximum of 12 pages; extended abstracts of up to 4 pages, following the LNI formatting guidelines. The only accepted format for submitted papers is PDF. Each paper submission will be reviewed by at least two members of the program committee.


NEM Position Paper on Big and Open Data

"NEM position papers are documents giving the NEM Initiative view on any subject related to the networked electronic media area. The NEM position papers typically include: letters of advice to the Commission, formal opinions submitted to the Commissioner, submissions to regulatory bodies, or any other formal statement of this nature, as well as further views of the NEM community on various technological, societal, and policy issues related to NEM." Source: www.nem-initiative.org


Many companies hope for big data

Our students at LS 8 learn exactly what is in demand at many companies.


Dortmund postdoc Wouter Duivesteijn wins C.J. Kok Jury Award 2013

Annually, the Faculty of Science at Leiden University, the Netherlands, grants the C.J. Kok Jury Award for the best PhD thesis of the past year. All institutes within the faculty (astronomy, physics, mathematics, computer science, chemistry, pharmacy, biology, and environmental sciences) are given the opportunity to nominate candidates for the award.

Out of a pool of over 120 dissertations, the C.J. Kok Jury Award 2013 was won by Wouter Duivesteijn, with his thesis "Exceptional Model Mining". Notably, this is the first time ever that the award (existing since 1971) has been bestowed upon a computer scientist.

Book Announcement: RapidMiner: Data Mining Use Cases and Business Analytics Applications

The book "RapidMiner: Data Mining Use Cases and Business Analytics Applications" was published on 6 November 2013 by Chapman and Hall/CRC.

"In this book, case studies communicate how to analyze databases, text collections, and image data. … How the given data are transformed to meet the requirements of the method is illustrated by screenshots of RapidMiner. The RapidMiner processes and datasets described in the case studies are published on the companion web page of this book. The inspiring applications may be used as a blueprint and a justification of future applications."
—From the Foreword by Professor Dr. Katharina Morik, Technical University of Dortmund


LS 8 SFB paper wins award at ECML PKDD 2013

The paper Spatio-Temporal Random Fields: Compressible Representation and Distributed Estimation by Nico Piatkowski, Sangkyun Lee and Katharina Morik is the winner of this year's ECML PKDD 2013 machine learning best student paper award. The ceremony took place on Monday, September 23rd, in Prague (www.ecmlpkdd2013.org).

The article was selected for journal publication out of 182 papers. With an acceptance rate of 7%, there were 14 accepted journal publications. 124 papers were selected out of 460 submissions for the proceedings (acceptance rate 26%). Of the 138 accepted submissions altogether, 4 won a best paper award. The above article by Nico Piatkowski, Sangkyun Lee and Katharina Morik is one of these.

EDBT/ICDT 2014 Call for Workshops

Workshops take place on the last day of EDBT/ICDT 2014, 28 March 2014. More information about formatting guidelines and registration can be found here.

Deadline: 7 December


EDBT/ICDT 2014 Joint Conference: Call for papers

The International Conference on Extending Database Technology is a leading international forum for database researchers, practitioners, developers, and users to discuss cutting-edge ideas, and to exchange techniques, tools, and experiences related to data management. Data management is an essential enabling technology for scientific, engineering, business, and social communities. Data management technology is driven by the requirements of applications across many scientific and business communities, and runs on diverse technical platforms associated with the web, enterprises, clouds and mobile devices. The database community has a continuing tradition of contributing with models, algorithms and architectures, to the set of tools and applications enabling day-to-day functioning of our societies. Faced with the broad challenges of today's applications, data management technology constantly broadens its reach, exploiting new hardware and software to achieve innovative results.

EDBT 2014 invites submissions of original research contributions, as well as descriptions of industrial and application achievements, and proposals for tutorials and software demonstrations. We encourage submissions relating to all aspects of data management defined broadly, and particularly encourage work on topics of emerging interest in the research and development communities.

Deadline: 15 October 2013


LS8 at the International Broadcasting Convention (IBC) with the EU project Vista-TV

The highly respected conference with an exhibition, IBC, takes place in Amsterdam and Vista-TV is one of the exhibitors. In the Future Zone, Vista-TV presents real-time analytics of Internet-TV use.

"With more than 50,000+ attendees from more than 160 countries, IBC combines a highly respected and peer-reviewed conference with an exhibition that exhibits more than 1,400 leading suppliers of state of the art electronic media technology...
Run by the industry, for the industry, IBC is owned by six industry partners that represent both exhibitors and visitors." (http://www.ibc.org/page.cfm/link=628)
Vista-TV provides users with real-time recommendations of shows and an excellent overview of the current TV program that eases the selection of the channel. In addition, for the producers of shows and for marketing companies, Vista-TV offers real-time statistics of watching behavior. How many use the smartphone, the computer or the large TV screen for watching Internet-TV right now? In which region are the watching users located? From which channel to which other channel do users switch frequently? All these real-time analyses respect the privacy of the users and do not allow a specific user to be traced. The statistics, however, are a source of valuable information.


Football analysis with the streams framework: TechniBall wins the Audience Award!

In close collaboration with the Technion (Israel Institute of Technology), a system for the real-time analysis of football data was built on the *streams* framework for the competition of this year's DEBS conference. The task of the challenge was to compute statistics on the running and playing behavior of the players, who were equipped with motion and positioning sensors of the RedFIR system (Fraunhofer).
For the competition, Chair 8 and the Technion developed the "TechniBall" system on the basis of Christian Bockermann's *streams* framework. TechniBall processes the required statistics considerably faster than real time (more than 250,000 events per second) and was voted the winner of the DEBS Challenge 2013 by the conference audience.


"Machine Learning and Knowledge Discovery in Databases" as one of the top 50% most downloaded eBooks at Springer

Since its online publication on Sep 04, 2008, there have been a total of 11,732 chapter downloads of "Machine Learning and Knowledge Discovery in Databases". In 2012 it was still one of the top 50% most downloaded eBooks in the relevant Springer eBook Collection, with 1,055 downloads.


BBC on the Vista-TV project

The BBC blogs about the Vista-TV project; in the post, Libby Miller shows visualizations of user behavior.

UBICOMM 2013: Call for papers

The goal of the International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies, UBICOMM 2013, is to bring together researchers from academia and practitioners from industry in order to address fundamentals of ubiquitous systems and the new applications related to them. The conference will provide a forum where researchers can present recent research results as well as new research problems and directions related to them. The conference seeks contributions presenting novel research in all aspects of ubiquitous techniques and technologies applied to advanced mobile applications.

Deadline: 17 May 2013


Positions for student assistants

At TU Dortmund, Faculty of Computer Science, Chair VIII, positions for student assistants of up to 10 hours per week are available immediately.

TechniBall - Solution for the DEBS Challenge 2013

LS8 analyzes football games in real time! Each player is equipped with a sensor and so is the ball. The streams framework from LS8 is coupled with the Esper event recognition of Technion.

A better TV experience through data stream algorithms: ViSTA-TV Coding Camp at Chair 8

Television over the Internet (IPTV) plays an ever greater role in today's media landscape. Greater program variety, television on mobile devices, and media libraries are just a few advantages of the new world of television. Optimizing the TV experience for every viewer requires plenty of high tech behind the scenes. The EU project ViSTA-TV studies the TV behavior of users, searches for similar shows, and thereby tries to recommend the best possible program to the viewer. From a favorite show to interesting documentaries or the latest trends: in the abundance of offerings, the right thing is found for every viewer.

ViSTA-TV is a joint project of the universities of Zurich and Amsterdam and Chair 8 of Computer Science at TU Dortmund, together with the companies BBC, Zattoo, and the Dortmund-based Rapid-I. The goal of the project is to analyze the viewing behavior of IPTV users in order, for example, to tailor show recommendations as closely as possible to the needs and preferences of the viewers. To this end, the users' switch-on and channel-switching behavior as well as properties of the video signal (e.g., advertisement detection) are analyzed.

One challenge is the high data rate of the video data, which has to be analyzed in real time. For this purpose, the data stream environment "streams", developed by Christian Bockermann at Chair 8, has been extended with video analysis capabilities. This enables video data to be analyzed simultaneously with the corresponding channel-switching behavior from log data. The results are then processed further within a recommender system in order to offer users a tailored view of the TV program.

The researchers from Dortmund also have the integration of further data sources in view, such as DBpedia, electronic TV guides, or the popular Internet Movie Database (IMDb). In the spirit of "Big Data", all of this information is analyzed promptly, so that information about actors, news, or current trends on Twitter and Facebook also flows into the recommendations.

Coding camp at TU Dortmund

This week, the second coding camp of the ViSTA-TV project takes place at TU Dortmund. It focuses in particular on integrating the modules of the project partners. The goal of the coding camp is a first running prototype of the project that offers program recommendations to viewers via mobile phone apps.

Jugend forscht: regional competition in Dortmund

On 19 February, the regional competition of Jugend forscht takes place in Dortmund. In the rooms of the DASA Arbeitswelt Ausstellung, the young researchers present their ideas and work in various research fields to the jury. For the field of mathematics/computer science, Christian Bockermann, a member of Chair 8 of the Faculty of Computer Science and a staff member in project C1 of the SFB, is on the jury.

Book on Managing and Mining Sensor Data published

The book Managing and Mining Sensor Data has been published as an ebook and will be available as hardcover from 28 February 2013. The collaborative research center supported the book through the authors Marco Stolpe (project B3, Artificial Intelligence) and the guest researcher Kanishka Bhaduri, who contributed the chapter on Distributed Data Mining in Sensor Networks.

Sensor networks in particular provide data at different, distributed locations. For efficient analysis, new technologies need to compute results even when communication resources are constrained.


IEEE International Conference on Data Mining

Katharina Morik organizes a panel on the value of data at the IEEE International Conference on Data Mining.

Two research associates wanted

The Chair of Artificial Intelligence is seeking two research staff members, to start as soon as possible.

  • For the project DDMD (Data Driven Materials Development), a doctoral or postdoctoral researcher is sought. The project is run in cooperation with the University of Duisburg-Essen and RUB. Further details can be found in the job advertisement.
  • For the project KobRA (Korpus-basierte linguistische Recherche und Analyse mit Hilfe von Data-Mining; corpus-based linguistic research and analysis using data mining), a research staff member is likewise sought. The aim is to support corpus-based linguistics with data mining methods. Further details can be found in the job advertisement.

New newspaper article about Katharina Morik published

The German newspaper "Westdeutsche Allgemeine Zeitung" has published an article about Katharina Morik. The full article can be found on their website.

Job advertisement: Development of a process-data-based real-time parameter adaptation in automated production processes

In energy- and resource-intensive industries, the challenge is to achieve rising product quality while simultaneously reducing costs and production times. Principles and methods of quality management and production systems modeled on the Japanese automotive industry are becoming the primary guiding model across industries. As an essential element of the Toyota Production System (TPS), the principle of process-inherent quality control, also known as Jidoka or autonomous automation, makes a decisive contribution. However, for automated, linked production processes such as those found in the steel industry, the Jidoka principle cannot readily be realized by conventional means.

The goal of this doctoral project is to develop and validate a systematic approach to minimizing scrap and optimizing product quality in rigidly linked, automated production processes. One possible approach is the concept of Advanced Process Control. Its central idea is the real-time, process-data-based monitoring and evaluation of production processes with the aim of compensating for short-term process fluctuations and thus ensuring product quality. For the production system outlined above, the doctoral project is to develop an approach that, based on the automated evaluation of process parameters, decides whether the quality of the product currently being processed meets the specifications, or whether and in what form an adjustment of the process parameters is necessary and possible in real time in order to meet the quality specifications. A further decision option is to stop processing the product if the quality deviation cannot be corrected by adapting the production process flow.

Besides developing the theoretical concept, the project comprises a simulation-based validation and, in close cooperation with Deutsche Edelstahlwerke GmbH at its Witten site, the integration of the concept into operational production workflows. State-of-the-art data mining techniques are to be used to solve the task.

Supervisor: Prof. Deuse

Applications, effective immediately, to:

Dipl.-Wirt.-Ing. Uta Spörer
Tel.: +49 (231) 755 – 5787
Fax: +49 (231) 755 – 5772
E-Mail: spoerer@gsoflog.de
Mon-Thu: 8:30 - 12:30



LWA 2012 from 12 to 14 September at the Computer Science Department

LWA stands for "Lernen, Wissen, Adaption" (Learning, Knowledge, Adaptation). It is the joint forum of four special interest groups of the German Computer Science Society (GI). Following the tradition of recent years, LWA provides a joint forum for experienced and young researchers to gain insights into recent trends, technologies and applications, and to promote interaction among the SIGs.

HIGHLIGHTS from the 5th Annual Rexer Analytics Data Miner Survey (2011)

  • SURVEY & PARTICIPANTS: 52-item survey of data miners, conducted on-line in 2011. Participants: 1,319 data miners from over 60 countries.
  • FIELDS & GOALS: Data miners work in a diverse set of fields. CRM/Marketing has been the #1 field for the past five years. Fittingly, “improving the understanding of customers”, “retaining customers” and other CRM goals continue to be the goals identified by the most data miners.
  • ALGORITHMS: Decision trees, regression, and cluster analysis continue to form a triad of core algorithms for most data miners. However, a wide variety of algorithms are being used. A third of data miners currently use text mining and another third plan to do so in the future.
  • TOOLS: R continued its rise this year and is now being used by close to half of all data miners (47%). R users report preferring it for being free, open source, and having a wide variety of algorithms. Many people also cited R's flexibility and the strength of the user community. STATISTICA is selected as the primary data mining tool by the most respondents (17%). Data miners report using an average of 4 software tools. STATISTICA, KNIME, RapidMiner and Salford Systems received the strongest satisfaction ratings in 2011.
  • ANALYTIC CAPABILITY & SUCCESS MEASUREMENT: Only 12% of corporate respondents rate their company as having very high analytic sophistication. However, companies with better analytic capabilities are outperforming their peers. Respondents report analyzing analytic success via Return on Investment (ROI) and analyzing the predictive validity or accuracy of their models. Challenges to measuring success include client or user cooperation and data availability/quality.
  • SHARED INSIGHTS: In the 2010 Survey data miners shared best practices in overcoming the key challenges data miners face. In the 2011 Survey data miners shared their best practices for measuring analytic success and examples of the positive impact that data mining can have to benefit society, health, and the world. Additionally, 225 R users shared information about how and why they are using R.
After the 2011 survey, the Rexer Analytics Data Miner Survey has moved to a biennial schedule; the next Data Miner Survey will be launched in early 2013. Information about Rexer Analytics is available at www.RexerAnalytics.com

Grant application approved under the 4th call for proposals of the Mercator Research Center Ruhr (MERCUR)

The grant application Data Driven Materials Design (DDMD) was approved under the 4th call for proposals of the Mercator Research Center Ruhr (MERCUR). The project is a cooperation between Prof. Dr. Ralf Drautz, Prof. Dr. Alfred Ludwig (both Ruhr-University Bochum), Prof. Dr. Katharina Morik (Chair 8) and Prof. Dr. Sven Rahmann (University of Duisburg-Essen). The combined use of experimental high-throughput methods and analytic modeling in materials research, especially in the fields of thin-layer material libraries, "Attribute-Screenings" and "Advance Materials Simulation", is one of Ruhr-University's unique features, which this application is intended to strengthen. These fields have in common that they generate extremely large amounts of multidimensional data that cannot be analyzed efficiently without the help of computers. Analyzing huge amounts of data is one of TU Dortmund's focus areas; in this case, data mining in particular is addressed. At the University of Duisburg-Essen, the focus is on high-throughput analysis. The intention of this collaboration is to initiate a more rational development of new materials. The application aims to establish the foundation for the field of Data Driven Material Development. In this field, new discoveries as well as new insights (e.g. unknown phases or special physical properties) are expected to be gained. In addition, the development of new materials is to be accelerated.

Best rating: EU proposal INSIGHT

The proposal "INtelligent Synthesis and Real-tIme Response using Massive StreaminG of HeTerogeneous Data" is the best-rated one in the FP7 field "Intelligent Information Management", reaching 14.5 of 15 possible points. The Department of Computer Science at TU Dortmund is involved through Chair 8. The coordinator is Dimitrios Gunopulos (National University of Athens). The project is about analyzing the huge amount of heterogeneous data streams from sensors, mobile phones and control systems in order to improve emergency management. Examples are taken from the city of Dublin and the German Federal Office of Civil Protection and Disaster Assistance. Innovations in data mining make it possible to analyze social networks (e.g. Twitter), sensor networks, and traffic systems in combination and to involve citizens in this process.

ViSTA-TV started on June 1st

Live video content is increasingly consumed over IP networks in addition to traditional broadcasting. The move to IP provides a huge opportunity to discover what people are watching in much greater breadth and depth than currently possible through interviews or set-top box based data gathering by rating organizations. The ViSTA-TV project proposes to gather consumers’ anonymized viewing behavior and the actual video streams combined with enhanced electronic program guide information. ViSTA-TV will be in the position to provide highly accurate market research information about viewing behavior that can be used for a variety of analyses of high interest to all participants in the TV-industry. Furthermore ViSTA-TV will employ the information gathered to build a recommendation service. ViSTA-TV is a European Union-funded research project, beginning on 1 June 2012, and lasting for two years. The Artificial Intelligence Group participates alongside 5 other partners.

RapidMiner tested

Rapid-I is based in Dortmund, Germany, and has been working on RapidMiner, a data mining software, since 2001. With its wide range of other tools such as RapidAnalytics, RapidLab, RapidNet and RapidSentilyzer, it has won clients such as Siemens, Allianz and Pepsico. The website JTonEDM.com introduces Rapid-I and its software RapidMiner in a short overview.

NEW: Master's/Diplom thesis available: Efficient detection of concept drifts under cyclic changes in steel mill processes

In today's industrial plants, sensors record large amounts of data during the production process. From these data, the quality of the end product is inferred while the process is still running. For production reasons, plant components and measurement equipment change during the running process, and they can only be maintained cyclically. These changes are also reflected in the prediction models: concept drift occurs. SMS Siemag and a leading manufacturer of heavy plates provide current production data for this thesis. In the Bachelor's, Diplom, or Master's thesis, strategies for identifying concept drifts and for stabilizing prediction quality are to be developed. A particular challenge is that concept drift detection and correction should happen in real time. The focus of the thesis is therefore on selecting, implementing, and comparing particularly efficient methods for detecting concept drifts.

NEW: Master's/Diplom thesis available: Controlling processes in steel production using multi-criteria optimization

In today's industrial plants, sensors record large amounts of data during the production process. From these data, the quality of the end product is inferred while the process is still running. So far, optimization usually handles only a single target variable. However, the quality of the end product often depends on several target variables, which may moreover conflict with one another. This can be formalized as a multi-criteria optimization problem. In particular, a suitable fitness function must be determined. Users can then derive recommendations for action from the Pareto-optimal solutions. At LS8, sensor data on the production process of a leading manufacturer of heavy plates are available. Using this example, the formalization of conflicting target variables as a multi-criteria optimization can be investigated. Implementations in RapidMiner can be used. The exact task will be adapted depending on whether it becomes a Bachelor's, Diplom, or Master's thesis.
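
As a rough illustration of what "Pareto-optimal" means for conflicting target variables, the following sketch filters a handful of candidate process settings down to the non-dominated ones. It is a minimal example under assumptions: the function names `dominates` and `pareto_front`, the two objectives (both to be minimized), and the numbers are hypothetical and do not come from the thesis topic or from RapidMiner.

```python
def dominates(a, b):
    """True if candidate a is at least as good as b in every objective
    and strictly better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (energy use, deviation from target quality).
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 3.0), (5.0, 4.0)]
print(pareto_front(candidates))
```

Each remaining candidate represents a different trade-off between the two objectives; choosing among them is left to the user, which is where the recommendations for action come in. The quadratic pairwise comparison is adequate for small candidate sets only.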

Special Issue of the international journal Data Mining and KnowledgeDiscovery published!

Together with Kanishka Bhaduri and Hillol Kargupta, Katharina Morik has edited a special issue of the international journal Data Mining and Knowledge Discovery. The special issue on Data Mining for Sustainability including a comprehensive introduction is now online at http://www.springerlink.com/.

Project group presentation: "Kooperatives Datamining mit vernetzten Robotern" (Cooperative Data Mining with Networked Robots)

The project group "Kooperatives Datamining mit vernetzten Robotern" (Cooperative Data Mining with Networked Robots) will be presented on 22.12.2011 at 14:00 sharp in the new premises of Chair 8 (Joseph-von-Fraunhofer-Straße 23, Room 1.48).

New diploma/master's thesis available: Personalizing hotel recommendations based on click paths

Searching for and booking hotels online is nowadays usually done through specialized portals. Filtering by search criteria alone often still returns an unmanageable number of hotels. For long-term customer loyalty to a portal, however, it is crucial to offer, as quickly as possible, hotels that are actually suitable for the respective person (or group of people). Data mining and machine learning methods are to be used to learn user preferences that enable personalized and thus more suitable hotel recommendations. For this purpose, the world's leading portal operator "Hotel Reservation Service" (HRS) is providing data on hotels, portal visitors, customers, bookings, and hotel ratings.

KDD 2011 Workshop on Data Mining Applications in Sustainability in San Diego, CA

The annual ACM SIGKDD conference is the premier international forum for data mining researchers and practitioners from academia, industry, and government to share their ideas, research results and experiences. KDD-2011 will feature keynote presentations, oral paper presentations, poster sessions, workshops, tutorials, panels, exhibits, demonstrations, and the KDD Cup competition. KDD-2011 will run from August 21-24 in San Diego, CA and will feature hundreds of practitioners and academic data miners converging on one location.

Overview of the impact of leading database and data mining journals in 2010 published

As a member of the editorial boards of Knowledge and Information Systems (KAIS) and of Data Mining and Knowledge Discovery (DMKD), Katharina Morik is happy to present the 2010 impact factors of some leading database and data mining journals:
  • ACM Transactions on Information Systems (TOIS): 1.085
  • ACM Transactions on Database Systems (TODS): 1.216
  • Data Mining and Knowledge Discovery (DMKD): 1.238
  • Information Systems (IS): 1.595
  • Data and Knowledge Engineering (DKE): 1.717
  • IEEE Transactions on Knowledge and Data Engineering (TKDE): 1.847
  • Machine Learning (ML): 1.956
  • Knowledge and Information Systems (KAIS): 2
Download the complete list

New topic for a master's/diploma thesis: Feature extraction from video data

Alongside YouTube and similar services, increasing bandwidth is making the Internet ever more interesting for classic television broadcasts as well. While IP-TV has so far mostly been in focus for major sporting events, companies such as zattoo.com already make it possible to watch a large number of different channels and to record and archive programs online. But how do you find interesting programs? What information indicates which programs I will like? Can special-interest channels be distinguished using only information from the video data? This master's thesis is about extracting features that are important for classifying or clustering programs, channels, or television viewers.

Feature Selection Extension for RapidMiner - NEW RELEASE 1.1.3

The Feature Selection Extension for RapidMiner 5 contains operators for feature selection, feature weighting, and classification. All operators are also well suited to high-dimensional data, e.g. microarray data. New in this version are:
  • RCCW - Recursive Conditional Correlation Weighting, a very fast feature subset selection method
  • FCBF - Fast Correlation Based Feature Selection
  • PAM - Classification by Shrunken Centroids
  • BAHSIC - Backward Feature Selection via Hilbert-Schmidt information criterion
  • t-Test - Computes a p-value for the difference of the mean values between two classes
  • Test Significance - Assumes a normal distribution, then checks for equal class variances via an F-test and afterwards computes a p-value via a t-test or Welch test
  • Benjamini-Hochberg-Correction - Performs the correction for FDR on significance values in an AttributeWeights object
Already available in earlier versions are, among others, Recursive Feature Elimination (RFE), minimum Redundancy Maximum Relevance Feature Selection (MRMR) / Correlation based Feature Selection (CFS), and a meta-operator for ensemble feature selection. The most recent version is available for free from SourceForge: https://sourceforge.net/projects/rm-featselext/ .
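
For readers unfamiliar with the Benjamini-Hochberg correction listed above, the following is a minimal sketch of the standard step-up procedure for controlling the false discovery rate. It is not the extension's Java implementation. The function name and the example p-values are assumptions made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-feature p-values, e.g. from a t-test on four features.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
selected = [i for i, q in enumerate(adjusted) if q <= 0.05]
print(adjusted, selected)
```

A feature is kept when its adjusted value is at or below the chosen false discovery rate, here 0.05 as an example threshold.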

RapidMiner is the most popular data mining tool according to KDnuggets poll

RapidMiner is again the most popular data mining tool in the KDnuggets poll.

Colloquium of the Collaborative Research Center SFB 876 on June 30th, 2011: Prof. Preeti Ranjan Panda (Indian Institute of Technology Delhi)

Graphics processor (GPU) architectures have evolved rapidly in recent years with increasing performance demanded by 3D graphics applications such as games. However, challenges exist in integrating complex GPUs into mobile devices because of power and energy constraints, motivating the need for energy efficiency in GPUs. While a significant amount of power optimisation research effort has concentrated on the CPU system, GPU power efficiency is a relatively new and important area because the power consumed by GPUs is similar in magnitude to CPU power. Power and energy efficiency can be introduced into GPUs at many different levels: (i) Hardware component level - queue structures, caches, filter arithmetic units, interconnection networks, processor cores, etc., can be optimised for power. (ii) Algorithm level - the deep and complex graphics processing computation pipeline can be modified to be energy aware. Shader programs written by the user can be transformed to be energy aware. (iii) System level - co-ordination at the level of task allocation, voltage and frequency scaling, etc., requires knowledge and control of several different GPU system components.

Colloquium of the Collaborative Research Center SFB 876 on June 9th, 2011: Prof. Piero Bonatti (University of Naples)

An increasing amount of information is being encoded via ontologies and knowledge representation languages of some sort. Some of these knowledge bases are encoded manually, while others are generated automatically by information extraction techniques. In order to protect the confidentiality of this information, a natural choice consists in encoding policies with the same language as the ontology language. This approach led to so-called "semantic web policies". The semantic web is founded on two knowledge representation languages: description logics and logic programs. In this talk we compare their expressive power as *policy* representation languages, and show that logic programming approaches are currently more mature than description logics, although this picture may change in the near future.

Colloquium of the Collaborative Research Center SFB 876 on May 5th, 2011: Henrik Blunck (University of Aarhus)

Emerging and envisioned applications within domains such as indoor navigation, fire-fighting, and precision agriculture still pose challenges for existing positioning solutions to operate accurately, reliably, and robustly in a variety of environments and conditions and under various application-specific constraints. This talk will first give a brief overview of efforts made in a Danish project to address the challenges mentioned above, and will subsequently focus on addressing the energy constraints imposed by Location-based Services (LBS) running on mobile user devices such as smartphones. A variety of LBS, including services for navigation, location-based search, social networking, games, and health and sports trackers, demand the positioning and trajectory tracking of smartphones. To be useful, such tracking has to be energy-efficient to avoid having a major impact on the battery life of the mobile device, since the battery capacity in modern smartphones is a scarce resource and is not increasing at the same pace as new power-demanding features, including various positioning sensors, are added to such devices. We present novel on-device sensor management and trajectory updating strategies which intelligently determine when to sample different on-device positioning sensors (accelerometer, compass and GPS), when data should be sent to a remote server, and to what extent it should be simplified beforehand in order to save communication costs. The resulting system is provided as a uniform framework for both position and trajectory tracking and is configurable with regard to accuracy requirements. The effectiveness of our approach and the energy savings achievable are demonstrated both by emulation experiments using real-world data and by real-world deployments.

MonetDB: Open-source Columnar Database Technology Beyond Textbooks - Talk by Stefan Manegold

Stefan Manegold from CWI Amsterdam will give a talk on the column-store DBMS MonetDB on 2011/02/11 at 16:00 in Room E23, Otto-Hahn-Straße 14.

Column-store database management systems have recently experienced a considerable boost in popularity. The underlying ideas, however, date back to (at least) the mid-1980s, and the technology has been pioneered since the early 1990s in the MonetDB system, a column-store research prototype that has been developed into a complete SQL- and XML/XQuery-compliant column-store DBMS freely available as open source. In addition to its column-store backbone, MonetDB focuses on high-performance hardware-conscious algorithms, novel workload-adaptive query processing techniques such as "cracking", "recycling" and run-time query optimization, and extensibility at all layers of its software stack.

In this talk, we will provide detailed insight into MonetDB's column-store architecture and query-processing technology as available in open-source, discussing its benefits for data mining, OLAP, BI, as well as science workloads.

Opening colloquium of SFB 876 - slides now online!

The new Collaborative Research Center SFB 876 "Providing Information by Resource-Constrained Data Analysis" starts the new year with a kick-off colloquium. The colloquium takes place on January 20th 2011 starting at 4 pm at auditorium E23, Otto-Hahn-Straße 14, TU Dortmund University campus. For further information about the program and speeches please have a look at the attachment.

SFB 876 - The application deadline has passed

At this time, no further applications for open positions at the SFB 876 are being accepted.

SFB 876 granted!

The DFG granted the SFB 876.

Presentations online!

First presentations and pictures available on the MODAP workshop website.

First International Workshop on Social and Privacy aspects of the Mobility

Analyzing huge amounts of mobility data has posed new challenges not only in the discovery and interpretation of interesting patterns, but also in preserving the privacy of the individuals under observation. However, the social and privacy aspects of mobility have not been studied in a systematic and combined way, and the awareness and understanding of their effects on our lives are still in their infancy. The convergence of these complementary aspects, and more specifically the way that mobility affects (or is affected by) the social behavior of individuals and their privacy, gives rise to the exciting new area of "socio-mobility". Socio-mobility raises a number of challenging questions. Are people who move together socially related? Are there social relations between people moving to semantically similar places? How could we combine mobility data and patterns with social networking information? Can social interactions be mined from mobility data by using external media? To what extent do social interactions affect privacy? What are the risks of disclosing social interactions between people and how can we design privacy-preserving techniques to minimize the risks? What kinds of social interactions are considered sensitive and how can we model / distort / suppress such interactions?

Interdisciplinary College in Günne at Lake Möhne, March 25 - April 1, 2011

The Interdisciplinary College (IK) is an annual, intense one-week spring school which offers a dense state-of-the-art course program in neurobiology, neural computation, cognitive science/psychology, artificial intelligence, robotics and philosophy. It is aimed at students, postgraduates and researchers from academia and industry.

PG 542 Final presentation

The student project group 542 "Stream Mining for Intrusion Detection in Distributed Systems" has successfully finished its work on a generic framework for online and distributed data mining. All results, including the system architecture, the evaluation of learning algorithms, and a live demo covering the use case of intrusion detection, will be presented on Thursday, 28th October, at 10.15 in GB 4, room 136.

RapidMiner Hierarchical Heavy Hitters Plugin

After presenting our paper "Implementing Hierarchical Heavy Hitters in RapidMiner: Solutions and Open Questions" at RCOMM 2010, we have released all accompanying Java code as a RapidMiner 5 plugin. The plugin can be used to calculate Hierarchical Heavy Hitters on system call data. It also contains domain-independent implementations of the related algorithms in Java.

RapidMiner Microarray Feature Selection Plugin released

The Microarray Feature Selection Plugin for RapidMiner 5 contains feature selection and feature weighting operators useful for working on high-dimensional (microarray) data. These include, among others, Recursive Feature Elimination (RFE), minimum Redundancy Maximum Relevance Feature Selection (MRMR) / Correlation based Feature Selection (CFS), and a meta-operator for ensemble feature selection.

RapidMiner Information Extraction Extension

The RapidMiner Information Extraction extension supports Information Extraction techniques in RapidMiner. Visualizers, annotators and preprocessing operators have been implemented for textual data. Structured models, namely Conditional Random Fields, are available for the extraction of named entities. Operators for extracting relations will be available soon.

Summer School on Mobility, Data Mining, and Privacy

The 1st Summer School on Mobility, Data Mining, and Privacy is co-organized by the FP7/ICT project MODAP "Mobility, Data Mining, and Privacy" (www.modap.org) and the COST Action IC0903 MOVE "Knowledge Discovery from Moving Objects" (http://move-cost.info/). This is the first doctoral school ever on the 'hot' intersection of three domains: modeling and management of moving object databases (Mobility), data analysis and knowledge discovery from mobility data (Data Mining), and privacy aspects that arise when processing human mobility (Privacy).

NEW Book: Ubiquitous Knowledge Discovery

Knowledge discovery in ubiquitous environments is an emerging area of research at the intersection of the two major challenges of highly distributed and mobile systems and advanced knowledge discovery systems. The new book, edited by Michael May and Lorenza Saitta, provides a state-of-the-art survey. It is the outcome of a large number of workshops, summer schools, tutorials and dissemination events of the European project KDubiq.

Initiative on data analysis under resource constraints - meeting in Bommerholz on August 24-25, 2009

Bringing together embedded systems and data mining enables new solutions in computer science, biomedicine, physics and mechanical engineering. Embedded systems can be further improved using machine learning, while data mining algorithms can be realized in hardware, e.g. FPGAs. The restrictions in computing power, memory and energy demand new algorithms for known learning tasks. At Bommerholz, 26 scientists and researchers from TU Dortmund and University Duisburg-Essen came together to gain a deeper understanding of the topic and to exchange progress on ongoing projects.

RapidMiner -- most used open source data mining tool

RapidMiner is the most successful open source data mining tool for the third year in a row -- only the commercial product Clementine (SPSS PASW Modeler) is more popular.


Asian Conference on Machine Learning
November 8-10, 2010, Tokyo, Japan

Special Issue on Sustainability of the Data Mining and Knowledge Discovery Journal


Recording of Talk

Katharina Morik, "Handling Texts -- A Challenge for Data Mining", talk (in English), introduced by Jean-Gabriel Ganascia at the 9th Francophone expert conference on Machine Learning and Data Mining, Strasbourg 2009

Machine learning and biology

Lecture by Yoav Freund at ECML PKDD 2008 on machine learning and biology

BioDatabases (bioDatabases.m4v, 170.6 MB)

Informatik kompakt

Drawing on the experience of the 1999 lecture DAP1, a new textbook has finally been published. The book introduces the fundamentals of the common core of different computer science areas by means of the programming language Java.

Equal opportunities for women at universities

Prof. Dr. Katharina Morik was asked for a statement about equal opportunity for women.

The resulting TV report was shown on 07/11/2007 during the "tagesschau" news broadcast.

Source: Tagesschau-archive


Several programs have been developed at the AI unit within its research activities, such as myKLR, SVMlight, mySVM, RapidMiner (formerly YALE), the Information Layer or the USCHIFICATOR. Check our software page for a complete list.

LS8 secretariat closed over the turn of the year

The LS8 office will not be staffed between 24.12.2021 and 02.01.2022. During this time the TU Dortmund is completely closed.

With good wishes for a peaceful holiday season and a confident, healthy start into the year 2022! Your LS8-Team

The chair is moving!

Statues of Easter Island