The summer school will be accompanied by an exchange platform for participants, the Students' Corner, which will
allow them to network and share their research.
Furthermore, researchers from ML2R and SFB 876 will also present their work there.
Participants therefore need to register for the Summer School and should apply for the Students' Corner with an
abstract for the research they want to present.
A committee will select the most interesting applications. Registered participants will be informed within two
weeks whether their application has been accepted.
How does the Students' Corner work?
Each day at 3PM CEST the Students' Corner opens.
Participants present their research in defined time slots of approx. 2h on our online platform.
This does not mean that you have to give a full 2h presentation!
The Students' Corner works similarly to poster presentations at a conference where you can informally talk with
visitors about your research.
At the beginning of the Students' Corner, a random group of participants will visit your corner.
After 30 minutes groups will switch and a new group of participants will visit your corner.
For each day there will be a catalogue of available presentations.
The participants can choose the form of their presentation themselves.
Other participants will enter the virtual space of a presentation, listen and discuss with the presenter.
Possible presentation types are, e.g.:
- A screencast of software or of an analysis/project (video),
- A hands-on session: live screen sharing and explaining a project,
- A slide presentation, e.g. in PowerPoint,
- A poster presentation.
Students' Corner Catalog
The following presentations will be given by participants in our Students' Corner.
Autonomous vehicles that operate in real-world environments are subjected to unpredictable conditions. Successful maintenance of the position in such environments has recently been achieved with the aid of Deep Reinforcement Learning. Such superhuman operations only became possible through the integration of Deep Neural Networks (DNNs). Despite their superior performance, DNNs are often seen as black boxes, as neither the acquired knowledge nor the decision rationale can be explained. This raises issues with the transparency and trustworthiness of the system because the underlying AI models are not governed by any mathematical or physical laws. Due to these issues, regardless of DNNs’ unsurpassed performance, many real-world applications opt to use traditional prediction mechanisms. In this paper, we attempt to explain the process of a DNN utilised in an autonomous dynamic positioning system. The explainability of the DNN stems from a bio-inspired model which gauges reactions of the DNN to predefined limits. Sequences of time-series data corresponding to the DNN response and the environment are captured for analysis. The existence of zero overlapping repeating patterns in the time series of the DNN response indicates that the DNN has learned and can predict the environmental conditions. However, explaining the DNN process using only these repeating patterns is inadequate, as these patterns do not provide a connection to environmental conditions. Hence, we introduce a novel technique to reduce these sequences and the corresponding conditions into single digits. This enables obtaining a comparable view of the entire scenario. By applying the K-Nearest-Neighbour algorithm on the above comparable view, 3 distinct clusters were identified. They expressed multiple intensities of environmental conditions, namely 44% moderate conditions, 33% harsh conditions and 23% mild conditions.
The centroids and the dispersion of the above clusters revealed that the DNN displays behaviours similar to the human response when walking a tightrope.
The intrinsic dimensionality refers to the "true" dimensionality of the data, as opposed to the dimensionality of the data representation. For example, when attributes are highly correlated, the intrinsic dimensionality can be much lower than the number of variables. Local intrinsic dimensionality refers to the observation that this property can vary for different parts of the data set; and intrinsic dimensionality can serve as a proxy for the local difficulty of the data set. Most popular methods for estimating the local intrinsic dimensionality are based on distances, and the rate at which the distances to the nearest neighbors increase, a concept known as "expansion dimension". We introduce an orthogonal concept, which does not use any distances: we use the distribution of angles between neighbor points. We derive the theoretical distribution of angles and use this to construct an estimator for intrinsic dimensionality. Experimentally, we verify that this measure behaves similarly, but complementarily, to existing measures of intrinsic dimensionality. By introducing a new idea of intrinsic dimensionality to the research community, we hope to contribute to a better understanding of intrinsic dimensionality and to spur new research in this direction.
This work applies technology to a societal problem in support of one of the UN’s 17 Sustainable Development Goals (SDGs), namely the 3rd SDG: Good Health and Well-Being. For the visually impaired population of the world, this paper proposes an assistant which detects and identifies objects in real time and provides feedback to its user. The main purpose is to assist the visually impaired user in an indoor environment by receiving an audio query and giving audio feedback. The proposed system consists of a microcomputer attached to a depth-sensing camera and an audio I/O device. The system is capable of completing all the necessary processing: detecting objects in the surroundings and measuring their distance and direction (location); identifying the objects named in the user’s query by recognizing the user’s speech and converting it into suitable text; and locating obstacles and providing audio feedback to the user. The visually impaired user just has to send an audio query and receive the system’s audio feedback to get to the required indoor object.
Abstract 1: Inverse graphics, such as recovering 3D information that is lost during image capture, is a big challenge. 3D scene understanding and the generation of high-quality 3D shapes and textures have found vast applications in virtual reality, computer-aided geometric design, autonomous vehicles, robotics, medicine, etc. Traditional multi-view stereo and shape-from-shading methods do not use shape priors, are limited to low resolutions, require volumetric fusion steps, etc. Learning-based models such as differentiable neural rendering, which provide 3D shape/texture or novel views of the scene without 3D supervision, have gained the attention of the research community. I am interested in designing a 3D scene understanding or 3D reconstruction model by leveraging a novel differentiable neural renderer and implicit representation. My objective is to recover state-of-the-art, high-quality 3D reconstructions of real objects from single or multiple views.
Abstract 2: Crop diseases pose a major threat to world food security, and their timely identification is of utmost importance. However, the prevalence of distortion and motion blur in crop images during image acquisition makes image enhancement a mandatory step. Recently, deep learning (DL)-based approaches have shown state-of-the-art results for image enhancement. However, one critical limitation of these DL approaches is the requirement of high-quality noiseless ground-truth images, which are difficult to obtain in many agricultural applications. To solve this issue, we leverage a recently proposed DL method that does not require ground-truth images to remove the motion blur. We show the effectiveness of our proposed approach on wheat and rice plant disease datasets.
!!! HIGH-QUALITY DOCUMENTARY-STYLE VIDEO !!! Many tasks are involved in analyzing data from astro-particle telescopes. First of all, let me show you why we need these telescopes and why machine learning is important to them. Since all labeled data is obtained from simulations, how do we deal with data quality issues? How can we reduce the resource footprint of the analysis pipeline? Answers to these questions embrace imbalanced learning, domain adaptation, active class selection, deep learning, and probability calibration, among others.
Project C4 of the SFB 876 develops regression approaches for large-scale high-dimensional data. We present two of our recent works where the number of observations is much larger than the number of variables. The first one deals with constructing coresets in the context of logistic regression in a theoretical manner, whereas the second one introduces the R package mrregression for frequentist and Bayesian linear regression on large data sets. Coresets are one of the central methods to facilitate the analysis of large data. Since the usual algorithms for logistic regression are too slow if the data is big, we first need to reduce the amount of data while preserving important information. This is exactly what a so-called coreset does. In order to get a coreset, we sample points with a probability proportional to their sensitivity scores. While uniform sampling does not preserve important points, sensitivity sampling does. We first apply an appropriate sketching algorithm. Using the sketch, we can compute a basis that we use to determine the sensitivity scores and at the same time sample the points in a second pass. The R package mrregression enables frequentist and Bayesian linear regression on large data sets, e.g. data that does not fit into memory. It is an implementation of the methodology described in Geppert et al. (2020), who transfer the framework of Merge & Reduce from data structures to statistical models. The slide presentation will include an overview of the package’s main functionalities as well as a short introduction to the Merge & Reduce framework. On request, a live demonstration of the package is possible.
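As an illustration of the sampling step described in this abstract, the following minimal Python sketch draws points with probability proportional to given scores and reweights each draw so that weighted sums remain unbiased. The function name sensitivity_sample, the example scores and the seed are hypothetical. The sketching pass and the computation of the actual sensitivity scores are not reproduced here.

```python
import random


def sensitivity_sample(scores, k, seed=0):
    """Draw k indices with probability proportional to `scores`.

    Returns (index, weight) pairs. The weight 1 / (k * p_i) is the usual
    importance-sampling correction. The scores are assumed to be given;
    how they are computed is outside the scope of this sketch.
    """
    rng = random.Random(seed)
    total = sum(scores)
    probs = [s / total for s in scores]
    # Sampling with replacement, proportional to the scores.
    indices = rng.choices(range(len(scores)), weights=probs, k=k)
    return [(i, 1.0 / (k * probs[i])) for i in indices]


# Hypothetical scores: the third point is far more "important" than the others.
example_scores = [1.0, 1.0, 8.0]
example_sample = sensitivity_sample(example_scores, k=5)
```

Points with a high score are drawn often but receive a small weight, while rarely drawn points receive a large one, which is what distinguishes this scheme from uniform sampling.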
A multi-level system has been proposed to deal with the problem of fake license plates by integrating ALPR with the recognition of other features such as vehicle color, make and model. YOLOv3 is applied for vehicle type detection and WPOD-NET is employed to rectify detected license plates, which are fed to an OCR. Several promising CNNs have also been trained and tested using the Stanford Cars-196 dataset, where Xception outperformed previous approaches with 96.7% accuracy. A novel deep neural network for vehicle color recognition has also been implemented, which is not only computationally inexpensive, but also outperforms other competitive methods.
BIRCH clustering is a widely known approach for clustering that has influenced much subsequent research and commercial products. The key contribution of BIRCH is the Clustering Feature tree (CF-Tree), which is a compressed representation of the input data. As new data arrives, the tree is eventually rebuilt to increase the compression. Afterward, the leaves of the tree are used for clustering. Because of the data compression, this method is very scalable. The idea has been adopted for example for k-means, data stream, and density-based clustering. Clustering features used by BIRCH are simple summary statistics that can easily be updated with new data: the number of points, the linear sums, and the sum of squared values. Unfortunately, how the sum of squares is then used in BIRCH is prone to catastrophic cancellation. We introduce a replacement cluster feature that does not have this numeric problem, that is not much more expensive to maintain, and which makes many computations simpler and hence more efficient. These cluster features can also easily be used in other work derived from BIRCH, such as algorithms for streaming data. In the experiments, we demonstrate the numerical problem and compare the performance of the original algorithm with that of the improved cluster features.
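The cancellation problem mentioned in this abstract can be reproduced in a few lines. The Python sketch below compares the variance computed from the classic summary statistics (number of points, linear sum, sum of squares) with a Welford-style update that maintains the mean and the sum of squared deviations instead. The Welford-style variant is an assumption chosen for illustration and is not necessarily the replacement feature proposed by the authors. Function names and data are hypothetical.

```python
def variance_from_sums(values):
    """Variance from (N, linear sum, sum of squares): SS/N - (LS/N)^2."""
    n = len(values)
    linear_sum = sum(values)
    square_sum = sum(v * v for v in values)
    # Two huge, nearly equal numbers are subtracted here, so most
    # significant digits cancel when the data lies far from zero.
    return square_sum / n - (linear_sum / n) ** 2


def variance_from_deviations(values):
    """Variance from (N, mean, sum of squared deviations), updated per point."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for v in values:
        n += 1
        delta = v - mean
        mean += delta / n
        m2 += delta * (v - mean)
    return m2 / n


# Three points with a large offset; the exact variance is 2/3.
data = [1e9 + 0.0, 1e9 + 1.0, 1e9 + 2.0]
naive = variance_from_sums(data)
stable = variance_from_deviations(data)
```

On this data, stable agrees with the exact value 2/3, while naive is off by a large margin because the squared values exceed the precision of double-precision floats.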
Due to the increasing trend in the amount of heavy metals in Asia, North America, South America, Europe and Africa, innovative water purification technologies are in demand globally. A novel synthesis protocol for Nanocellulose and a Nanocellulose/GO composite was developed and optimized to first prepare nanocellulose crystals using inexpensive chemicals and a minimum amount of hazardous chemicals; the protocol has the potential to be scaled up to an industrial level. This Nanocellulose was then combined with GO synthesized by Hummers' method to produce a composite solution, from which the composite membranes were produced. The composite membrane is expected to show resistance towards bio-fouling, higher mechanical strength and improved adsorption capacity. I further plan to employ Machine learning models such as SVM, kNN, Decision tree, Linear regression & SVM regression to develop intelligent nanocomposite membranes.
Spatio-temporal data sets such as satellite image series are of utmost importance for understanding global developments like climate change or urbanization. However, incompleteness of data can greatly impact usability and knowledge discovery. I will here present my recent approach for filling data gaps based on probabilistic machine learning. Experimental results on satellite data shall serve as an illustrative example.
Supervised Machine Learning (especially Decision Trees) offers good possibilities to describe data-inherent knowledge intuitively, even without knowledge about Machine Learning. However, the training process of supervised Machine Learning algorithms requires trustworthy labels, which are not always given. To overcome this problem, clustering algorithms can be used to separate good and defective products. These clusters can be used as labels within the training process of supervised Machine Learning algorithms. This interaction of clustering algorithms and supervised Machine Learning is especially needed where human process experts have to verify automatically generated inspection results: since these verifications are prone to human errors, they are not trustworthy as labels for supervised Machine Learning.
The increase in the availability of wearable devices offers a prospective solution to the increasing demand for elderly human activity monitoring, with the aim of improving the independent living standard of the growing population of elderly people. With the availability of wearable devices fully embedded with sensors that are used for human health monitoring, many techniques have been proposed and used in the process. However, most of the publicly available datasets in use today are collected in fully controlled or semi-natural settings. Also, elderly people from rural areas and transitional activities are mostly not considered, which causes a lack of generalization of the desired human activity recognition (HAR) models. The purpose of this research is to collect a new dataset from elderly people in a rural area and to find the best sensor position between ankle and waist by subjecting the newly collected dataset to different machine learning classifiers. A sliding window technique with 50% overlap was used to segment the sensor data collected from the elderly subjects. Relevant features were extracted and selected using the wrapper method. The results show that the sensor attached at the waist position yields a better result than the one at the ankle position on the newly collected elderly data. The KNN algorithm has the highest accuracy in both cases compared to the remaining tested classifiers.
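The segmentation step named in this abstract can be sketched as follows. This minimal Python example cuts a sequence of sensor readings into fixed-length windows in which each window shares half of its samples with the previous one. The function name sliding_windows, the window length and the toy signal are hypothetical, since the abstract does not state the window size that was actually used.

```python
def sliding_windows(samples, window_size, overlap=0.5):
    """Split `samples` into windows of `window_size` with the given overlap.

    With overlap=0.5 the start of each window advances by half a window.
    Trailing samples that do not fill a complete window are dropped.
    """
    step = max(1, int(window_size * (1 - overlap)))
    last_start = len(samples) - window_size
    return [samples[i:i + window_size] for i in range(0, last_start + 1, step)]


# Toy signal of eight readings, segmented into windows of four readings.
toy_signal = list(range(8))
toy_windows = sliding_windows(toy_signal, window_size=4)
```

Features such as the mean or standard deviation would then be computed per window before feature selection and classification.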
The amount and quality of training data is crucial for the performance of machine learning algorithms. Simulation data are most often used for testing the trained models or for an additional annotation, e.g., assigning labels to observed data. They can be considered as background knowledge for domain experts. We want to integrate this knowledge into the machine learning process and, at the same time, use the simulation as an additional data source. We show an overview of the work that has been carried out in the subproject B3 of SFB 876 “Data Mining on Sensor Data of Automated Processes” to fuse simulation and sensor data to optimize milling processes.
Haptic perception is a crucial aspect of the human capability to assess the properties of objects. Two fundamental input types, cutaneous and kinesthetic, have been shown to be the basis of the perception of these objects. Although roughness has been studied widely and some models have attempted to explain this human perception modality, hardness and softness along with the cutaneous inputs need to be investigated further. Since studies showed that hardness/softness perception relies on the contact area and the deformation of the object, we investigated the relation between objects with different hardness attributes and the force applied on these objects, the estimation of the applied force and the estimation of hardness. In an experimental setting, we asked participants to evaluate eight objects with different softness properties by touching them and then assessing the force they applied. We also measured the applied force during the tactile exploration of the objects. In this study we showed that when presented with a hardness estimation/discrimination task, participants tend to apply a stronger force on the hard objects and a weaker force on the soft objects, with precise recognition of the force they applied. We modelled the relationships between different parameters of the data using regression models.
Time series analysis is often required in daily tasks. Neural networks are known to extract features automatically. Can neural networks extract time features? Can the extracted features compete with conventional methods? These questions are tackled from the song analyst perspective.
The communication between data-generating devices is partially responsible for a growing portion of the world's power consumption. Thus, reducing communication is vital, both from an economical and an ecological perspective. For machine learning, on-device learning avoids sending raw data, which can reduce communication substantially. Furthermore, not centralizing the data protects privacy-sensitive data. However, most learning algorithms require hardware with high computation power and thus high energy consumption. In contrast, ultra-low-power processors, like FPGAs or micro-controllers, allow for energy-efficient learning of local models. Combined with communication-efficient distributed learning strategies, this reduces the overall energy consumption and enables applications that were previously impossible due to limited energy on local devices. The major challenge is then that the low-power processors typically only have integer processing capabilities. This paper investigates an approach to communication-efficient on-device learning of integer exponential families that can be executed on low-power processors, is privacy-preserving, and effectively minimizes communication. The empirical evaluation shows that the approach can reach a model quality comparable to a centrally learned regular model with an order of magnitude less communication. Comparing the overall energy consumption, this reduces the required energy for solving the machine learning task by a significant amount.
A long-standing and challenging goal in neuroscience research is to understand how visual stimuli are represented in the brain and to decode these sensory inputs from brain activity. This enhances our understanding of how sensory information is encoded in the brain, which in turn offers insights into how the brain processes information.
We develop a neural decoding pipeline to reconstruct a person’s visual experience. We use deep generative models, training a custom Variational Autoencoder (VAE) to reconstruct the visual stimuli from functional Magnetic Resonance Imaging (fMRI) data. We propose using the information encoded in the latent space of the autoencoder network to approximate the feature representation of the perceived visual stimuli; the high-dimensional encoding of information in this latent space is instrumental in creating mappings between the combined space of fMRI patterns and this latent space. These mappings are then used for image reconstruction by predicting latent vectors from given fMRI patterns. We also explore the feasibility of an alternate approach for reconstruction where we generate the latent vectors using the VAE and then use GANs to learn the mapping of images and their fMRI patterns. These mappings are then used for image reconstruction by predicting latent vectors from given fMRI patterns.
Our neural decoding pipeline is successfully able to untangle and extract semantically meaningful information from brain activity patterns. Lastly we investigate whether the human brain representations are homologous to the latent space of deep generative neural networks. Our results suggest that it may be possible to reconstruct a picture of a person’s visual experience from measurements of brain activity alone.
The Students' Corner will take place on the Summer School's Discord server. The link to the server will be handed
out to registered participants shortly before the Summer School.