Research Seminar

The research seminar is aimed at interested Master's and Bachelor's students. Other interested parties are, however, welcome at any time! Students and staff of the Professorship of Artificial Intelligence present current research-oriented topics. Talks are usually given in English. Please refer to the announcements on this page for the exact date of each event.

Information for Bachelor's and Master's students

The seminar talks required by the degree programmes (the "Hauptseminar" in the Bachelor-IF/AIF programme and the "Forschungsseminar" in the Master's programme) can be given as part of this event. Both courses (Bachelor Hauptseminar and Master Forschungsseminar) aim for participants to independently acquire research-relevant knowledge and then present it in a talk. Candidates are expected to have sufficient background knowledge, which is usually acquired by attending the lectures Neurocomputing (formerly Machine Learning) or Neurocognition (I+II). Research topics typically come from the areas of artificial intelligence, neurocomputing, deep reinforcement learning, neurocognition, and neurorobotic and intelligent agents in virtual reality. Other topic proposals are equally welcome!
The seminar is conducted by individual arrangement. Interested students are welcome to contact Prof. Hamker without obligation if they would like to complete one of the two seminar courses with us.

Past events

Investigation of Reward-Guided Plasticity in Recurrent Neural Networks for Working Memory Tasks

Max Werler

Thu, 16. 5. 2024, 1/368 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

A variety of recent works have indicated that perturbation learning provides a valid approach to training the recurrent weights of an artificial recurrent neural network. In this context, Miconi (2017) demonstrated how the resulting deflections of the neurons' excitation can be captured in a Hebbian manner from information that is locally available to synapses, and how these can be integrated into a reward-guided weight update rule, yielding a biologically plausible training algorithm that is capable of solving cognitive tasks. However, his learning architecture suffers from a significant flaw: for some random initial conditions it can effectively prevent learning progress, and it can throw a successfully converged network back to an error level like that observed before training. In this thesis, we investigate these scenarios with the aim of finding the underlying network properties that cause this undesired behavior. As low intrinsic activity and an imbalance between excitation and inhibition were found to correlate strongly with these phenomena, related learning rules as well as a different input weight initialization scheme have been proposed and evaluated. While our results show that we were able to greatly enhance the speed and reliability of the initial network convergence, the possibility of a sudden deterioration of a temporarily successful network remains.

Comparison of two motor learning models of the basal ganglia and cerebellum

Christoph Ruff

Thu, 2. 5. 2024, 1/368 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

A model of motor learning developed at this professorship by Baladron et al. [1] will be compared with a motor learning model from Todorov et al. [2]. Both models use a model of the basal ganglia (BG) and the cerebellum (CB). The BG chooses a certain action and reaches to that location, while the CB fine-tunes the reached location to come closer to the target or adapts the movement to an altered location. The two models are structured differently and therefore function differently. During the seminar I will take a closer look at how these two models differ and what their advantages and disadvantages are. The tasks on which the two models were trained differ as well. As part of my internship, I trained the model from Baladron et al. [1] on the tasks of Todorov et al. [2] to see which parameter values and adaptations are necessary to get a similar result and whether it is even possible to train the model on these tasks. The results of this implementation will be presented as well.

Neuromorphic Computing as a Low Power, Minimal Footprint Solution for Dynamical System Control

Valentin Forch

Thu, 18. 4. 2024, 1/368 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In cooperation with the Research Center for Materials, Architectures and Integration of Nanomembranes (MAIN), we develop a neural network architecture that will enable the control of autonomous modular micro robots carrying CMOS chiplets. Realizing a neural network controller on this scale poses multiple challenges: the machine must run on a minimal energy budget, have a minimal memory footprint, and can only be built on top of a low-level instruction set. Further, these machines should in principle be able to adapt to changing environments without complete retraining.
To address these challenges, we start by optimizing recurrent spiking neural networks for simple motor control benchmarks in an evolutionary framework. The networks possess a simplified integrate-and-fire neuron model and ultra-low-resolution synapses. We further reduce the memory footprint by optimizing only subsets of the network connectivity matrix and by controlling multiple network weights through single bits. By controlling the neural network and resulting motor activity, we find well-performing solutions that minimize energy consumption. Lastly, we introduce a novel approach for optimizing agents through highly parallelized swarm evolution.

ANNarchy User Forum

Helge Ülo Dinkelbach, Julien Vitay

Thu, 11. 4. 2024, 1/368 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Topics will be:

  1. ANNarchy 4.7.3
    • Introduction to the features and extensions since 4.7.0
  2. Spotlight features and discussion
    • Float-precision handling in ANNarchy
    • Auto-tuning in ANNarchy, the new default?
  3. Future developments
  4. Resources overview
  5. Open forum
    • Removal of the MagicNetwork
    • Discussing the future network interface and handling
    • Discussing your feature requests and ideas

Effects of deep brain stimulation on habitual learning in the basal ganglia

Dave Apenburg

Tue, 12. 3. 2024, 367 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Deep brain stimulation (DBS) is an effective treatment for alleviating movement disorders in dystonia patients. A study by De A Marcelino et al. (2023) investigated how dystonia patients solve a reward reversal task with the DBS electrode in the globus pallidus pars interna (GPi) of the basal ganglia (BG) switched on or off. In this thesis, that study is replicated with the basal ganglia model of Villagrasa et al. (2018). Since the effects of DBS are largely unexplored, four existing theories were incorporated and examined: (1) inhibition of local neurons, (2) stimulation of efferent axons, (3) stimulation of afferent axons, and (4) stimulation of passing fibres. A previous study (Baladron & Hamker, 2020) also indicates that a plastic connection (shortcut) from the cortex to the thalamus supports habitual learning in the basal ganglia. This claim was validated by switching between a fixed and a plastic shortcut. In addition, after reversal learning, a difference in the number of habitual decisions was found between the DBS variants, as was a higher number of habitual decisions with DBS switched on than without DBS. This work is thus intended to contribute to a better understanding of the influence of DBS on the BG circuits and of habitual learning in the basal ganglia.

Optimizing the neuron model parameters of a spiking network with neuroevolution

Tom Maier

Thu, 22. 2. 2024, 367 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

This Bachelor's thesis is concerned with improving the performance of converted spiking neural networks (SNNs) in order to exploit their particular properties while achieving high accuracy. To this end, certain parameters of the SNNs are optimized in an evolutionary process using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), after conversion from a multi-layer perceptron (MLP) and normalization with the method of Diehl et al. The success of the evolution was measured by classification accuracy on the Fashion-MNIST dataset, which was also used to obtain the values for the source network and the purely converted SNN. To examine various effects of the data on the evolution and on SNN performance, different configurations of dataset size and of the included elements were used in individual runs. Evolving the parameters improved classification to a level comparable to that of the original MLP model. This amounts to a strong improvement in performance over the SNN that was only converted and normalized.

Time Series Forecasting of Cashflow Data using Deep Learning

Preksha Gampa

Mon, 22. 1. 2024, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Time series forecasting is a pivotal technique in the analysis of business operations and resource availability. It is widely used across several industries to predict future events, thereby assisting in crucial, data-driven decision-making processes. At Mercedes-Benz Mobility AG, an application called 'myCashflow' is used to provide daily forecasts of cash positioning for various Mercedes-Benz entities across the Africa and Asia Pacific (AAP) region. This application currently relies on machine learning and traditional statistical models for the analysis and forecasting of the cashflow data. These models enable the application to capture the inherent patterns present in the time series data and generate high-precision forecasts. However, with the advancement of deep learning techniques, there is potential for enhancing the forecasting capability of the myCashflow application, thereby assisting in better decision-making. This research focuses on exploring deep learning methodologies for forecasting cashflow data while addressing the challenges of high data fluctuations, short time series, and potential outliers. Three advanced deep learning methodologies are explored, namely Convolutional Neural Networks (CNN), Ensemble Empirical Mode Decomposition combined with CNN (EEMD-CNN), and transfer learning with CNN. The employed deep learning methodologies are comprehensively evaluated and compared with established machine learning and statistical models to identify the most effective and efficient approach for enhancing the predictive accuracy of cashflow forecasts.

Explainability of machine learning models and application to the 2. Fußballbundesliga

Simon Schulze

Thu, 18. 1. 2024, 1/309 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Methods of eXplainable Artificial Intelligence (XAI) are indispensable for making complex and opaque algorithms from artificial intelligence (AI) and machine learning (ML) explainable and understandable. In addition, AI and ML algorithms are increasingly used for data analysis in football. This seminar paper examines the use of two XAI methods, taking the 2. Fußballbundesliga as an example, in order to identify the statistics that are decisive for the outcome of a match. Partial dependence plots (PDPs) and Shapley values are explained and applied to models trained on datasets from the 2. Fußballbundesliga. This is intended to make the models' predictions more explainable and comprehensible. PDPs can be used to analyse the relationship between particular statistics and the expected match outcome, while Shapley values provide insight into the individual contribution of features to the final result. The findings are intended to give a deeper insight into the key factors that substantially influence the outcome of a match. This research contributes to the growing discipline of XAI, including in football data analysis, and illustrates the potential to systematically decode complex sporting events.
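
As a minimal sketch of how Shapley values attribute a prediction to individual features, the following example computes exact Shapley values for a toy model. The feature names and the value function `toy_value` are hypothetical and are not taken from the seminar paper; practical XAI tools usually approximate these values, because the exact computation enumerates every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a value function defined on feature subsets."""
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Weight of all coalitions of this size that exclude feature f.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        result[f] = total
    return result


def toy_value(subset):
    """Hypothetical model output when only the given statistics are known."""
    score = 0.0
    if "shots" in subset:
        score += 1.0
    if "possession" in subset:
        score += 0.5
    if "shots" in subset and "possession" in subset:
        score += 0.5  # interaction effect between the two statistics
    return score


phi = shapley_values(["shots", "possession", "fouls"], toy_value)
```

In this toy setting the interaction term is split evenly between the two interacting features, the unused feature receives a value of zero, and the values sum to the model output for the full feature set.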

Open-ended optimization of recurrent neuromorphic architecture through neuroevolution

Martina Kraußer

Tue, 16. 1. 2024, 1/367 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

While deep neural networks achieve human-level performance in some tasks, their energy consumption when implemented on von Neumann architectures is orders of magnitude above that of the brain. This motivates research into neuromorphic hardware, which consumes significantly less energy when operating neural networks. Frenkel et al. (2018) presented the ODIN neuromorphic chip, a digital spiking neuromorphic processor with minimal size and energy consumption. However, its simplified computing architecture does not allow a straightforward application of gradient-based deep learning techniques. Neuroevolution, a subfield of AI, trains neural networks through evolutionary algorithms, avoiding gradient-based modification of individual weights. This can even achieve better and faster results for tasks with high uncertainty about their destination, like playing games or movement control. The Paired Open-Ended Trailblazer (POET) algorithm, introduced by Wang et al. (2019), surpasses traditional neuroevolutionary approaches, which focus only on the adaptation of agents to a fixed environment, by simulating environments that change dynamically over time. POET strategically starts with simpler tasks, building skills hierarchically and expediting problem-solving abilities. The POET algorithm is used to assess the learning capabilities of the ODIN neuromorphic chip in mastering the game of Pong. The objective is to explore whether the ODIN chip exhibits a general enhancement across various game parameters, contributing to a broader understanding of its adaptability and performance in diverse Pong gaming scenarios and therefore in the task of systematically controlling movements.

Goal-directed control over action in a hierarchical model of multiple basal ganglia loops

Manuel Rosenau

Thu, 21. 12. 2023, 1/309 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In the human brain, multiple cognitive functions are related to the prefrontal cortex, the basal ganglia, the thalamus, and their organization into cortical-basal ganglia thalamic-cortical loops. While the prefrontal cortex seems to be associated with goal-directed behaviour through cognitive control mechanisms, the basal ganglia with its dopaminergic inputs and its reward prediction error is often seen as a reinforcement learning circuit, affiliated with habitual behaviour. Goal-directed behaviour is associated with planned, purposeful behaviour, connecting actions to specific outcomes, whereas habitual behaviour involves actions that are repeated and rewarded so often that stimuli automatically trigger them. There are still uncertainties about the specific mechanisms within the cortical-basal ganglia thalamic-cortical loops that cause a particular behaviour, and whether this can be compared to model-based or model-free behaviours. Baladron and Hamker (2020) proposed a hierarchical organization of basal ganglia loops, where each loop learns using its own prediction error signal and computes an objective for the next loop. They tested their ideas with a computational model consisting of two dorsal loops. Later the model was enhanced by adding a ventral loop prior to the dorsal loops, to select a goal signal out of an existing memory. The present thesis focuses on the question of which internal structures of the cortical-basal ganglia thalamic-cortical loops could be responsible for letting the human brain establish a habit and which ones are responsible for breaking out of a habit. To this end, a long indirect pathway and a shortcut between the sensory input and the thalamus in the dorsomedial loop were implemented in the three-loop model. The different approaches were then tested and compared by performing the two-stage Markov decision task proposed by Gillan et al. (2015).

Representational alignment

Payam Atoofi

Thu, 14. 12. 2023, 1/309 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Any model or system that serves as an information processing function ultimately provides a state given a stimulus, e.g., image, text, video, etc. These states can then be measured and perhaps mapped to an embedding. Taking the output space of an information processing function (or, in the case of an embedding, the embedding space of the process) as its representational space, the extent of similarity or dissimilarity between the representational spaces of two seemingly different systems is referred to as representational alignment. There is a plethora of studies in the fields of cognitive science, neuroscience, and machine learning in which representations of systems are explored to show and/or impose an alignment, under the assumption that there should exist a certain degree of similarity or dissimilarity. Sucholutsky et al. (2023) have proposed a framework that describes representational alignment in a unified and concise manner and encompasses all the previous methods from the aforementioned disciplines. Their framework helps not only to gain a better overview and understanding of the underlying principle of such approaches and their inspirations, but also to be aware of the challenges and pitfalls, e.g., where representational alignment could hinder performance, or where representational alignment is not a valid assumption, which should therefore be detected. Thus, studies in cognitive science, e.g., behavioral studies, semantic cognition, human-machine alignment, etc., or in neuroscience, e.g., brain activity functions across species or across individuals, multi-modality across brain regions, etc., or in machine learning, e.g., knowledge distillation, interpretability, semantically meaningful representation, etc., could hopefully benefit from any interdisciplinary development of representational alignment.

Consciousness in Artificial Systems: Bridging Sensorimotor Theory and Global Workspace in In-Silico Models

Nicolas Kuske

Thu, 30. 11. 2023, 1/309 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In the aftermath of the success of attention-based transformer networks, the debate over the potential and role of consciousness in artificial systems has intensified. Prominently, the global neuronal workspace theory emerges as a front-runner in the endeavor to model consciousness in computational terms. A recent advancement in the direction of mapping the theory onto state-of-the-art machine learning tools is the model of a global latent workspace. It introduces a central latent representation around which multiple modules are constructed. Content from any one module can be translated to any other module and back with minimal loss. In this talk we walk through a thought experiment involving a minimal setup comprising one deep sensory module and one deep motor module, which illustrates the emergence of latent sensorimotor representations in the intermediate layer connecting both modules. In the human brain, law-like changes of sensory input in relation to motor output have been proposed to constitute the neuronal correlate of phenomenal conscious experience. The underlying sensorimotor theory encompasses a rich mathematical framework. Yet, the implementation of intelligent systems based on this theory has thus far been confined to proof-of-concept and basic prototype applications. Here, the natural appearance of global latent sensorimotor representations links two major neuroscientific theories of consciousness in a powerful machine learning setup. One of several remaining questions is whether this artificial system is conscious.

Investigation of fine-tuning and transfer learning with transformer architectures for age classification of texts

Sebastian Neubert

Mon, 20. 11. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

This Master's thesis investigates fine-tuning and transfer learning for the age classification of texts using BERT-based models. Four central research questions are answered through different experiments. First, the influence of chunk size when splitting the dataset on model performance was examined; larger chunks showed better overall performance. The investigation of classification layers showed that LSTM-based layers lead to better results than linear layers for this use case. Freezing BERT layers during training showed the importance of the classification layer's capacity for optimal accuracy. Finally, transfer learning with models previously trained on different datasets yielded no performance gain on the Audory dataset, but did provide a deeper understanding of the prediction characteristics. Despite some limitations, this work sheds light on the depth and subtleties of age classification in texts and provides valuable insights and foundations for future research in this area.

Replication of Morris et al. 2016 and resolution proposal for prominent seemingly conflicting in vivo measurements of LIP gain-field activity

Nikolai Stocks

Thu, 16. 11. 2023, 1/309

In 2012, Morris et al. produced a set of direct in vivo measurements of areas LIP, VIP, and MST. When viewed at the population level, these align neatly with observed patterns of perisaccadic mislocalization, as well as with the broad outline of the model of perisaccadic mislocalization first developed by Ziesche and Hamker. In 2016, they then trained an intentionally simple set of integrators using linear regression on their recorded 2012 dataset, showing that the dataset allows easy and accurate prediction of the future eye position up to 100 ms before saccade onset, and accurate recollection of past eye positions up to 200 ms after saccade onset. Meanwhile, Xu et al. (2012) also recorded LIP activity in vivo, providing a set of measurements that contradict the recordings of Morris et al. and call into question the LIP gain-field as a viable explanation for perisaccadic visual stability. I will present one possible resolution of the apparent contradiction between the datasets of Xu et al. (2012) and Morris et al. (2012) while replicating the same kind of prediction and recollection as shown in Morris et al. (2016) using data generated by an iteration of Ziesche and Hamker's model.

Dataset Refinement for Object Detection by leveraging the Training Dynamics of Deep Neural Networks

Sujay Jadhav

Mon, 30. 10. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Object detection involves locating the instances of semantic objects in an image and classifying them correctly. Incorrect class labels, incorrect bounding boxes, low resolution, unusual viewpoints, and heavy occlusion harm detection performance. In large object detection datasets, filtering out instances with these errors is tedious and often requires manual inspection. Our goal is to identify and prune such erroneous and uninformative instances from object detection datasets. A promising data-centric approach to diagnosing a labeled dataset at the instance level is to monitor the dynamics of the model trained on it. This thesis adapts dataset maps (Swayamdipta et al., 2020) from text-classification tasks to object detection tasks. By diagnosing the dataset maps for the object detection dataset, this thesis provides a proof of concept for object-level pruning of large detection datasets. The position of an object on the dataset map indicates its difficulty for the model in three categories: easy, ambiguous, or hard to learn. To further filter out the object instances that do not contribute to learning, other training dynamics are explored for their potential for object pruning: correctness (how often the object is correctly classified) and forgetfulness (how often the model misclassifies the object after correctly classifying it). The thesis shows that collectively pruning some easy and most of the hard-to-learn objects can improve the detection performance of the model with less data.
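
The training dynamics named above can be sketched for a single object as follows. This is a minimal illustration, not the thesis implementation: confidence and variability are taken as the mean and standard deviation of the true-class probability across epochs, which is how dataset maps are usually described, and the region thresholds in `region` are purely hypothetical.

```python
from statistics import mean, pstdev


def training_dynamics(true_class_probs, correct_flags):
    """Summarize the per-epoch records of one object.

    true_class_probs: probability assigned to the true class after each epoch
    correct_flags:    whether the object was classified correctly in each epoch
    """
    # A forgetting event: correct in one epoch, misclassified in the next.
    forgetting_events = sum(
        1 for prev, cur in zip(correct_flags, correct_flags[1:]) if prev and not cur
    )
    return {
        "confidence": mean(true_class_probs),
        "variability": pstdev(true_class_probs),
        "correctness": sum(correct_flags) / len(correct_flags),
        "forgetfulness": forgetting_events,
    }


def region(dynamics, conf_threshold=0.5, var_threshold=0.2):
    """Assign a dataset-map region; both thresholds are assumed values."""
    if dynamics["variability"] >= var_threshold:
        return "ambiguous"
    if dynamics["confidence"] >= conf_threshold:
        return "easy"
    return "hard"


stable = training_dynamics([0.9, 0.95, 0.97, 0.99], [True, True, True, True])
unstable = training_dynamics([0.1, 0.9, 0.2, 0.8], [False, True, False, True])
```

With these assumed thresholds, the first object falls into the easy region and the second into the ambiguous region; a pruning rule would then be defined on top of such per-object statistics.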

Predicting Remaining Useful Lifetime of Solenoid Valves Using Machine Learning Techniques

Syed Anas Ali

Mon, 25. 9. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The FESTO VYKA valve is used in many industries and plays a pivotal role in industrial control, automation and pneumatics. These valves degrade with time and usage. This thesis uses artificial intelligence and deep learning to find methods for predicting the Remaining Useful Life (RUL), expressed as the number of switches remaining until failure. RUL prediction is extremely useful in the predictive maintenance domain. It is a crucial concern in prognostic decision-making and saves costs and time. RUL is calculated using only measurements from current and voltage sensors, and PCA-based cascaded LSTMs and convolutional LSTM-based approaches are compared. The data for this thesis was gathered from endurance tests explicitly designed to bring these valves to failure. Convolution kernels integrated with cascaded LSTMs yielded better RUL estimations than the PCA-based deep learning approaches. Two-dimensional convolutions were able to capture the features of ageing valves, and they gave the best results when integrated with cascaded LSTMs. Separating the failure types improved the RUL estimations. However, current RUL estimations could be improved by gathering more valve data and generalising even more.

Classification of manual ACC cancellations in vehicle testing using deep learning

Victoria Nöther

Wed, 13. 9. 2023, 1/336 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Driver assistance systems are increasingly used in road traffic. Beforehand, however, they must be tested comprehensively. Many of these tests have to take place in the vehicle, since spontaneous human driving decisions are difficult to simulate. Vehicle tests, however, are very costly and time-consuming. This thesis investigates how human driving decisions can be classified automatically. It concentrates solely on adaptive cruise control (ACC) and its manual cancellations by the driver. Data from vehicle tests are used to train deep learning models. Five deep learning models are compared with one another: a one-dimensional convolutional neural network, long short-term memory, gated recurrent unit, and their combinations. The convolutional neural network model achieves an F1 score of 93.88%. The other models achieve F1 scores of 94.57% but cannot correctly predict ACC cancellations. The convolutional neural network model is the only one that succeeds in correctly classifying the cancellations, as can be seen from its confusion matrix. This work shows that deep learning methods can be successful in classifying human driving decisions.
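
The following sketch illustrates, under stated assumptions, why a confusion matrix can reveal a failure that an aggregate F1 score hides. The abstract does not say how the F1 scores were averaged or how the classes were distributed; the support-weighted averaging and the confusion matrix below are hypothetical and only show one way in which a high aggregate score can coexist with a class that is never predicted.

```python
def per_class_f1(confusion):
    """Per-class F1 from a square confusion matrix (rows: true, columns: predicted)."""
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[r][k] for r in range(n)) - tp
        fn = sum(confusion[k]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def weighted_f1(confusion):
    """Support-weighted average of the per-class F1 scores."""
    supports = [sum(row) for row in confusion]
    total = sum(supports)
    return sum(f * s for f, s in zip(per_class_f1(confusion), supports)) / total


# Hypothetical data: class 0 = "no cancellation", class 1 = "cancellation".
# The model never predicts a cancellation.
confusion = [[95, 0],
             [5, 0]]

scores = per_class_f1(confusion)
overall = weighted_f1(confusion)
```

Here the weighted score stays above 0.9 although the F1 score of the cancellation class is zero, which is why inspecting per-class results is a useful complement to a single aggregate number.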

Parkinsonian beta oscillations in the pallido-striatal loop influencing action cancellation

Iliana Koulouri

Wed, 16. 8. 2023, 1/273 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Parkinson's disease (PD) is a neurodegenerative disorder caused by the loss of dopamine neurons. Under dopamine-depleted conditions such as PD, synchronized oscillatory activity in the beta frequency band occurs in the basal ganglia (BG). These are in turn associated with the control of motor actions. The source of these oscillations is unknown to date. Besides the theories that beta oscillations are generated in the STN-GPe loop, the striatum, or the cortex, the pallido-striatal loop, which consists of GPe neurons, fast-spiking striatal interneurons (FSIs), and STRD2 neurons, is a further possible source of beta oscillations in PD. Corbit et al. (2016) showed that a model of the pallido-striatal loop is able to generate and amplify beta oscillations under dopamine-depleted conditions. This approach, together with a biologically plausible FSI neuron model, was transferred into the existing BG model (Gönner et al., 2020) to test how beta oscillations can spread through the entire BG network and what influence they have on performance in a stop-signal task. As the work currently stands, it appears that theta rather than beta oscillations propagate through the network.

Computing the Brain-Score for biologically motivated neural networks

Elina König

Tue, 8. 8. 2023, 1/273

The Brain-Score is a metric that quantifies the performance of neural networks in comparison with the human brain. The online platform Brain-Score.org specializes in this evaluation and in comparing networks. The researchers make it possible to upload models, and the performance of submissions is evaluated on various benchmark tasks. These comprise typical visual stimuli to which the human brain can respond. The focus of this work is on the ability of the ANNarchy model to predict neural responses to visual stimuli that have been observed in the early visual areas. In terms of prediction accuracy, the network is compared with previously submitted models such as AlexNet, VGG16, and ResNet. The study provides insights into the limits and challenges of the modelling and into possible strategies for improving performance.

Development of a graph-based recommender system based on customer-product relationships and using a big data platform

Haitham Almoughraby

Mon, 24. 7. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The bipartite graph is a common structure for establishing a relationship between two populations and has recently been used for personalized recommendations. The one-mode projection method is suited to projecting one layer out of the bipartite graph, yielding a graph that consists of only one group of nodes. Its edges arise between nodes when certain conditions are met, e.g. when two nodes in the original graph are connected to at least one common node from the other layer; such nodes are referred to as similar neighbours. Since much information is lost through this projection, a weighting method is necessary in order to retain as much of the original information as possible. In this Master's thesis, different algorithms that produce weighted one-mode graphs during projection were compared. Based on these graphs, a recommender system that focuses solely on the relationships between products was developed, and its quality was measured. Since the structure of the graph plays a central role, the influence of the weight on the graph is observed and the community detection of the graph is examined.
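
A weighted one-mode projection can be sketched as follows. This is only an illustration under assumptions: the customer and product identifiers are hypothetical, and counting shared customers is just the simplest possible weighting, not necessarily one of the algorithms compared in the thesis.

```python
from collections import defaultdict
from itertools import combinations


def project_products(purchases):
    """Project a customer-product bipartite graph onto the product layer.

    purchases: dict mapping each customer to the set of products bought.
    Returns a dict mapping product pairs (sorted tuples) to edge weights,
    where the weight is the number of customers the two products share.
    """
    weights = defaultdict(int)
    for products in purchases.values():
        for a, b in combinations(sorted(products), 2):
            weights[(a, b)] += 1
    return dict(weights)


def recommend(product, weights, k=3):
    """Return up to k products most strongly connected to the given product."""
    scores = {}
    for (a, b), w in weights.items():
        if a == product:
            scores[b] = w
        elif b == product:
            scores[a] = w
    # Highest weight first; ties are broken alphabetically.
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]


purchases = {"c1": {"p1", "p2"}, "c2": {"p1", "p2", "p3"}, "c3": {"p3"}}
product_graph = project_products(purchases)
suggestions = recommend("p1", product_graph)
```

Two products become neighbours as soon as one customer bought both, and the weight keeps part of the information that an unweighted projection would discard, namely how many customers they have in common.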

Generalized Event Discovery with end-to-end learning for Autonomous Driving

Chaithrashree Moganna Gowda

Fri, 14. 7. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Safety conformity is one of the major requirements for Autonomous Driving, and even with continuous improvement in AI technologies this requirement must be met before deploying the vehicle on roads in an open context. To improve safety assessment and react in a timely manner to external events, the vehicle must know what kind of situation it is currently in. Hence it is important to analyze the series of events happening in the surroundings of the driving environment. An existing approach to this problem of Time-to-Event prediction, i.e., predicting events before they occur, is Future Event Prediction in video data (Neumann et al., 2019). That paper addressed two key challenges: whether a certain type of event is going to happen, such as a car stopping, and if so, when the event is going to happen. In the past, this state-of-the-art work on Event Prediction was extended to a set of event classes such as Sudden Brake, Turn, and Anomaly. In this work, the data was prepared with a Weakly Supervised approach before the model development, using synchronized sensor measurement data. For tasks like these, a huge amount of real-world driving data is available, collected with various sensors in an autonomous vehicle. To utilize this collected data effectively, it is necessary to interpret the elements in the data in order to categorize or label them. Labelling this data for Autonomous Driving is a tedious process. The aim is to utilize this unlabelled data to improve the artificial intelligence algorithms developed for Autonomous Driving. Hence it is necessary to move from a Weakly Supervised approach to Semi-Supervised Learning to analyze the series of events. A new approach is Generalized Category Discovery (GCD) (Vaze et al., 2022), wherein, given a labelled and an unlabelled set of images, the task is to categorize all images in the unlabelled set. The unlabelled images may come from labelled classes or from novel ones.
This existing state-of-the-art approach of Generalized Category Discovery is adapted as a baseline for Generalized Event Discovery (GED), i.e., to categorize unlabelled video sequences. The main contribution of this work is to discover various known and unknown event classes in an unlabelled data set. A 3D ConvNet architecture used in Future Event Prediction has been used as a backbone in this experiment to extract the feature vector for the events. An Event Estimation model designed to categorize different event classes has also been used as a backbone to extract the feature vector for the events, and a comparison between these two models is made. Clustering has been performed directly in the feature space, using an effective Semi-Supervised K-means++ Clustering method introduced in GCD to cluster the unlabelled data into known and unknown classes automatically. The proposed Generalized Event Discovery (GED) method has been evaluated on the standard BDD100k benchmark dataset and on the real-time in-house ADAS data.

Surround delay leads to spatiotemporal behavior in the early visual pathway - part 2

René Larisch

Tue, 11. 7. 2023, 1/375

The spatial receptive fields along the early visual pathway - from the retinal ganglion cells (RGC) over the lateral geniculate nucleus (LGN) to the primary visual cortex (V1) - are not fixed but have been shown to change their structure within a short time interval. This temporal behavior adds the time dimension to the 2D spatial receptive field and leads to spatiotemporal receptive fields (STRF), suggesting processing for time-relevant visual stimuli, like the directional movement of an object. Different studies on cats and mice have shown that STRFs can differ in their appearance from a more separable to a more inseparable characteristic. Although a vast number of models exist to simulate STRFs and investigate their functionality (especially for direction selectivity), most of these models use a non-linear function to implement the change of the spatial receptive field over time, without explaining how STRFs emerge on a more neuronal level. Additionally, recent models indicate a more important role for intercortical inhibition for direction selectivity. In this talk, I give a short overview of different V1 simple cell models of STRFs and direction selectivity. Further, I will show how a delay in the surrounding field of the RGCs, an assumption reported previously, leads to STRFs in the LGN and the V1/L4 simple cells and how the characteristics of STRFs are influenced by so-called lagged LGN cells. At the end of the talk, I will discuss how direction selectivity is influenced by lagged LGN cells and intercortical inhibition.

Surround delay leads to spatiotemporal behavior in the early visual pathway

René Larisch

Tue, 6. 6. 2023, 1/375

The spatial receptive fields along the early visual pathway - from the retinal ganglion cells (RGC) over the lateral geniculate nucleus (LGN) to the primary visual cortex (V1) - are not fixed but have been shown to change their structure within a short time interval. This temporal behavior adds the time dimension to the 2D spatial receptive field and leads to spatiotemporal receptive fields (STRF), suggesting processing for time-relevant visual stimuli, like the directional movement of an object. Different studies on cats and mice have shown that STRFs can differ in their appearance from a more separable to a more inseparable characteristic. Although a vast number of models exist to simulate STRFs and investigate their functionality (especially for direction selectivity), most of these models use a non-linear function to implement the change of the spatial receptive field over time, without explaining how STRFs emerge on a more neuronal level. Additionally, recent models indicate a more important role for intercortical inhibition for direction selectivity. In this talk, I give a short overview of different V1 simple cell models of STRFs and direction selectivity. Further, I will show how a delay in the surrounding field of the RGCs, an assumption reported previously, leads to STRFs in the LGN and the V1/L4 simple cells and how the characteristics of STRFs are influenced by so-called lagged LGN cells. At the end of the talk, I will discuss how direction selectivity is influenced by lagged LGN cells and intercortical inhibition.

Demonstration of offline reinforcement learning on an embedded system

Robin Gerstmann

Fri, 2. 6. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

This work demonstrates the challenge of offline reinforcement learning on an embedded system. An offline reinforcement learning algorithm is trained in a simulation and later transferred to the embedded platform. The task is for an AI-controlled ball (driven via Bluetooth) to hit another ball. The ball can act only within a bounded field. To make the task more challenging for the offline reinforcement learning algorithm, two obstacles are introduced. To enable execution on the existing real demonstrator, a YOLO version 7 model must be trained, which provides the information for the offline reinforcement learning algorithm. The aim is to demonstrate the feasibility and the combination of different AI algorithms on an embedded system.

Generalization Ability of Crop State Classification Models in Harvesting Scenarios

Preethi Venugopal

Fri, 26. 5. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Neural networks have achieved remarkable progress in various fields of application. In real-world scenarios, it is of the greatest importance to estimate the capability of deep learning models and algorithms to generalize and to predict unseen data accurately. Good generalization performance is the fundamental goal of any algorithm. Developing a generalized model is critical because it has to make accurate predictions for data that is only loosely related to the training data. In this thesis work, the classifier's ability to generalize is measured for the FLP (Forward Looking Perception) system developed by John Deere for combine harvesters. A ResNet architecture is chosen to determine and classify the states of the crop (down crop and standing crop) ahead of the machine. The thesis studies how well the classifiers for down crop detection generalize across various crops and regions. To determine the generalization ability of the model, dataset-based investigation and hyperparameter tuning methods are chosen. Various experiments are carried out on crops such as wheat and barley from different regions. The collected images are not fed directly into the neural network; instead, they are broken down into patches and labeled. Labeled data is taken from the existing database. These patches are used for further training. Several basic data sets are prepared, and in each of them one aspect of variation is fixed. For example, the crop type is fixed to wheat but the region varies. This creates data from each region. Combining two to three different regions also adds more variety to the training. A series of training and testing runs on various data sets is performed, and the generalization ability is evaluated using a confusion matrix and mainly by comparing the F-score for down crop.
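
As an illustration of the evaluation step mentioned in this abstract, the following minimal sketch computes the F-score for the down crop class from a two-class confusion matrix. The matrix layout (rows are true classes, columns are predicted classes), the counts, and the function name `f_score` are assumptions made for the example and are not taken from the thesis.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical confusion matrix: rows = true class, columns = predicted class,
# class order = ["down crop", "standing crop"].
confusion = [
    [40, 10],  # true down crop: 40 predicted down, 10 predicted standing
    [5, 45],   # true standing crop: 5 predicted down, 45 predicted standing
]

tp = confusion[0][0]  # down crop correctly detected
fn = confusion[0][1]  # down crop missed
fp = confusion[1][0]  # standing crop mistaken for down crop
f1_down = f_score(tp, fp, fn)
```

Comparing this per-class score across training and test regions, rather than overall accuracy, keeps the focus on the class of interest even when down crop patches are rare.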

Perturbation Learning in the Context of Reservoir Computing

Max Werler

Tue, 16. 5. 2023, 1/375

Over the last decades, a variety of different training approaches for recurrent neural networks have been proposed along with slightly differing network architectures. One category of learning approaches is known as perturbation learning. Instead of analytically calculating the gradient of a network and backpropagating an error or reward signal, perturbation learning tries to approximate the gradient by adding random noise to the nodes and/or weights of the network and observing the change in performance. Depending on that performance change, some fraction of the noise is committed permanently to the network. This presentation briefly introduces reservoir computing and takes a deeper look at recent studies regarding perturbation-based learning approaches. Related results as well as our own experimental findings will also be presented. In the latter, Miconi's variant of node perturbation was compared with the weight perturbation strategy of Zuege et al. by challenging correspondingly implemented networks to solve the Delayed-Nonmatch-To-Sample task and to control a simulated robot arm. Our findings align with the results of Zuege et al. and demonstrate the ability of weight perturbation to keep up with and even outperform node perturbation.
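
The basic idea of weight perturbation can be sketched on a toy problem. The example below is not one of the networks or tasks from the talk: it assumes a two-weight linear model with a mean squared error loss, and the step size, noise scale, and function names are illustrative choices.

```python
import random


def loss(w, data):
    """Mean squared error of the linear model y = w . x on `data`."""
    return sum(
        (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2 for x, y in data
    ) / len(data)


def weight_perturbation_step(w, data, rng, sigma=0.01, lr=0.05):
    """One weight perturbation update.

    Gaussian noise is added to all weights, the resulting change in the loss
    is measured, and the noise direction is committed to the weights in
    proportion to how much it improved (or worsened) the performance.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in w]
    perturbed = [wi + sigma * ni for wi, ni in zip(w, noise)]
    # Finite-difference estimate of the directional derivative along `noise`.
    delta = (loss(perturbed, data) - loss(w, data)) / sigma
    return [wi - lr * delta * ni for wi, ni in zip(w, noise)]


# Toy regression data generated by the target weights (2, -1).
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0), ((1.0, -1.0), 3.0)]

rng = random.Random(0)
w = [0.0, 0.0]
initial_loss = loss(w, data)
for _ in range(2000):
    w = weight_perturbation_step(w, data, rng)
final_loss = loss(w, data)
```

On average the update follows the negative gradient, but each individual step is noisy, which is why perturbation methods typically need many more iterations than backpropagation. Node perturbation differs in that the noise is injected into the unit activations rather than into the weights.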

Few Shot Learning for Material Classification at Drilling Machines

Rohina Roma Dutt

Thu, 4. 5. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In this study, we investigate the application of few-shot learning (FSL) techniques for material classification in drilling machines, utilizing time series data. The lack of labeled data and difficulties in data preparation pose significant challenges, which were addressed by employing basic and advanced preprocessing techniques, along with a specialized data generator function. The research question focuses on identifying the optimal FSL approach for material classification in drilling machines. To this end, a range of models, including baseline models and Prototypical Networks, were evaluated. The experiments were conducted in an industrial environment for drilling operations, and the results showed that the hybrid model, a combination of 1D CNN and LSTM layers, outperformed other models. The Proto-Hybrid model was compared to a baseline model based on the mean average precision (mAP) metric, which demonstrated the superiority of the proposed model. This work contributes to the field of FSL by highlighting the potential benefits of incorporating hybrid model structures in future research for material classification tasks. This research underscores the importance of exploring and comparing different FSL approaches to determine the best one for a given dataset. Further research should address limitations and explore alternative FSL methods to improve model performance in material classification for drilling machines and other industrial applications. The Proto-Hybrid model provides a valuable contribution to the field of material classification, showing promising results and offering the potential for improved accuracy with more shots.

Evaluation and Implementation of 3D Container Problem Algorithms Based on Artificial Intelligence

Shubhangi Bhor

Fri, 17. 3. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

3D container packing optimization is one of the main problems in the logistics industry. Tons of packages or units need to be packed into containers to provide a smooth process for the production industry. Optimized container packing also helps the environment indirectly, because fewer containers on the road mean lower CO2 emissions. Different bin packing algorithms help to solve this NP-hard problem, such as branch and bound, branch and cut, greedy algorithms, heuristic algorithms, and machine learning models. This master thesis tackles a few of these problems, such as reducing the number of containers as well as the costs, by implementing and evaluating different artificial intelligence techniques.

Application Of Virtualization Concepts in Linux Distributed Industrial Controllers For Robotics

Suryakiran Suravarapu

Thu, 9. 3. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The emergence of technologies such as IoT and Cloud has revolutionized the way data is stored, distributed, and processed. The development of applications has shifted from core legacy applications to portable and distributed microservices, bringing virtualization and containerization concepts to the forefront for building scalable mobile applications. Despite this, industrial robotics has not fully utilized these concepts due to constraints such as real-time requirements and application robustness, as such applications are limited to the control plant. Our research proposes an architecture that reduces deployment time and hardware resource usage by creating robust and portable ROS application packages as microservices through containerization and virtualization concepts. Additionally, we propose an orchestration tool to monitor and manage these containers while ensuring real-time constraints are met by running the ROS application packages on a real-time kernel. Furthermore, our research addresses seamless integration and deployment (CI/CD) in an offline (air-gapped) environment.

A comparison of machine learning and computer vision approaches for row crop segmentation

Kundan Kumar Agnuru

Fri, 24. 2. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Agriculture is one of the essential practices for the survival and progress of humanity. Applying modern technologies in agricultural practices enables the analysis of plant characteristics and increases productivity. Row crop segmentation is one of the approaches in precision agriculture that helps in analyzing and evaluating features of plants and in segmenting them into regions based on characteristic similarities. Row crop segmentation helps to guide semi-automated or autonomous vehicles between crop rows during spraying and harvesting. The purpose of this study is to evaluate different approaches based on unsupervised and computer vision techniques for row crop segmentation, using multispectral images as the data source. The multispectral images are processed with filters to remove noise, and the filtering results are assessed subjectively to select a suitable filter. The region of interest is extracted and vegetation indices are calculated from the denoised multispectral images. The best of the calculated vegetation indices are stacked together and used as input for the unsupervised and computer vision segmentation approaches. The segmentation maps are evaluated by applying a line detection algorithm and comparing the results using the Intersection over Union metric.
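
The Intersection over Union metric named at the end of this abstract can be sketched for binary masks as follows. The masks are toy data and the function name `intersection_over_union` is hypothetical; how the study derives the compared masks from the detected crop-row lines is not specified in the abstract.

```python
def intersection_over_union(pred, target):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks are treated as a perfect match by convention.
    return inter / union if union else 1.0


# Toy masks (hypothetical): 1 marks pixels assigned to a crop row.
predicted = [
    [1, 1, 0],
    [0, 1, 0],
]
reference = [
    [1, 0, 0],
    [0, 1, 1],
]
iou = intersection_over_union(predicted, reference)
```

An IoU of 1 means the two masks coincide exactly, and 0 means they do not overlap at all.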

Modeling perisaccadic mislocalization taking visual attention into account

Nikolai Stocks

Wed, 22. 2. 2023, Room 309 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Human visual perception is not a passive process in which stimuli are measured by sensory organs so that the sum of these measurements produces our experienced environment. Instead, it is an interplay of active and passive processes. The responses of subjects in certain experiments in turn allow conclusions about the internal dynamics and properties of these processes. Of particular interest here are experiments that investigate the systematics of mislocalizations. For this work, the model of Ziesche and Hamker (2011) was extended by a spatial attention signal coupled to the saccade, in an attempt to replicate the influence of stimulus contrast on perisaccadic mislocalization observed by Georg et al. (2008). Furthermore, the model was extended by a newly structured two-stage process for the integration and evaluation of sensory information, which had already proven itself in other, similar model iterations. The results show that this new source of visual attention was able to replicate the kind of effect described by Georg et al. (2008), but is not sufficient on its own to fully explain the behavior observed in humans.

Improving industrial object recognition through fusion of physical properties

Maulik Jagtap

Thu, 16. 2. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Classification of industrial parts is a very challenging and time-consuming task. Because of this problem, every year seven percent of industrial parts that could be re-manufactured end up as waste. The advent of machine learning makes it possible to classify industrial parts. However, when only an image is used to train the model, separating objects into their respective classes is difficult because all objects look similar. As every object has physical properties such as weight and height, this study aims to determine whether fusing an object's physical properties with the image as an extra feature makes a model more robust and improves classification accuracy. A review of the literature on the fusion of models suggests that multi-modal fusion can help achieve this goal. Different modalities can be merged to create features. Types of fusion include early fusion, late fusion, and hybrid fusion. The image provides some individual features of the object, while the physical properties are numerical, so various encoding techniques are used, such as standard deviation, one-hot encoding, and sin-cos encoding, to fuse the image and physical properties at classification time. Combining the image with the object's physical properties improves model accuracy, and sin-cos encoding gives better results than the other encoding methods.
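
To make the fusion idea more concrete, here is a minimal sketch of how numerical properties could be turned into sin-cos features and concatenated with an image feature vector. The abstract does not specify the exact encoding, so this sketch assumes a multi-frequency sinusoidal encoding in the style of transformer positional encodings; the function names, the number of frequencies, the base, and the example values are all hypothetical.

```python
import math


def sin_cos_encode(value, num_freqs=4, base=10000.0):
    """Encode a scalar as sin/cos pairs at geometrically spaced frequencies.

    This is an assumed variant of sin-cos encoding, not necessarily the one
    used in the study.
    """
    features = []
    for k in range(num_freqs):
        angle = value / (base ** (k / num_freqs))
        features.extend([math.sin(angle), math.cos(angle)])
    return features


def fuse(image_features, weight_g, height_mm):
    """Concatenate image features with the encoded physical properties."""
    return list(image_features) + sin_cos_encode(weight_g) + sin_cos_encode(height_mm)


# Hypothetical example: a 3-dimensional image embedding plus weight and height.
fused = fuse([0.1, 0.2, 0.3], weight_g=250.0, height_mm=40.0)
```

Every encoded value lies in the range from -1 to 1, so the physical properties arrive at the classifier on a scale comparable to normalized image features, regardless of their original units.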

Central Pattern Generators in robot control

Joris Liebermann

Thu, 9. 2. 2023, Room 1/367a and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Central Pattern Generators (CPGs) are neural circuits that play a crucial role in the coordination of movement in vertebrates. They allow for the generation of rhythmic and non-rhythmic movement patterns without the need for input from higher cortical areas and provide a mechanism for adaptive and recovery behaviors. The use of CPGs in robot control offers a biologically inspired approach to low-level control with the potential to generate a wide range of movement patterns. By extending mathematical models of CPGs, researchers have developed a Multi-layered Multi-pattern CPG (MLMP-CPG) architecture that can generate rhythmic and non-rhythmic patterns with only a few parameters. This MLMP-CPG model has shown promising results as a low-level controller for robots. It provides a flexible and adaptive solution to the problem of movement coordination in robotics which is demonstrated by examples of a humanoid robot in the context of walking and drawing.

A Neural-Gas inspired Spiking Neural Network

Lucas Schwarz

Thu, 9. 2. 2023, Room 1/367a and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Machine Learning models based on Teuvo Kohonen's original work on Self-Organizing Maps and Learning Vector Quantizers remain a credible alternative to commonly used Neural Networks in cases where interpretability, decision stability, and explainability are desired and constrained hardware is being used. In recent years, there have been investigations into utilizing alternative computational paradigms and hardware like Quantum Computing and Neuromorphic processors, enabling the efficient use of Spiking Neural Networks (SNNs). To shed additional light on the capabilities of the principles which guide SNNs, an adapted version of the unsupervised 'Neural Gas' algorithm has been developed, which is mainly used for lossy data compression as well as clustering due to its favourable responses to differences in data density. Using Spiking Neurons and Spike Timing Dependent Plasticity (STDP) requires a coding for data, which is usually available as float-valued vectors. The presented coding, based on previous work from Bohte et al., transforms those vectors into spike delay times using spatially distributed fixed radial basis functions and is used to adapt the Neural Gas cost function, which generates an explicit learning rule, implicitly into the architecture of the developed model. First usage results and observations regarding stability and parallel developments are also considered in this presentation.

Self-Supervised Pretraining for Robotic Bin Picking by Using Image Sequences

Ralf Brosch

Wed, 1. 2. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Robotic bin picking in warehousing has the advantage of reducing laborious and monotonous work. Therefore, the object detection systems of the bin picking robots have to be reliable and accurate. However, data often has to be labeled manually to train instance segmentation models and achieve high accuracy for object detection. To overcome the costly and time-consuming process of labelling, in this thesis three different Self-Supervised Learning (SSL) methods are presented, which make use of automatically generated labels. The labels are created according to the specific requirement of the pretext task. The first method uses warehouse data to classify images between different articles. The second method generates an instance label for every successful pick. The third method uses the temporal aspect of the bin picking data to classify the ordering of images. All three proposed pretext tasks take advantage of the bin picking procedure itself to automatically generate the required labels. Each pretext task pretrains the ResNet-50 backbone of a Mask R-CNN instance segmentation model, to improve the instance segmentation in the downstream task. The goal of this procedure is to learn features in the backbone which are useful for the downstream task. The results show that the first and third methods are not useful for pretraining an instance segmentation backbone. However, the second method manages to improve the backbone and increases performance by 7% for articles without manually labeled data. On the other hand, it also decreases the performance by around 2.5% for articles where manually labeled data is available. The findings in this thesis show that image sequences can successfully be used to generate labels for SSL models and that the pretraining of a ResNet-50 backbone with a proper pretext task can lead to a performance improvement in the downstream task of instance segmentation.

ECG-based Heartbeat Classification in the Neurosimulator ANNarchy

Abdul Kampli

Thu, 19. 1. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Electrocardiogram (ECG) measurements, which are frequently used to detect cardiac disorders due to their non-invasive nature, can be used to monitor heart function. Cardiologists with the right training can identify irregularities by visually reviewing recordings of the ECG signals. Arrhythmias, however, can be missed in routine check recordings because they happen sporadically, especially in the early phases. Spiking Neural Networks (SNNs) have dynamic characteristics, which makes them well suited to processing dynamic signals. The two-stage SNN architecture consists of a recurrent network, or reservoir, of spiking neurons whose output is classified by a cluster of integrate-and-fire (IF) neurons that have been trained in a supervised manner to differentiate between two types of cardiac rhythms. To activate the recurrent SNN, we present a technique for encoding ECG signals into a stream of asynchronous digital events. We discuss the problem of supervised learning in multi-layer spiking neural networks that can encode time. In order to train multi-layer networks of deterministic integrate-and-fire neurons to execute non-linear computations on spatiotemporal spike patterns, we first design SuperSpike, a nonlinear voltage-based three-factor learning rule. On the PhysioNet Arrhythmia Database provided by the Massachusetts Institute of Technology and Beth Israel Hospital (MIT/BIH), we demonstrate an overall classification accuracy of 90-92%. The proposed system has been developed with ANNarchy (Artificial Neural Networks Architect), a neural simulator created for distributed rate-coded or spiking neural networks.

Embedded Neuromorphic Computing at the FZI Forschungszentrum Informatik

Brian Pachideh

Wed, 11. 1. 2023, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Neuromorphic engineering takes inspiration from the brain to develop novel processing methods and technologies that enable efficient machine intelligence; its key technology is Spiking Neural Networks (SNNs) for the processing of event-data streams. The field of neuromorphic engineering has recently seen a surge in research due to recent developments. One example is the emergence of ultra-low power neuromorphic hardware architectures for the acceleration of SNNs, and another is the commercialization of the first neuromorphic sensor, the event-based camera, which serves as a native source of event-data streams. At the FZI Research Center for Information Technology, we are actively exploring the potential of neuromorphic engineering through a variety of publicly funded projects; our current application domains include smart health, smart city and automotive. Our research focuses on the implementation of event-based computing and communication hardware on FPGA, as well as their integration in embedded systems. This seminar serves as an introduction to the field of embedded neuromorphic engineering. It will cover the fundamentals of SNNs, give a technical overview of SNN hardware acceleration and event cameras, provide open source tools for SNN development and finally highlight related projects at FZI.
The FZI is an independent foundation that specializes in the applied research of information and communication technologies. Its goal is to quickly transfer recent innovations into a wide range of applications. It does so with partners from industry, academia and the public sector. Located in Karlsruhe, Germany, one of its closest partners is the Karlsruhe Institute of Technology (KIT). At the FZI, Brian Pachideh works in the department Embedded Systems and Sensors Engineering (ESS).

Meta-Learning and Neuroevolution - part 2

Valentin Forch

Wed, 7. 12. 2022, Room 1/368a and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

To build neural network models, researchers rely on their intuitions, reasoning, or knowledge about what worked best for other people with similar problems. However, most often, it is not clear which exact model topology, hyperparameter set, or training schedule will give satisfactory results. This can lead to a laborious cycle of model adjustments and evaluations. Moreover, a good model may be too complex to be even considered by humans. The past decade has shown that doing away with expert knowledge and automating the process of knowledge extraction with deep learning can give tremendous results, e.g., when learning the game of Go without any human supervision. Considering the exponential growth of available computing power, it appears logical to forego the manual design of neural network models and instead use parallelizable optimization algorithms. The main classes of these meta-optimizers are meta-learning and evolutionary search. Both enable the optimization of virtually every part of a learning system, opening up new spaces for building more powerful and also biologically more plausible neural networks.

Deep Neural Networks and Implicit Representations for 3D Shape Deformation

Aida Farahani

Wed, 30. 11. 2022, Room 1/368a and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Geometric deep learning is a promising approach to bring the power of deep neural networks to 3D data. Explicit 3D representations such as meshes are not easily combined with neural networks as there is no unique mesh to represent a single geometry. Another explicit form, point clouds, as an unordered set of points sampled on the surface, can have a varying and often huge number of dimensions, which limits their use as an input to a neural network. In contrast, implicit representations such as signed distance functions (SDF) define a 3D shape as a continuous function that can be approximated by a deep neural network. The continuity of implicit representations makes the algorithm independent of the size and topology of the shapes. In this talk, I demonstrate how deep neural networks, along with implicit representations, can be used to precisely predict the deformation of a material after the application of a specific force. The model is trained using a set of custom finite element simulations to generalize to unseen forces.
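
For readers unfamiliar with implicit representations, the following minimal sketch shows an analytic signed distance function for a sphere. It only illustrates what such a function returns; in the approach described in the talk, a neural network approximates the function for far more complex shapes. The function names and the sphere itself are illustrative assumptions.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to the surface of a sphere.

    Negative inside the shape, zero on the surface, positive outside.
    """
    return math.dist(point, center) - radius


def is_inside(point, sdf=sphere_sdf):
    """A shape is the set of points where its signed distance is not positive."""
    return sdf(point) <= 0.0


# The function can be queried at any 3D location, without a mesh or point cloud.
d_center = sphere_sdf((0.0, 0.0, 0.0))
d_surface = sphere_sdf((1.0, 0.0, 0.0))
d_outside = sphere_sdf((2.0, 0.0, 0.0))
```

Because the shape is defined by a function over continuous coordinates, the representation has a fixed input size (one 3D point per query) no matter how detailed the geometry is.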

Investigation of interpretability mechanisms for heterogeneous data using machine learning

Gerald Meier

Mon, 28. 11. 2022, Room B006 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

To achieve wider acceptance of artificial intelligence methods, the powerful models developed over the last years need to improve the transparency of their decision making (Adadi & Berrada, 2020). Due to the impact which biased data can have on the prediction made by a model, criticism of the opaqueness of popular AI methods arose (Mattu, Angwin, Larson, & Lauren Kirchner, 2016). The term Explainable AI (XAI) was coined and has been a growing research field since. There are different approaches and XAI algorithms for different models and different kinds of data. The type of data most used and produced for business and commercial processes is structured data. It is formatted in tabular shape, and the data it is composed of is divided into categorical and numerical values. Unexpectedly, neural network models, which excel with unstructured data, did not perform comparably well on structured data, with data heterogeneity being a particular issue (Borisov et al., 2022). The thesis analyses how data heterogeneity and other connected properties affect the selection of XAI methods. The most relevant properties besides heterogeneity are identified as a prerequisite and a foundation for the implementation of a prototype program. This program takes in a dataset and auxiliary parameters and outputs a recommended XAI method.

Meta-Learning and Neuroevolution

Valentin Forch

Wed, 23. 11. 2022, Room 1/368a and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

To build neural network models, researchers rely on their intuitions, reasoning, or knowledge about what worked best for other people with similar problems. However, most often, it is not clear which exact model topology, hyperparameter set, or training schedule will give satisfactory results. This can lead to a laborious cycle of model adjustments and evaluations. Moreover, a good model may be too complex to be even considered by humans. The past decade has shown that doing away with expert knowledge and automating the process of knowledge extraction with deep learning can give tremendous results, e.g., when learning the game of Go without any human supervision. Considering the exponential growth of available computing power, it appears logical to forego the manual design of neural network models and instead use parallelizable optimization algorithms. The main classes of these meta-optimizers are meta-learning and evolutionary search. Both enable the optimization of virtually every part of a learning system, opening up new spaces for building more powerful and also biologically more plausible neural networks.

Detecting anomalies in system logs with a compact convolutional transformer - part 2

René Larisch

Wed, 9. 11. 2022, Room 1/368a and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Computer systems play an important role in ensuring the correct functioning of critical systems such as train stations, power stations, emergency systems, and server infrastructures. To ensure the correct functioning and safety of these computer systems, the detection of abnormal system behavior is crucial. For that purpose, monitoring log data (mirroring the recent and current system status) is very commonly used. Because log data consists mainly of words and numbers, recent work used transformer-based networks to analyze the log data and predict anomalies. Despite their success in fields such as natural language processing and computer vision, transformers have the disadvantage of a huge number of trainable parameters, which leads to long training times. In this talk, I will present how a compact convolutional transformer can be used to detect anomalies in log data on two common log datasets from the supercomputers Blue Gene/L and Spirit. Using convolutional layers reduces the number of trainable parameters and enables the processing of many consecutive log lines. Our results demonstrate that the combination of convolutional processing and self-attention improves the performance for anomaly detection in comparison to other transformer-based approaches. At the beginning of my talk, I will give a short introduction to transformer networks before the presentation of the log data analysis.

Detecting anomalies in system logs with a compact convolutional transformer

René Larisch

Wed, 2. 11. 2022, Room 1/368a and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Computer systems play an important role in ensuring the correct functioning of critical systems such as train stations, power stations, emergency systems, and server infrastructures. To ensure the correct functioning and safety of these computer systems, the detection of abnormal system behavior is crucial. For that purpose, monitoring log data (mirroring the recent and current system status) is very commonly used. Because log data consists mainly of words and numbers, recent work used transformer-based networks to analyze the log data and predict anomalies. Despite their success in fields such as natural language processing and computer vision, transformers have the disadvantage of a huge number of trainable parameters, which leads to long training times. In this talk, I will present how a compact convolutional transformer can be used to detect anomalies in log data on two common log datasets from the supercomputers Blue Gene/L and Spirit. Using convolutional layers reduces the number of trainable parameters and enables the processing of many consecutive log lines. Our results demonstrate that the combination of convolutional processing and self-attention improves the performance for anomaly detection in comparison to other transformer-based approaches. At the beginning of my talk, I will give a short introduction to transformer networks before the presentation of the log data analysis.

Deep Learning-based Fusion of Camera and Low-Resolution LiDAR data for 3D Object Detection

Jayashree Mohan

Thu, 20. 10. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

An autonomous vehicle requires an accurate perception of its surrounding environment to operate reliably. One of the sub-tasks of perception is 3D object detection, which is performed by processing the data from the sensors mounted on the autonomous vehicle. Since each sensor has its advantages and disadvantages, sensor fusion could be a viable solution, as it can improve perception by leveraging the advantages of different sensors. Hence, this master's thesis focuses on 3D object detection through the fusion of data from camera and LiDAR sensors. One of the main contributions of this thesis is the analysis of the strengths and weaknesses of the existing 3D object detection approach 'PSR-PointNets'. It is built by combining two other 3D object detection architectures, namely Frustum PointNets, which processes LiDAR data, and MonoPSR, which processes image data. The input point clouds to the Frustum PointNets sub-network are sparse because a low-resolution LiDAR sensor is used, as it is cheap and suitable for integration into commercial cars. Therefore, the image information from MonoPSR is needed to enhance the performance of Frustum PointNets. Hence, this work's other contribution is the investigation of various fusion ideas to improve PSR-PointNets. Inspired by the MVX-Net architecture, a fusion of image features with LiDAR points is implemented, in which MonoPSR's shared image feature maps are fused with the input LiDAR points of Frustum PointNets. This fusion is performed to increase the accuracy of the instance segmentation module in Frustum PointNets. Similarly, the image segmentation mask generated in the instance reconstruction module of MonoPSR is fused with the LiDAR points, and the results are analyzed. A Simple Boolean Mask method was already implemented in the previous master's thesis to translate the predicted point clouds to global coordinates. This fusion method enriches objects that have only a few measured points with dense predicted points from image information. However, since this method suffers from a few disadvantages, a neural network is built that predicts a translation vector to perform this fusion. Finally, a detailed analysis of both translation methods is performed, and the results are analyzed.

Application of sliced ELLPACK format within the neural simulator ANNarchy

Qi Tang

Wed, 19. 10. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

This thesis addresses the performance of sparse matrix-vector multiplication (SpMV) for rate-coded networks in the neural simulator ANNarchy from the perspective of the sparse matrix storage format. We mainly study the sliced ELLPACK (SELL) format and its variants. We evaluate their performance in prototype tests and conduct a more detailed investigation of the thread configuration scheme of the SELL format. We integrated the SELL format with the best-performing thread configuration into ANNarchy. In the integration, we propose adding a mask array to the SELL format to maintain its high performance while meeting ANNarchy's requirement to distinguish zero elements from non-zero elements. Our test results show that the SELL format brings a 10% to 60% performance improvement on the GPU in ANNarchy compared to the SpMV performance of the originally integrated ELLPACK-R format. For the tests, we chose the matrices used in the SELL literature.
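
To make the storage format in this abstract more concrete, the following is a minimal, CPU-only sketch of a sliced ELLPACK layout and the corresponding SpMV in plain Python. The function names `to_sell` and `sell_spmv`, the column-major ordering within a slice, and in particular the `mask` array (which here simply flags padding slots) are assumptions for illustration; they are not claimed to match the implementation in ANNarchy or the mask array proposed in the thesis.

```python
def to_sell(dense, slice_height):
    """Convert a dense matrix (list of rows) into a sliced ELLPACK layout.

    Rows are grouped into slices of `slice_height` rows. Each slice is padded
    only to the longest row inside that slice and stored column-major.
    """
    values, columns, mask = [], [], []
    slice_ptr, slice_width = [0], []
    for start in range(0, len(dense), slice_height):
        rows = dense[start:start + slice_height]
        # Non-zero (column, value) pairs of each row in this slice.
        entries = [[(j, v) for j, v in enumerate(row) if v != 0] for row in rows]
        width = max((len(e) for e in entries), default=0)
        for k in range(width):
            for r in range(slice_height):
                if r < len(entries) and k < len(entries[r]):
                    j, v = entries[r][k]
                    columns.append(j)
                    values.append(float(v))
                    mask.append(True)
                else:
                    # Padding slot (hypothetical mask marks it as unused).
                    columns.append(0)
                    values.append(0.0)
                    mask.append(False)
        slice_width.append(width)
        slice_ptr.append(len(values))
    return values, columns, mask, slice_ptr, slice_width


def sell_spmv(sell, x, n_rows, slice_height):
    """Compute y = A @ x for a matrix stored by `to_sell`."""
    values, columns, mask, slice_ptr, slice_width = sell
    y = [0.0] * n_rows
    for s, width in enumerate(slice_width):
        base = slice_ptr[s]
        for r in range(slice_height):
            row = s * slice_height + r
            if row >= n_rows:
                break
            acc = 0.0
            for k in range(width):
                idx = base + k * slice_height + r
                if mask[idx]:
                    acc += values[idx] * x[columns[idx]]
            y[row] = acc
    return y


dense = [
    [1, 0, 2, 0],
    [0, 3, 0, 0],
    [0, 0, 0, 4],
    [5, 6, 7, 0],
    [0, 0, 0, 0],
]
sell = to_sell(dense, slice_height=2)
y = sell_spmv(sell, [1, 2, 3, 4], n_rows=5, slice_height=2)
```

In this toy matrix the slices have widths 2, 3 and 0, so only 10 slots are stored, whereas plain ELLPACK would pad every row to the global maximum of 3 entries (15 slots).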

Evaluation of the interchangeability of synthetic sensor data for object detection

Bhakti Govind Danve

Tue, 18. 10. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In this thesis, the application and importance of synthetic sensor data in the field of virtual testing of Advanced Driver Assistance Systems (ADAS) is studied. As a part of this study, synthetic LiDAR sensor data is generated from two simulators: Car Learning to Act (CARLA) and Vector Informatik's DYNA4 simulator. Furthermore, the ground truth annotation of traffic participants in these datasets is also generated. The datasets and annotations are then utilized for training PointPillars deep-learning models in order to detect traffic participants. The usability of the synthetic datasets is evaluated using a virtual ADAS functionality test setup. This is a closed-loop setup made up of the DYNA4 simulator, a LiDAR sensor model, a PointPillars object detector trained on LiDAR sensor data, and an Adaptive Cruise Control Simulink model. The interchangeability of synthetic sensor data is evaluated by testing a model trained on the CARLA dataset on the DYNA4 dataset and vice versa.

Verification and Evaluation of Shapley-based Explainable Artificial Intelligence for Reinforcement Learning

Sarah Nesner

Fri, 14. 10. 2022, https://eu01web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

In recent years, Artificial Intelligence (AI) applications have increasingly become an everyday companion, and it is hard to imagine life without them. Nevertheless, there are still many open research questions in Machine Learning (ML), a subfield of AI. One question that has long remained unanswered in ML is the interpretability and trustworthiness of the results obtained with these techniques. Until now, the use of the ML method Reinforcement Learning (RL) together with interpretable Explainable Artificial Intelligence (XAI) methods has remained almost completely unexplored. The primary task in this thesis is a detailed verification and evaluation of the model-agnostic approach SHapley Additive exPlanations (SHAP), an XAI method based on the so-called Shapley values. They originate from cooperative game theory and are defined as the average marginal contribution of a feature value over all coalitions [35]. After describing the theory of RL and SHAP, the methodology is presented with mainly four different experiments, which are distinguished by the number of their features. First, a self-built threshold Expert-Policy experiment on the standard RL domain Mountain-Car is introduced. Experiments in Visual Verification are conducted, in which the observation variables are highlighted in the colours of their SHAP values. These plots, as well as some waterfall plots, provide a first clear indication that the method performs as expected. This is done in a deterministic setup with a continuous action space. Thereafter, experiments on the balancing Cart-Pole problem with a 4-dimensional observation space are performed. A Visual Evaluation and a Sensitivity Analysis are the experiments chosen here. In the Sensitivity Analysis, state variables are successively disturbed from 0% to 100% in three different modes. It is shown that performance decreases at an accelerating rate with increasing feature relevance (high SHAP values) when the disturbance increases simultaneously. Lastly, a real Industrial Benchmark with first 30 and then 150 features is used to test the procedure on highly complex and stochastic behaviour in a continuous state and action space. They are evaluated with an Expert-Policy and a Neural-Network-Policy. Besides Visual Verification, the major focus here is the investigation of different background datasets and sizes when applying the SHAP method. With the help of the Shapley-based XAI method and these verification approaches, a further step is made in the direction of reliability and interpretability. Besides gaining trust in the visualisations, a main conclusion of this thesis is that the background dataset has to be policy-based and that its size should be at least half of the dataset length.
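
The definition of Shapley values given in this abstract (average marginal contribution over all coalitions) can be written out directly for a small number of features. The following is a minimal sketch of the exact textbook formula, not of the SHAP library used in the thesis; the names `shapley_values` and `toy_value` and the toy value function itself are assumptions for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_features, value):
    """Exact Shapley values for a value function `value(frozenset) -> float`.

    The cost grows exponentially with `n_features`, so this is only
    practical for a handful of features.
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(coalition):
    """Hypothetical value function with one interaction between features 0 and 2."""
    total = 0.0
    if 0 in coalition:
        total += 2.0
    if 1 in coalition:
        total += 1.0
    if 0 in coalition and 2 in coalition:
        total += 4.0
    return total


phi = shapley_values(3, toy_value)
```

In this toy example the interaction term of 4 is split equally between features 0 and 2, and the values sum to the value of the full coalition minus that of the empty one.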

Event-based classification using unsupervised spiking neural networks

Lucien Berger

Wed, 12. 10. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The ANNarchy neural simulation tool is used for neural network simulation. To demonstrate its adaptability, we implemented and trained a spiking neural network on the event-based N-MNIST dataset by Orchard et al. (2015). This dataset consists of multiple samples of handwritten digits. Based on the re-implementation of the Clopath et al. (2010) algorithm by Larisch et al. (2021), we showed that the neural network is able to learn in an unsupervised manner. The firing rate of the excitatory population, combined with the label of the currently shown sample, is fed into a support vector machine to measure the accuracy of the sample recognition. In the presentation, we will look at the challenges we had to overcome to modify the input of the network and the network itself, and at different approaches used to improve the accuracy of the support vector machine.

Goal selection in a hierarchical model of the multiple basal ganglia loops

Simon Schaal

Tue, 11. 10. 2022, Room 367 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The basal ganglia play an important role in the action selection process. But how exactly do the cortico-basal ganglia-thalamo-cortical loops decide on one action? Initial ideas suggested that all loops propose an action independently and an arbitration mechanism selects the one that will be executed. Modern theories, however, propose that the loops are hierarchically organized: every loop computes an intermediate objective for the next one, and complex tasks get simplified until a movement can be executed. In previous work (Baladron et al., 2020, Eur. J. Neurosci.; Scholl et al., 2022, Brain Struct Funct), a hierarchical computational model consisting of two dorsal loops showed promising results in explaining the behavior of rats in a reward devaluation task and the enhanced habitual behavior of Tourette patients. This model is here extended by a high-level ventral loop, implementing goal selection under the influence of episodic memories from the hippocampus. The hippocampus cycles through recent experiences, while the shell of the nucleus accumbens (NAcc) decides which memory should be recreated. A selected state sequence then enters a closed working memory loop consisting of the medial prefrontal cortex, the thalamus and the NAcc core. Once a sequence is selected and stored, the dorsomedial loop, given a visual input, decides on a possible action that would recreate the desired state sequence. Finally, the movement is executed by the dorsolateral loop. The action selection capabilities of this model were tested on a 2-stage behavioral task (Daw et al., 2011, Neuron). The participants had to choose between 2 buttons, leading to one of two possible panels, each with two second-stage options. In case of a sequence with an uncommon transition leading to a reward, a model-free reinforcement learning method would repeat the first action. In contrast, a model-based reinforcement learning approach would maximize the chance of making the second option available by choosing the other first choice. Experimental findings show that humans instead use a mixture of both strategies. The results of our model agree with the empirical findings in that it implements neither a model-free nor a model-based strategy. Instead, the probability of repeating the first option, if the transition was uncommon, is roughly the same regardless of the reward. It is also lower than the probability of staying with the same choice if the transition was common and the second option rewarded.

Few Shot Learning for Material Classification at Drilling Machines

Rohina Roma Dutt

Wed, 5. 10. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Artificial Intelligence for Ship Container Optimization and Warehouse Process Management

Shivani Piprade

Wed, 21. 9. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Sea-freight containers are filled with packages coming from different suppliers to the Speyer consolidation center of Mercedes-Benz GmbH. Improving container utilization can help reduce logistics costs and effort. Most importantly, it can contribute to reducing CO2 emissions. The presented thesis draws on research in the fields of artificial intelligence and reinforcement learning to pack the packages in the consolidation center effectively and to orchestrate the warehouse process better. The problem of bin packing is researched and the state of the art for solution algorithms is studied. The minimum bin slacking algorithm is implemented and applied at the consolidation center to consolidate the packages such that more packages are accommodated. This eventually helps improve the fill rate of the containers and manage the consolidation processes in a better way.

Data Augmentation of patterned 3D trajectories using generative neural networks

Janek Zangenberg

Tue, 20. 9. 2022, Room 1/273 and https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Data augmentation - the discipline concerned with applying suitable transformations to a given set of data to obtain new synthetic samples resembling the original ones, thereby augmenting the data's value for the training of notoriously data-hungry deep neural networks - is imperative for enabling data-driven probabilistic agents to satisfactorily tackle tasks for which meaningful data is hard to obtain. However, despite the ubiquity of time-series data and a myriad of processes depending on it, comparatively little research has been devoted to data augmentation techniques for training data of a serial nature, be it univariate or multivariate. This is partly due to the complex inter-temporal dynamics time series are subject to, which make them difficult to model. This stands in substantial contrast to predominantly image-based data, for which various potent augmentation methods have been established in recent years. To that end, the presented thesis examines the feasibility of augmenting 3-dimensional, discrete trajectory representations that carry particular noise patterns and belong to fixed feature classes concerning their pathway. More specifically, sets of linear and circular trajectories of 594 samples each were considered, demonstrated by the human hand and recorded with body motion tracking hardware provided by the company Wandelbots, in collaboration with whom the thesis was written. After undergoing a set of preprocessing operations, the ground truth samples were fed to a generative adversarial network and a variational autoencoder (VAE), which to date represent the flagship frameworks of deep generative modelling, the domain of deep learning that also subsumes the branch of data augmentation. Their results were then compared qualitatively and quantitatively, both against each other and against the ground truth, by means of a PCA analysis, metrics describing the arrangement of consecutive data points, and the samples' visual appearance. Contrary to the a priori assumption, the VAE proved to be up to the task to a much higher degree, outperforming in all considered disciplines and producing synthetic samples of high fidelity to the original data.

Anomaly Detection in Cyber Physical Systems using Compact Convolutional Transformer and Autoencoders

Md Deloawer Hossain

Thu, 11. 8. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Anomaly detection plays a crucial role in ensuring the security of Cyber-Physical Systems (CPS). In recent years, machine learning-based Artificial Intelligence (AI) systems have become an essential part of detecting anomalies by analyzing information such as network traffic data, log data, and sensor and actuator values. Our research will mainly use the Compact Convolutional Transformer (CCT) to detect anomalies in the Secure Water Treatment System (SWaT) testbed. The CCT is used for image classification, where it showed impressive performance, but we will use this algorithm for regression problems for the first time. We will use the CCT to predict the future values of the sensors and actuators and compare them with the actual data to detect anomalies. Other machine learning models, such as Long Short-Term Memory (LSTM) cell-based Deep Neural Networks (DNN), One-Class SVM, and Autoencoders, have shown good results in detecting anomalies in the SWaT system. We will see how the CCT works with non-image data and whether it performs better than these models. We will also use Autoencoders as a comparison for the results of the CCT. We will evaluate our results by Precision, Recall, and F1 score.
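
The evaluation idea in this abstract (compare predicted with actual values, then score the detections by Precision, Recall, and F1) can be sketched in a few lines. The abstract does not say how the comparison is turned into a decision, so the fixed error threshold used here is an assumption, as are the function names `flag_anomalies` and `precision_recall_f1` and the toy numbers.

```python
def flag_anomalies(predicted, actual, threshold):
    """Flag a time step as anomalous if the absolute prediction error
    exceeds `threshold` (assumed decision rule, not from the abstract)."""
    return [abs(p - a) > threshold for p, a in zip(predicted, actual)]


def precision_recall_f1(flags, labels):
    """Precision, recall and F1 score for boolean detections vs. ground truth."""
    tp = sum(f and l for f, l in zip(flags, labels))
    fp = sum(f and not l for f, l in zip(flags, labels))
    fn = sum(l and not f for f, l in zip(flags, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy sensor readings: the model predicts a constant value of 1.0.
predicted = [1.0, 1.0, 1.0, 1.0, 1.0]
actual = [1.1, 2.0, 0.9, 1.8, 1.0]
labels = [False, True, True, True, False]  # hypothetical ground truth

flags = flag_anomalies(predicted, actual, threshold=0.5)
precision, recall, f1 = precision_recall_f1(flags, labels)
```

In this toy case two of the three labelled anomalies produce a large prediction error, so every flagged step is correct but one anomaly is missed.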

Integration of auto-tuning methods within the neuro-simulator ANNarchy

Badr-eddine Bouhlal

Tue, 9. 8. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The ANNarchy neural simulation tool is used for neural network simulation on different hardware platforms [Vitay et al., 2015]. However, experiments have revealed that execution on the GPU is better for larger neural networks, while for smaller networks, execution on a multiprocessor is more efficient [Dinkelbach et al., 2012, 2019]. The objective of the present work is to evaluate the possibility of applying an auto-tuning module within the ANNarchy neural simulator. It will allow any ANNarchy user to automatically find the type of platform (CPU or GPU) and the type of configuration (if CPU, the number of threads) necessary to get the highest possible performance. As a result, this work presents the design and development of a machine learning-based module that automates the prediction of a suitable configuration based on well-defined features.

A biologically plausible robotic arm controller using reward-guided reservoir computing

Maximilian Petzi

Fri, 5. 8. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The goal of this thesis was to control a robotic arm with a biologically plausible model. The task was to learn to move the hand to a number of previously fixed targets, where the input determines the target to be reached. We used a movement-generating model consisting of a reservoir of neurons at the edge of chaos (Miconi, 2017) that drives a (multi-layered multi-pattern) central pattern generator ((MLMP)CPG) (Nassour, Hénaff, Benouezdou, & Cheng, 2014), which produces a continuous movement. The reservoir's learning rule uses exploratory perturbations (noise) on the neuron activities to accumulate a potential weight change that is realized at the end of a trial (a trial consists of one attempt of the arm to reach the target), depending on the error signal given as the final distance of the hand from the target. The decrease of the error during initial learning is typically followed by an increase of the error that we will call collapse. With faster learning, the collapse happens sooner and more strongly, and the minimum error reached is higher. We could not find that the dynamics become more chaotic over time, so the reason for the collapse remains unknown. With the right hyperparameters, the collapse can be avoided for the duration of training, and the network is able to learn to perform the task of reaching 8 different goals well. While analyzing the effects of parameter settings on the performance, we found that changing parameters in such a way that slower learning is expected reduces the minimal error significantly (for the number of goals being 8). Possible ways of reducing the speed and consequently the minimal error include a low learning rate, noise amplitude and noise frequency.

Evolution of an activation function for a model of the primary visual cortex to compensate for missing input

Pascal Berger

Thu, 28. 7. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In vision, there is an effect where neurons of the primary visual cortex (V1) respond to visual stimulation even if there is no input to their classical receptive field (RF), but only to its surround. This effect has been shown to be mediated by excitatory feedback originating from the secondary visual cortex (V2). In my bachelor thesis, I want to address what an activation function for neurons in V1 that integrates feedback from V2, as well as feedforward, lateral inhibition and excitation, could look like. Importantly, this activation function should enable stable dynamics in a network with recurrent excitatory connections, even in the face of Hebbian synaptic plasticity. To answer this question, I use an evolutionary approach called Neuroevolution of Augmenting Topologies (NEAT) to find a suitable activation function for the neurons in V1. In contrast to other evolutionary approaches, NEAT does not just optimize the weights or just evolve the topology of the network; it does both.

Models of Multisensory Integration II

Valentin Forch

Thu, 30. 6. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The brain needs to integrate a multitude of different pieces of information to form a coherent world model it can act upon. Thus, it needs to (a) discover the statistical structure of its inputs and (b) filter information depending on the current task. In this talk, I will show how models relying on Hebbian plasticity may learn the statistical structure of sensory inputs. Furthermore, I will discuss different mechanisms current models of sensory integration lack that could improve their fit to physiological data.

Connecting artificial and biologically plausible neural networks: A beta-Variational Autoencoder as preprocessing for the ventral pathway

Bastian Schindler

Mon, 27. 6. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The task of this bachelor thesis was to modify the original model by Frederik Beuth and to connect artificial neural networks with biologically plausible networks. The basic form of the Beuth model was to be retained; only the one-time initial processing of the input data by V1 was to be extended by a neural network. Since the Beuth model is able to evaluate individual features, it is possible to create a neural network that drives the model with a few output neurons representing these features. For this, however, the outputs must be available in a form that allows the model to apply features of this output to the objects.

Goal-directed visual search by a combined deep network with attentional selection

Chi Chen

Thu, 16. 6. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In recent years, deep learning approaches have achieved excellent results in both object recognition and object detection. Nevertheless, networks trained with supervised learning provide little information about how the human brain solves the task. Based on the previous attention model by Beuth and Hamker (2015), three models are presented in this work to find a reliable compromise between object representation and high spatial resolution: a variational autoencoder (VAE), a variational autoencoder with classifier (VAE_clf), and an autoencoder with contrastive learning (AE_ctr). The latent space of the encoder is used as input for the attention model. The object and background images for the verification tests are selected from the COIL-100 and BG-20k datasets, respectively. This work aims to combine one of these models with the previous attention model to search for a given target in a natural scene with higher accuracy. The model evaluations turn out to be less accurate than expected. With an accuracy threshold of 85% for target prediction in object detection, model VAE_clf comes closest to the expected results, with 13 out of 20 objects correctly predicted with an accuracy over 85%. This number is only 1 for model VAE and 4 for model AE_ctr, indicating a superior performance of model AE_ctr compared to model VAE.

Simulation-based high-level decision-making for lane changing in the field of autonomous driving using reinforcement learning

Chaithra Varambally

Wed, 15. 6. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Autonomous driving is redefining the role of automobiles. A crucial part of the research and development of autonomous driving concerns lane change decision-making, as it is one of the modules responsible for maneuvering the automobile and preventing collisions. Research on autonomous vehicle lane changing has so far focused mainly on lane change path planning and path tracking, whereas lane change decision-making has been studied to a lesser extent. This project focuses on enabling lane change decision-making of an autonomous vehicle using Deep Reinforcement Learning in symmetric highway traffic. An open source package called highway-env is used as the simulator. This simulator provides an environment in which the ego vehicle maneuvers through other vehicles along a linear stretch of 4-lane highway traffic. The state space is built on a continuous scale, while the action space is discrete. The agent learns using two Reinforcement Learning algorithms: Deep Q-networks, a value-based learning method, and Proximal Policy Optimization Networks, a policy gradient method. The goal of the experimentation is to attain a low collision percentage while also achieving high speeds. Extensive training of the agent using the two algorithms results in an efficient yet smooth lane change decision-making policy. The ability of the proposed model to perform lane change maneuvers with safe and smooth trajectories in a timely manner is illustrated through experiments.

Visual artefacts on motion-compensated ground plane in surround view system

Vidhiben Patel

Mon, 30. 5. 2022, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

The Advanced Driver Assistance System (ADAS) provides services and features for drivers to improve their driving experience. These features include Surround View Systems, which provide drivers with a full view of the surroundings of their vehicle using cameras fixed at the four sides of the car. One of the features of the Surround View System is the See Through Bonnet (STB) feature, which lets users see the area underneath the car while driving. This feature makes use of a time-based temporal ground plane built using the image frames from the cameras that are fixed on the car. During the creation of these ground planes, many artefacts are created due to natural light features that exist in the environment. The work presented in this thesis concentrates on the analysis and resolution of two main artefacts on the ground plane: sawtooth shadow patterns due to the partial ego-shadow of the car, and image harmonization-related defects due to different lighting conditions in the environment. The algorithms presented are described in detail, and their results are analyzed using visual perception, metric-based, and efficiency analysis.

Self-supervised learning of transferable visual representations for urban scenes

Devi Sasikumar

Wed, 25. 5. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

While high-resolution labeled panoramic images in autonomous driving applications lend themselves well to 'data hungry' deep learning algorithms, it is challenging to obtain exhaustive annotations on these images. Self-supervised methods aim to replace the ubiquitous labeling-intensive paradigm associated with traditional supervised deep learning. In this study, the transfer performance of six self-supervised pre-trained models on downstream semantic segmentation tasks is compared with the supervised baseline. Experimental results show that three out of six self-supervised models achieved an improvement of up to 5% in IoU value compared to the supervised baseline on the downstream segmentation task. The study also aims to address a crucial question: can self-supervised features pre-trained on the curated ImageNet dataset generalize well to diverse downstream tasks with uncurated datasets? To answer this question, the effectiveness of domain-specific self-supervised contrastive learning as a pre-training strategy was investigated. This approach of leveraging unlabeled data for learning generic representations can potentially be employed in other domains, such as medicine or agriculture, where the annotation budget is limited or a large amount of unlabeled image data is available.
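
For readers unfamiliar with the IoU metric reported in this abstract, the following is a minimal sketch of per-class intersection over union on flattened label maps. The function name `iou` and the toy label lists are assumptions for illustration; the study's actual evaluation code is not described in the abstract.

```python
def iou(pred, target, cls):
    """Intersection over union of class `cls` for two flat label sequences."""
    inter = sum(p == cls and t == cls for p, t in zip(pred, target))
    union = sum(p == cls or t == cls for p, t in zip(pred, target))
    # The class is absent from both maps: IoU is undefined.
    return inter / union if union else float("nan")


# Toy 6-pixel segmentation with two classes (0 = background, 1 = object).
pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 0, 1]

iou_object = iou(pred, target, cls=1)
iou_background = iou(pred, target, cls=0)
mean_iou = (iou_object + iou_background) / 2
```

Averaging the per-class values, as in `mean_iou`, is one common way to summarize segmentation quality in a single number.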

An investigation of generative deep learning models for visualizing neural activity

Ralf Burkhardt

Thu, 12. 5. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

To this day, many neurological systems are perceived as a mysterious 'black box' with little to no detailed insight into their inner workings. This is true for (simulated) biological systems and for derived artificial systems like neural networks. Gaining comprehensive insight into these systems could help to better understand them and even enable further innovations in fields like AI and robotics. For visual systems, whether biological or artificial, a series of promising deep learning techniques exist to reconstruct presented visual stimuli from the corresponding neural activity data. This should make it possible to visualize the extracted neural activity data directly and observe how different image features are represented in it. However, a direct comparison and an objective, quantitative performance measurement of all promising techniques have not yet been carried out. This work aims to provide such a comprehensive overview. Additionally, the use of a suitable reconstruction method will be demonstrated to showcase how a given neurological system can be examined more closely.

Models of Multisensory Integration

Valentin Forch

Thu, 28. 4. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The brain needs to integrate a multitude of different pieces of information to form a coherent world model it can act upon. Thus, it needs to (a) discover the statistical structure of its inputs and (b) filter information depending on the current task. Posterior parietal cortex has been implicated in many related cognitive functions: integrating information from different senses, representing spatial relations, working memory, and decision-making. It is a highly convergent brain area receiving input from visual, auditory, somatosensory, motor, and frontal cortices. Classical receptive field studies have shown that parietal neurons integrate information from multiple senses, and neurocomputational models have shown that neural computation could, in principle, underlie optimal integration of information. In this talk, I will review the current 'state of the art' of such models and discuss some of their shortcomings and respective strengths for elucidating how the brain builds a useful world model.

Retinal surround delay leads to spatiotemporal behavior in the early visual pathway

René Larisch

Thu, 14. 4. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Visual information about the world surrounding us is not static but changes constantly. Likewise, the spatial receptive fields along the early visual pathway - from the retinal ganglion cells (RGC) via the lateral geniculate nucleus (LGN) to the primary visual cortex (V1) - are not fixed, but have been shown to change their structure within a short time interval without further synaptic plasticity. This temporal behavior adds the time dimension to the 2D spatial receptive field and leads to spatiotemporal receptive fields (STRF), suggesting a processing of time-relevant visual stimuli, such as the direction of movement of an object. Since their first description, a vast number of models have been published to simulate the temporal behavior of V1 simple cells. Most of these models use a non-linear function to model the change of the spatial receptive field over time. As good as these models are at replicating this behavior, they do not explain how STRFs emerge on a more neuronal level. In this talk I will show how a delay in the surround field of the RGCs, an assumption reported previously, leads to STRFs in the LGN and the V1/L4 simple cells. Further, I will discuss how the direction selectivity (a characteristic strongly connected with STRFs) of V1 simple cells is influenced by so-called lagged LGN cells and inhibition in V1/L4.

Offline reinforcement learning

Vincent Kühn

Fri, 11. 3. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Reinforcement learning algorithms require large amounts of data to learn optimal policies. They fundamentally gather data online while interacting with the environment, process it iteratively to update their policy, and repeat this for a large number of steps. One of the reasons this works so well is that the data collection is done during learning. In many scenarios, this type of online interaction is impractical because data can be sparse, expensive, or even dangerous to collect (e.g., healthcare, autonomous driving, and robotics). Offline reinforcement learning tries to deal with these problems by using only previously collected data, without any further online interaction with the environment. This approach comes with its own new set of challenges that must be solved. In this seminar, I will talk about the two types of offline reinforcement learning, namely model-based and model-free, what the new challenges are, and how different algorithms try to overcome them.
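
To make the idea of learning only from previously collected data concrete, here is a minimal model-free sketch: tabular Q-learning run repeatedly over a fixed set of logged transitions, with no further interaction with the environment. The 4-state chain, the function name `offline_q_learning`, and all hyperparameters are assumptions for illustration. The sketch deliberately ignores the additional challenges of offline learning that the talk addresses, since the toy dataset covers every state-action pair.

```python
def offline_q_learning(dataset, n_states, n_actions,
                       gamma=0.9, alpha=0.5, sweeps=200):
    """Fit a Q-table from logged (s, a, r, s_next, done) transitions only."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        for s, a, r, s_next, done in dataset:
            # Standard Q-learning target, computed from logged data.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q


# Hypothetical logged data from a 4-state chain: action 1 moves right,
# action 0 moves left; reaching state 3 yields reward 1 and ends the episode.
dataset = [
    (0, 0, 0.0, 0, False), (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 0, False), (1, 1, 0.0, 2, False),
    (2, 0, 0.0, 1, False), (2, 1, 1.0, 3, True),
]
q = offline_q_learning(dataset, n_states=4, n_actions=2)
# Greedy policy for the non-terminal states 0, 1, 2.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(3)]
```

The learned greedy policy moves right in every non-terminal state, which is the shortest path to the reward in this toy chain.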

Near-Real-Time Yield Forecasting with Remote sensing imagery via Machine Learning

Yashaswi Dhiman

Thu, 10. 3. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Agriculture is central to crop production and food security. Nations require accurate crop yield forecasts to make decisions on food production and supply rates for a stable food supply. If farmers can analyze the high- and low-yield zones in their fields at different times during the season, they may adapt their farming techniques based on the estimates to produce a higher yield at the end of the season and earn more revenue. The purpose of this study is to forecast yield in near-real-time for the current season, prior to harvest, at certain time intervals such as 20, 40, or 60 days before harvest. It accomplishes this by creating and forecasting spatial yield maps based on attributes collected from remote sensing data. Harvest yield statistics from 2016 through 2021 are available for 164 European winter barley and 303 European winter wheat harvests. On average, sowing of both crops begins at the end of September and harvesting occurs at the beginning of August, so each crop has a one-year season. The harvests are interpolated to provide cleaned yield maps with a resolution of 10 meters. Remote sensing imagery is retrieved from the L1C layer of the Sentinel-2 satellite for improved spatial consistency, and is collected between 20 and 80 days prior to harvest at a spatial resolution of 10 meters and a temporal resolution of 5 days. For improved crop health estimates, the suggested technique makes use of vegetation indices extracted from the Sentinel data. The model is a CNN with 7 convolution layers and batch normalization, around 6.5 million trainable parameters in total, which forecasts the yield map by pixel-wise regression. The CNN model is trained on 3 scenarios based on time periods and different crop growth stages.
The solution has been realized by predicting for the most recent period while training on several previous periods. The result is reliable, with around 87% accuracy and an RMSE of 1.35 ton/ha. For further usage, one can simply plug in the latest Sentinel image and predict on the basis of the saved model. The forecasted yield maps are accurate and visually consistent with the target yield maps.

Segmentation of Swing Movements in the Golf Sport from Foot Plantar Pressure Data using Machine Learning

Sachin Bainur

Thu, 10. 2. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In any sport, athletes can use feedback systems to improve their training quality. Athlete training systems have recently attracted interest in developing frameworks capable of taking physiological and behavioral parameters and processing them for a specific sport. Monitoring, assessment, and improvement suggestions are vital factors of such training systems. Golf is a challenging sport that demands complex actions to swing the golf club in the proper direction. Golf swing segmentation is critical in studying and developing feedback systems for golfers using wearable technology. It is crucial to break the swing into smaller segments (e.g., setup, backswing, downswing) to analyze them further and give feedback about each of them individually. However, no attempt has been made to leverage the player's foot plantar pressure information and employ machine learning models to estimate and divide golf swing phases. In this research, a thorough examination of various machine learning models is carried out to narrow this gap in the literature using the FScan foot pressure insole. The insole used in this study has a high resolution, with 960 pressure sensors per foot. The research also focuses on determining the foot regions with the highest impact on swing key position (SKP) detection, in order to design a low-resolution pressure insole in further studies.

The effect of more detailed biological features in cerebellar-like models on sensor guided robots - an overview

Manuel Rosenau

Wed, 2. 2. 2022, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The development of humanoid robots is closely linked to the understanding of the human body and vice versa. The growing knowledge about the human brain enables us to design better computational models, and biologically inspired control structures allow us to test the underlying theories. The benefits are enhanced accuracy and precision on the part of the robot and a better understanding of the human brain. One of the key structures in voluntary movements, coordination and posture, as well as motor learning, is the cerebellum, whose basic operating principle is the comparison of information between the cerebellar cortex and the deep cerebellar nuclei. This talk will give an overview of the structure and functioning of the cerebellum and the role of the forward model as a predictor. Based on this, different approaches will be presented to show how more biologically inspired computational models can improve the motion accuracy of a robot.

The basal ganglia and the role of cortico-basal ganglia-thalamo-cortical loops

Fred Hamker

Wed, 5. 1. 2022, https://hu-berlin.zoom.us/j/61917824698?pwd=cXNWd0hScnpLM2ovYXF0bFpSZCtHQT09

My lecture provides an introduction to a brain structure that is heavily connected to the cortex: the basal ganglia. I first provide anatomical and physiological background on the basal ganglia that motivates their importance for understanding brain function. I will then provide examples of neuro-computational models that aim at better understanding the function of the different basal ganglia pathways: direct, indirect and hyperdirect, and briefly mention the recently discovered arkypallidal neurons. Next, I will address the involvement of the basal ganglia in cortical processing, such as working memory, consciousness, categorisation and the learning of habits by means of cortico-basal ganglia-thalamo-cortical loops, again illustrated by means of neuro-computational models. I conclude with a brief outline of the neurosimulator ANNarchy that has been used as a tool for model simulations.

Attentional neural networks

Julien Vitay

Wed, 8. 12. 2021, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Attentional neural networks such as the Transformer architecture and its variants (BERT, GPT) have revolutionized Natural Language Processing (NLP) since 2017 by allowing self-supervised learning. Recently, Transformer architectures were also applied in other domains, such as computer vision, time series processing and geometric deep learning, with great success. In particular, Vision Transformers achieve better accuracy than state-of-the-art CNNs on ImageNet while requiring fewer parameters. The moment has come to throw away the outdated CNNs / LSTMs / VAEs / GANs and focus on Transformers for virtually any problem. This lecture will explain the basic principles of self-attention and the Transformer architecture.

Optimization of machine learning models for Network Traffic Classification in 5G networks

Anjali Agarwal

Wed, 8. 12. 2021, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Machine Learning (ML) methods have become very popular in various fields of technology and science. This is due to the ability of ML algorithms to learn rules, allowing the system to produce efficient results without being explicitly programmed. Network Traffic Classification using ML techniques has become a popular approach, where ML models are trained to classify unknown Network Traffic applications. This classification is done on real-time Network Traffic and is known as inference. With the emergence of 5G mobile networks leading to enhanced data rates and lower latency, Network Traffic communication will become very fast. For these reasons, inference on Network Traffic using ML methods needs to become faster as well. This creates a need for ML architectures that are smaller, faster, and more efficient, in line with the increased requirements of Network Traffic Classification in 5G mobile networks. The thesis investigates three optimization techniques to create ML models that can work efficiently on the classification of Network Traffic. The first two techniques, Network Pruning and Quantization, focus on optimizing existing network architectures. The third technique, Neural Architecture Search, focuses on finding an efficient network architecture from a set of different architectures. Within the scope of the thesis, these techniques are implemented using a state-of-the-art implementation framework, and the performance results achieved are used to estimate the quality of the optimization performed. The aim of this implementation is to investigate whether the size of an ML model can be reduced and its speed increased without much impact on its accuracy. This investigation is conducted for each optimization technique mentioned, and conclusions are drawn from the results. The advantages and disadvantages of each technique are also discussed briefly to determine the usability of the methods.
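
The first two techniques named in this abstract can be illustrated on a plain list of weights. The sketch below shows magnitude-based pruning (zeroing the smallest weights) and symmetric 8-bit linear quantization. Both are common textbook variants chosen here as assumptions; the abstract does not state which variants or which framework the thesis used, and the function names and example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric linear quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float weights."""
    return [v * scale for v in q]


weights = [0.8, -0.05, 0.3, 0.01, -1.27, 0.6]   # hypothetical layer weights
pruned = magnitude_prune(weights, sparsity=0.5)  # half of the weights become 0
q, scale = quantize_int8(pruned)                 # 8-bit integer representation
restored = dequantize(q, scale)                  # approximate reconstruction
```

Pruning reduces the number of non-zero parameters, while quantization reduces the storage per parameter; the reconstruction error of each weight is bounded by half the quantization step `scale`.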

Multimodal Sensor Fusion with Object Detection Networks for Automated Driving

Enrico Schröder

Thu, 18. 11. 2021, https://webroom.hrz.tu-chemnitz.de/gl/and-rrc-wiu-26u Access code: 554534

Object detection is one of the key tasks of environment perception for highly automated vehicles. To achieve a high level of performance and fault tolerance, automated vehicles are equipped with an array of different sensors to observe their environment. Perception systems for automated vehicles usually rely on Bayesian fusion methods to combine information from different sensors late in the perception pipeline in a highly abstract, low-dimensional representation. Newer research on deep learning object detection proposes fusing information in a higher-dimensional space directly in the convolutional neural networks to significantly increase performance. However, the resulting deep learning architectures violate key non-functional requirements of a real-world safety-critical perception system for a series-production vehicle, notably modularity, fault tolerance and traceability. This dissertation presents a modular multimodal perception architecture for detecting objects using camera, lidar and radar data that is entirely based on deep learning and that was designed to respect the above requirements. The presented method is applicable to any region-based, two-stage object detection architecture (such as Faster R-CNN by Ren et al.). Information is fused in the high-dimensional feature space of a convolutional neural network. The feature map of a convolutional neural network is shown to be a suitable representation in which to fuse multimodal sensor data and a suitable interface for combining different parts of object detection networks in a modular fashion. The implementation centers around a novel neural network architecture that learns a transformation of feature maps from one sensor modality and input space to another and can thereby map feature representations into a common feature space.
It is shown how transformed feature maps from different sensors can be fused in this common feature space to increase object detection performance by up to 10% compared to the unimodal baseline networks. Feature extraction front ends of the architecture are interchangeable and different sensor modalities can be integrated with little additional training effort. Variants of the presented method are able to predict object distance from monocular camera images and detect objects from radar data. Results are verified using a large labeled, multimodal automotive dataset created during the course of this dissertation. The processing pipeline and methodology for creating this dataset along with detailed statistics are presented as well.

The Cerebellum - Comparator and Predictor of Movement

Julian Thukral

Wed, 10. 11. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

The main role of the cerebellum is believed to be that of a predictor and comparator. In the context of motor control, it implements a forward model to predict the future state of a plant/effector given the current state and a motor command. Furthermore, it compares this predicted state to the state actually obtained, generating a sensory prediction error with a wide range of uses in motor control and motor learning, and as a key component in generating a Sense of Agency. Sense of Agency describes the experience of controlling our own actions and, through them, events in the outside world. Usually, we do not question whether we are the agent of our own actions; this only comes into focus through the element of surprise, when there is an incongruence between intention and action outcome. In this talk I will give a brief introduction to the workings of the cerebellum, the forward model of motor control, and reservoir computing before presenting two models. The first is a neuro-computational model of the cerebellum using an inhibitory reservoir architecture and biologically plausible learning mechanisms based on perturbation learning. The second is a reservoir computing model of the cerebellum built to study the Sense of Agency in the context of motor control.

Sequential Image Classification of Collision Trajectories Based on Visuo-Tactile Associations: Implementation of an Attention-Driven CNN

Cristian Guarachi Ibanez

Wed, 3. 11. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Safe interaction with humans is currently a central topic in robotics for future assistive machines. Computational learning algorithms are therefore needed that enable robots to acquire a form of 'whole-body' and 'near-space' awareness. Traditionally, robots have been trained with an obstacle avoidance system so that safe trajectories for the end effector could be computed. In this setting, robot control relies largely on pre-programmed models and 'blindly' executed end-effector trajectories rather than on physical body contact with the environment. However, a number of studies show that robots can learn to continuously detect objects in their immediate, or peripersonal, space and thereby adapt dynamically to unforeseeable interactions with their surroundings (Roncone et al., 2016, 2015). Of particular note is a central monitoring mechanism that can be trained to monitor the peripersonal space using different sensory modalities. For this work, visual monitoring of the peripersonal space is of interest in order to detect possible collisions. The thesis deals with learning a peripersonal space representation from a series of chronologically ordered images. It investigates whether a classification algorithm in the form of a deep neural network can be adapted and applied to visual collision detection from image sequences in a 3D environment. The EDRAM network of Ablavatski et al. (2017; Forch et al., 2020) serves as the basis. This network features a trainable spatial attention mechanism that can be applied to object recognition and is comparable to various models of the selective attentional focus in human visual attention.
To answer the question posed above, a dataset was first created that consists of image sequences of the object trajectory in the simulator environment. The tEDRAM was then initially trained with information from image sequences captured by a scene camera in the iCub simulator. This thesis deals primarily with the preliminary results from training with the scene images. It is found that, after training, the tEDRAM learns to detect collisions in the image sequences only to a limited extent. The reasons for these results are discussed in the thesis, and possible explanations are proposed that should be taken into account when training with the disparity maps derived from the eye images.

Motor adaptation: paradigms and models

Javier Baladron

Wed, 27. 10. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

A football player is able to score a goal even if it is raining. A tennis player is able to return a ball under varying wind conditions. These are examples of motor adaptation, a process by which actions are adjusted in response to a change in the environment. In this talk I will first introduce the paradigms used to test motor adaptation in a laboratory and show a set of phenomena that have been observed in multiple experiments. Finally, I will describe some computational models that attempt to explain the brain mechanisms involved.

Cognitive Learning Agents - Preconditions for solving Cognitive Tasks

Alex Schwarz

Wed, 20. 10. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Even though computational models of visual perception and human behavior have developed considerably in recent years, the human ability to adapt to new contexts and stimuli is still unmatched. One computational model might be able to classify digits with high precision, whilst another can tell cats and dogs apart, but when confronted with the data of the other they present nonsensical results with high levels of confidence. Deep models often rely on vast amounts of data, whilst humans grasp the concept of a new task in seconds. Current cognitive computational architectures like ACT-R depend on the symbolic representation of stimuli, performed operations and task descriptions. In this talk I will present a concept of the structural and functional preconditions that must be met in order to solve a variety of tasks without the need for explicit instructions. I will present approaches that have been shown to be promising, on their own, in tasks of adaptability and task switching. Further, I will present which elements of a learning cognitive system are key to achieving realistic behavior and which problems currently remain to be solved in the combination of models.

Genetic Algorithms to Optimize Unsupervised Learning Rules for Neural Networks

Shaopeng Zhu

Wed, 13. 10. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Genetic algorithms solve optimization problems by using natural evolutionary mechanisms. Through the selection of individuals in each generation and the crossover and mutation of genes, approximately optimal results can be obtained in a specific problem domain without the details of the model having to be designed in advance. The basic processes and relevant details of genetic algorithms, including coding, selection, crossover, and mutation, will be introduced. Genetic algorithms have performed well in some supervised and unsupervised learning tasks. Lin Wang and Jeff Orchard have used experiments to observe the performance of genetic algorithms in optimizing the learning rules of neural networks. In their experiments, the weights and biases of the neural network are updated through a synaptic plasticity network. I will introduce the details of their experiments, the analysis of their results, and some interesting phenomena that they found.
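
The four steps named in this abstract (coding, selection, crossover, mutation) can be sketched in a few lines. The example below is a generic genetic algorithm on a toy bit-counting objective, not the setup of Wang and Orchard. The bit-string coding, tournament selection, one-point crossover, elitism, and all parameter values are assumptions chosen for illustration.

```python
import random


def fitness(bits):
    # Toy objective (OneMax): count the ones in the bit string.
    return sum(bits)


def tournament(pop, rng, k=3):
    # Selection: the fittest of k randomly drawn individuals.
    return max(rng.sample(pop, k), key=fitness)


def crossover(a, b, rng):
    # One-point crossover of two parents.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rng, rate=0.02):
    # Flip each gene independently with probability `rate`.
    return [1 - b if rng.random() < rate else b for b in bits]


def run_ga(n_bits=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    # Coding: each individual is a list of n_bits binary genes.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = [max(map(fitness, pop))]
    for _ in range(generations):
        elite = max(pop, key=fitness)  # elitism: keep the best unchanged
        children = [
            mutate(crossover(tournament(pop, rng), tournament(pop, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [elite] + children
        history.append(max(map(fitness, pop)))
    return max(pop, key=fitness), history


best, history = run_ga()
```

Because the best individual is carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next.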

Smart suction cup cluster handling using sim2real reinforcement learning

Georg Winkler

Wed, 6. 10. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

The use of compressed air is one of the main drivers of energy consumption in industrial processes. However, in many of those processes, compressed air is indispensable. One example is the use of compressed air to operate vacuum-based grippers for moving automobile body parts that cannot be clamped. The core subject of this thesis is to find a more efficient strategy for operating industrial suction cup clusters based on Reinforcement Learning. Reinforcement Learning methods have achieved notable successes in solving board and video games, but so far have often failed to transfer to real-world applications. In this thesis, a real-world testing rig is constructed to model an industrial suction cup cluster. The behavior of the suction cups is transferred to a simulation based solely on the pressure values measured inside the different suction cups. This simulation is used to train agents from two Reinforcement Learning domains to operate efficiently in the simulated environment. The first batch of agents is based on Q-Learning combined with models from Statistical Learning. The second batch uses the more recent but more complex Deep-Q-Network approach combined with Artificial Neural Networks. This work shows that the models based on Statistical Learning can control not only the simulated but also the real-world suction cup cluster. Compared to a conventional threshold-based suction strategy, the models based on the Elastic Net algorithm in particular need fewer suction operations per hour to hold a workpiece and thus operate more energy-efficiently. On the other hand, agents based on the Deep-Q-Network framework and trained under the same conditions are less successful in the simulations and fail to control the suction cup cluster in real-world tests.

Development of a Machine Learning model to optimize holistic resource planning of both production planning and building automation

Aby Tom

Wed, 29. 9. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Production and building systems are two significant consumers of energy in the manufacturing industry. The complex thermal interactions between the two, and their scheduling, need to be appropriately addressed to improve energy efficiency. Conventional methods include a rule-based controller or a classic model predictive controller, which demands considerable modelling effort. Hence, this thesis investigates state-of-the-art machine learning methods that can model and optimize production to reduce energy consumption. The production environment is simulated using MATLAB Simulink, taking a holistic approach that accounts for the outside weather conditions, the thermal interaction between machines, the heating-cooling system, the building walls, and the workers inside the production plant. First, a multi-step supervised learning model is developed to imitate the behaviour of the model predictive controller. Then, the development of an end-to-end reinforcement learning model using the Deep Q-Network is investigated. Two model variants were developed: one with fully connected layers (Model A) and another with a Long Short-Term Memory layer (Model B). The models are evaluated based on their energy consumption and temperature control ability. Model B maintains the desired temperature levels inside the production plant and achieves the manufacturing objectives, outperforming the model predictive controller. The machine learning approach developed in this thesis can be applied to similar optimization problems to improve energy efficiency and reduce modelling effort.

Investigation of Data Evaluation with the Aim of Data Preparation for AI-Based Robot Control

Tina Abdolmohammadi

Fri, 24. 9. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

The aim of this work is to lay the foundations for increasing the adaptability of a robot-assisted process for individual components by means of AI-based control. The development is carried out using robot-based roll forming as an example. Specifically, the reaction forces and moments during the process were measured with a load cell at the robot flange. To evaluate the geometry achieved after forming, it was measured in each case with a hand-held laser line scanner. The data preparation and evaluation were developed and analyzed. Furthermore, different algorithms, specifically linear regression, exponential regression, multiple regression, Bayesian regression, polynomial regression, and neural networks, were applied to determine which algorithm reflects the data best. In parallel with these control-related developments, it was also investigated how a correction after forming, without AI, can influence the achieved geometry from a forming-technology perspective. For this purpose, the deviations from the target geometry after forming were calculated, and an attempt was made to minimize them in an additional forming step. It was found that correction passes are possible in principle and can improve the results considerably. However, a certain waviness was still observed that cannot be compensated with simple control loops. The foundations developed in this work are now to be used in follow-up work to implement more complex models in the control loop, so that the process results can be optimized with AI support.

Deep-Learning in a reduced system of the visual cortex for object localization

Natasha Szejer

Wed, 22. 9. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Object localization is an important task on its own, but it also improves the performance of computer vision methods, with applications in e.g. robotics or object detection. Different approaches to this task exist. Inspired by neuroscience and the human visual cortex, the System-level of Attention (SLA) model has been presented by Beuth and Hamker. This model uses an attention mechanism originating in the Frontal Eye Field (FEF) and Prefrontal Cortex (PFC) to control a Higher Visual Area (HVA). Because the HVA in turn controls the FEF, the process is recurrent, and it has been shown to be promising for object localization. On the other hand, recently developed deep learning methods that can be optimized end-to-end to fulfill a given task have been shown to outperform previous approaches. In this thesis, a deep learning architecture is proposed that has a similar structure to the model developed by Beuth and Hamker and is dimensioned similarly for a fair comparison. Additionally, the state-of-the-art You Only Look Once (YOLO) deep learning method for object localization is adapted to fulfill the same task as a performance baseline. A systematic evaluation shows that significantly better performance can be achieved with the deep learning-based methods, with the SLA-based deep learning method showing greater robustness than the YOLO-based method.

How the basal ganglia could contribute to optimal planning of the attentional template: A neurocomputational study

Erik Syniawa

Wed, 22. 9. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

When we search for a particular object in our environment, the search is guided by an internal representation of that object's features. This internal representation is called the attentional template (Duncan & Humphreys, 1989; Carlisle, Arita, Pardo, & Woodman, 2011; Geng & Witkowski, 2019). It is assumed that the attentional template uses neuronal gain modulation to increase the activity of those neurons that encode relevant features of the object (Maunsell & Treue, 2006; Reynolds & Heeger, 2009; Treue & Martinez Trujillo, 1999). These can be, for example, hues, orientations or shapes. This attentional modulation increases the signal-to-noise ratio (SNR) between relevant stimuli (targets) and irrelevant stimuli (distractors) and thus enables an efficient selection of an object (Peltier & Becker, 2016). However, if a target is presented in a context where irrelevant stimuli have similar features, this efficient selection can be disrupted, because the attentional modulation also enhances the features of the irrelevant stimuli (Navalpakkam & Itti, 2007). In such cases it can be advantageous to let the selection be guided by the neurons that encode features which differ more strongly from the distractors. This would be an optimal attentional modulation (optimal tuning), which increases the SNR between targets and distractors again (ibid.). Maith, Schwarz and Hamker (2021) found that this optimal attentional modulation can develop during a visual search task on the basis of dopaminergic reward. In this work, we extend this approach to the cueing paradigm of Kerzel (2020). In this paradigm, a target color has to be selected among three distractor colors that differ from the target color in their relation to it (e.g. more reddish). The positions of these stimuli are cued by color beforehand, and, as in Kerzel (2020), we observed that cue colors closer to the distractor colors attracted less attention than cue colors farther from the distractor colors. This resulted in cueing effects for colors that were not centered on the target color but deviated asymmetrically from it.

Optimization of the Visualization of ADAS Driver Information Displays using Requirements Elicitation and Meta-Heuristics for Parameter Tuning

Theresa Werner

Wed, 15. 9. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Making the decisions taken by Advanced Driver Assistance Systems (ADAS) such as Adaptive Cruise Control (ACC) transparent to the driver through visualization is a key element of making the driver trust the system. In the case of a Driver Information Display (DID) currently in development by Intenta, this visualization is taken to the next level: pictographic displays, which show the driver only the gist of the events, are replaced by a display that shows reality down to the vehicle type and distance in two dimensions. But in order not to lose the trust of the driver, such a high-level display must conform to more expectations regarding visualization than its predecessors. This thesis proposes using requirements as a means to build a bridge between what human perception would describe as 'good visuals' (correctness of location and size of objects, smoothness, etc.) and the numeric evaluation of an automated optimization. Requirements are the basis for objective error functions that evaluate the quality of the filter that decides which traffic participants are to be displayed on the DID in each time frame. It could be shown that requirements can indeed be used for this purpose and hence open a way to optimizing the visualization of next-generation ADAS displays. The optimization method used in this thesis is Differential Evolution (DE), a population-based meta-heuristic that can easily handle over 30 dimensions and varying parameter types. This thesis proposes a new selection mechanism for DE to decide which individuals survive into the next generation. It is a hybrid between the classic selection for single-objective optimization problems and the Pareto dominance approach for multi-objective optimization. The hybrid shows potential for surpassing both its parents, but it needs further analysis regarding its success rate.

Implementation and Evaluation of CNNs for Rail Track Detection in Camera and LiDAR Modalities

Vishal Vasant Mhasawade

Wed, 15. 9. 2021, https://teams.microsoft.com/l/meetup-join/19%3ameeting_MGMzZDg2MWUtNzhhNS00YzA4LWE2OTItNjZlYzJjNTI4ZDk5%40thread.v2/0?context=%7b%22Tid%22%3a%22a1a72d9c-49e6-4f6d-9af6-5aafa1183bfd%22%2c%22Oid%22%3a%220b93fe6e-6fee-40e5-9f88-a722b38c1945%22%7d

Railways have been the backbone of mass transportation for people and goods for over a century. Over time, railway operations have been extensively automated. Safety has always been one of the most important aspects of rail transit, and with the rapid developments in the automation of rail transit, it is gaining more importance than ever. The greatest number of accidents are reported from artificial train operation and/or collisions with obstacles in the rail area [KP15]. An active perception system in trains is therefore the most obvious solution to this problem. Deutsche Bahn Netz AG is taking a pioneering role throughout Europe in digitizing and automating rail operation and plans to develop an active perception system for trains. One of the basic functions of such a system is rail track detection. Detecting rail tracks not only provides the precise driving region but also helps to delimit the region of interest for obstacle detection. This thesis conceptualizes and develops CNN-based rail track detection algorithms. By taking advantage of the sensor array used in the active perception system, this thesis achieves track detection in the camera and LiDAR modalities. Additionally, this work puts forth two evaluation metrics that enable the parametric evaluation of the track detection functionality. The performance of track detection was evaluated against the functional and performance requirements set by DB Netz, and the algorithms developed in this thesis were found to fulfill the majority of the requirements.

Deep Neural Networks for Burial Mound Detection in Digital Terrain Models

Michael Linke

Wed, 18. 8. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

In recent years, great progress has been made in developing automated methods for detecting archaeologically relevant objects in digital terrain models. However, the methods developed are of limited practical use, since they mostly rely on high-resolution data, and the existing studies are hard to compare with one another because a common data basis is missing. This work attempts to segment and detect burial mounds in DGM2 data using deep learning. To this end, 21 filters are compared according to how effectively they support the learning process, different loss functions are tried out, and a training and test set is proposed that could serve as a first step towards a freely available basis for comparison.

Supervising the results of autonomous machine learning systems with anomaly detection techniques

Abu Bakar Hameed

Wed, 28. 7. 2021, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Anomaly detection is a method of detecting unusual patterns in data. It is one of the emerging areas in the machine learning field, with applications in almost every domain. Recently, the concept of automated machine learning has become popular, and many data scientists are benefiting from this methodology. Despite the automation, there is still a need to verify the produced results. Due to the large number of Key Performance Indicators (KPIs), it is too time-consuming for data analysts to check all the results. To assist the data analysts, research was performed at Paraboost GmbH. The main idea of the study is to develop an automated solution to the problem. The study is divided into three phases: algorithm selection, prototype development and product development. In the first phase, different algorithms are compared on a public dataset, and a suitable algorithm is selected for prototype development. In the second phase, a prototype is built on machine learning outputs using the selected algorithm. While monitoring the results, it was found that the majority of the anomalies are related to the customer data. This finding became the motivation for developing a product that could be useful for different businesses to monitor their data. In the third phase, anomaly detection was performed directly on the customer data, and various new modules were added. This work has resulted in the creation of the SaaS product Kpi.doctor, which is currently being rolled out and reviewed by clients. The product will be released by the end of Q4 2021.

A system levels model of motor learning

Javier Baladron

Mon, 12. 7. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

The neural basis of motor control is still far from clear. Although the motor cortex, the basal ganglia and the cerebellum have all been shown to be involved in the execution and learning of complex motor behaviors, the specific role of each and their interactions are still under discussion. Here I will present a model of motor learning in the context of our previous hierarchical model of the multiple cortico-basal ganglia loops. In this approach, the motor cortex learns a set of motor primitives through Hebbian learning after the execution of random movements. At the same time, it learns a map from each primitive to a set of parameters for the central pattern generator controlling each limb. The motor basal ganglia loop learns to select a motor primitive according to a desired direction of movement. The cerebellum learns to fine-tune the parameters associated with each motor primitive according to the current task. I will show how the use of the task-independent information initially gathered by the motor cortex can accelerate learning and increase performance in a reaching task.

A neuro-computational model of visual attention with multiple attentional control sets

Shabnam Novin

Mon, 21. 6. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

In numerous activities, humans need to attend to multiple sources of visual information at the same time. Although several recent studies support the evidence of this ability, the mechanism of multi-item attentional processing is still a matter of debate and has not been investigated much by previous computational models. Here, we present a neuro-computational model aiming to address specifically the question of how subjects attend to two items that deviate defined by feature and location. We simulate the experiment of Adamo, Pun and Ferber (Cognitive Neuroscience, 1 (2010) 102-110) which required subjects to use two different attentional control sets, each a combination of color and location. The structure of our model is composed of two components: 'attention' and 'decision-making'. The important aspect of our model is its dynamic equations that allow us to simulate the time course of processes at a neural level that occur during different stages until a decision is made. We analyze in detail the conditions under which our model matches the behavioral and EEG data from human subjects. Consistent with experimental findings, our model supports the hypothesis of attending to two control settings concurrently. In particular, our model proposes that initially, feature-based attention operates in parallel across the scene, and only in ongoing processing, a selection by the location takes place.

Introducing continuous fields, 3D representations for 3D deep learning

Aida Farahani

Mon, 14. 6. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

3D data models are used in many different areas such as entertainment, advertisement, and engineering, which increases the demand for optimized, memory-friendly and computationally efficient algorithms on 3D data. With the help of deep networks such as autoencoders, some great work in 3D shape classification, data compression, reconstruction and, most importantly, data generation has been presented in the past few years. However, preparing proper 3D data for neural networks is still an open research question, and new methods are suggested regularly. Voxels, point clouds, and meshes are the three main discrete representations of 3D shapes that have been used in deep networks so far. Despite the achievements and effectiveness of these approaches, not all data types are NN-friendly, and bringing them into the deep learning domain is still challenging. 'Continuous Implicit Fields' or 'Signed Distance Functions' are an implicit representation in 3D space that has recently been combined with deep networks under the name 'DeepSDF'; this approach has attracted enormous interest and shows great potential in the deep learning context. In this talk, I will present an overview of some disadvantages of neural mesh processing and then investigate the benefits that DeepSDF could bring to the topic.

Performance of biologically grounded models of the early visual system on standard object recognition tasks

René Larisch

Mon, 31. 5. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Although a wealth of models of the visual system exists in computational neuroscience with the aim of replicating experimental data, recent improvements in object recognition (one of the main tasks of the visual system) have come from machine learning models, which are optimized to achieve high accuracies on a specific dataset via the backpropagation algorithm. However, neuro-computational models of vision must also be convincing in object recognition and must be evaluated on such datasets. In recent years, only a few models with biologically grounded plasticity rules have performed object recognition, and mainly on the simple MNIST set. To establish object recognition as a standard evaluation task, I will present the accuracy values of two computational neuroscience models of the visual cortex on different datasets. Both models use Hebbian learning rules and were trained in an unsupervised fashion on natural scene data to guarantee the emergence of receptive fields comparable to their biological counterparts. We assume that the emerged receptive fields cover a wide range of different features, implement a general codebook, and are suitable for different sceneries. The chosen datasets range from the simple MNIST to the more complex CIFAR-10 and ETH-80, and we report the accuracy values for the excitatory and for the inhibitory population in every layer of both networks. Our observed accuracy values on the MNIST dataset are similar to previously reported ones. We also observed a slight decrease in the performance of the deeper layers. Nevertheless, our results provide a broad basis of performance values for comparing methodologically similar models.

Contrasting attentional processing in visual search, object recognition, and complex tasks

Frederik Beuth

Mon, 17. 5. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Visual attention is known to be involved in task control, yet there is a lack of works comparing attentional processing in different tasks. We found, by means of a neuro-computational modeling study, that visual attention operates precisely diametrically in object localization known as visual search (OL), and in object recognition (OR). Furthermore, we predict how more complex tasks such as object substitution masking (OSM), which is composed of visual search and recognition, are realized, as we found an interplay takes place between both mechanisms according to the phases of the involved processes. By this cognitive control via attention, the brain might be able to realize many diverse tasks, and might adapt well to new tasks.

Deep Reinforcement Learning for waypoint following and obstacle avoidance

Jayrajsinh Parmar

Wed, 12. 5. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Autonomous driving can help reduce crashes and increase traffic flow. Many automotive and tech companies are racing to develop an SAE Level 5 autonomous car. Autonomous cars operate with a complex pipeline that includes sensing the environment, localization, planning and control. This pipeline can be avoided with end-to-end control using deep reinforcement learning (DRL). DRL has shown great success in applications such as playing the game of Go [David Silver et al., ] and DOTA-2 [Raiman et al., 2019]. It can also perform challenging tasks such as robotic manipulation, navigation and control [Meyer et al., 2020] [Levine et al., 2015]. In this work, we use DRL for path following and obstacle avoidance in autonomous driving. An OpenAI Gym-based simulation framework is created for training and testing path following and obstacle avoidance. The state-of-the-art proximal policy optimization DRL algorithm is used for training. The DRL agent takes sensor information and the position and orientation of the car relative to the path as input and outputs the steering angle based on the given observation. The agent is trained with different reward functions and analysed. These reward functions combine different objectives such as cross-track error, course error and obstacle avoidance. For the path following task, multiplying the course error and cross-track error terms gives good results, with an average cross-track error of 2.3 m. For the combined path following and obstacle avoidance task, adding the path following and obstacle avoidance reward functions gives the better result, with an average cross-track error of 4.0 m and a success rate of 49%.
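
The two reward combinations named in the abstract (a product of the path-following terms, and a sum of path-following and obstacle-avoidance rewards) can be illustrated with a minimal sketch. The abstract does not give the actual reward terms, so the exponential and cosine shaping, the function names and the constants `cte_scale_m` and `safe_distance_m` are all assumptions made purely for illustration.

```python
import math


def path_following_reward(cross_track_error_m, course_error_rad, cte_scale_m=2.0):
    """Hypothetical multiplicative path-following reward in [0, 1]."""
    # Assumed shaping: each term equals 1 for zero error and decays with the error.
    cte_term = math.exp(-abs(cross_track_error_m) / cte_scale_m)
    course_term = max(0.0, math.cos(course_error_rad))
    # Product: the reward is high only if BOTH errors are small.
    return cte_term * course_term


def obstacle_avoidance_reward(min_obstacle_distance_m, safe_distance_m=5.0):
    """Hypothetical penalty in [-1, 0] that grows as an obstacle gets closer."""
    if min_obstacle_distance_m >= safe_distance_m:
        return 0.0
    return -(safe_distance_m - min_obstacle_distance_m) / safe_distance_m


def combined_reward(cross_track_error_m, course_error_rad, min_obstacle_distance_m):
    """Additive combination of the two objectives, as described in the abstract."""
    return (path_following_reward(cross_track_error_m, course_error_rad)
            + obstacle_avoidance_reward(min_obstacle_distance_m))
```

With this shaping, a car that is on the path, aligned with it and far from any obstacle receives the maximum reward of 1.0, while a car touching an obstacle is penalized by 1.0 regardless of how well it follows the path.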

Connecting a neural simulator with the iCub robot

Torsten Fietzek

Mon, 10. 5. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

In neurorobotics, middleware exists to handle the varying robot hardware, and neural simulators exist to set up a cognitive architecture easily and efficiently. However, the communication between these tools is often implemented individually and requires knowledge of both sides. To facilitate and unify this communication, we present an interface realizing the communication between the iCub robot with the YARP middleware and the neural simulator ANNarchy. For efficient computation, the interface is implemented in C++ and has a Python user interface. This enables its use with ANNarchy, since ANNarchy also has a Python interface. Further, it wraps the YARP code and allows robot data preprocessing. These data can be routed directly to a respective ANNarchy input population. In this way, any cognitive architecture implemented in ANNarchy can now be connected to the iCub robot with minimal effort. Furthermore, this unified framework allows the connection of separately developed architectures without handling different hardcoded communication and preprocessing. In addition, the interface implements a simple synchronization mechanism, allowing closed-loop control without the need for real-time conditions.

ANNarchy 4.7 - What determines the parallel performance of neural simulations?

Helge Ülo Dinkelbach

Mon, 3. 5. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

The size and complexity of the neural networks investigated in computational neuroscience are increasing, leading to a need for efficient neural simulation tools to support their development. Many neural simulators are available for the development of spiking models, although their performance can differ considerably, as shown in several performance studies (e.g. Vitay et al. 2015, Dinkelbach et al. 2019, Stimberg et al. 2019). In previous talks I analyzed prototype implementations to demonstrate potential bottlenecks for the parallel execution of spiking neural networks. In the present talk I want to give insight into ANNarchy 4.7, where larger changes in the code generation should allow faster computation and offer more fine-grained control over the parallel execution. However, some bottlenecks still remain, which I want to demonstrate on typical examples.

Investigation and Development of Methods for Visual Misuse Detection for Vehicles in Level 2 Hands-Free Automated Driving Mode

Sharath Panduraj Baliga

Wed, 28. 4. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Automated vehicles use a Driver Monitoring System (DMS) to detect and warn a distracted driver. But a DMS can easily be misused by mounting handheld devices in the driver windshield area, where they mask the sight of the road. Therefore, detecting the misuse is crucial to ensure a safe traffic system. This thesis addresses this problem by introducing a misuse detection algorithm. This algorithm analyzes the driver's eye gaze behavior to classify the driver's visual intentions as misuse or normal driving by using Support Vector Machines (SVM). This work presents two misuse detection algorithms, namely Misuse Detection by Gaze Density (MDGDensity) and Misuse Detection by Gaze Dynamics (MDGDynamics). The algorithms use the video streamed by the driver monitoring infra-red camera, which was placed on the steering column facing the driver. As the name suggests, the MDGDensity algorithm uses density information of eye gaze intersection points on the driver windshield area, whereas MDGDynamics uses the dynamics of the gaze intersection points to classify the driver's gaze behavior. The performance of the algorithms shows that the misuses were well detected in different scenarios with few false-negative cases.

A computational model of the basal ganglia learns to preferentially explore past rewarded actions through synaptic plasticity in the STN-GPe projection

Oliver Maith

Mon, 26. 4. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Biasing the selection of exploratory actions according to prior experience is a useful strategy in large, non-stationary environments. In a previous study by Baladron et al. (2019), it was shown that projections between the subthalamic nucleus (STN) and the external globus pallidus (GPe), two nuclei of the basal ganglia, can bias exploration toward specific actions in a computational model of the basal ganglia. It has been suggested that the STN-GPe connectivity pattern may contain information about previously rewarded actions. I will show that plasticity in the STN-GPe projection allows the basal ganglia model to learn this information during a stimulus-response task with constantly changing rewarded actions. I will present a learning rule based on simple long-term potentiation and long-term depression mechanisms that are similar to homeostatic plasticity. During the stimulus-response task, the model initially randomly explores all possible actions, but after changing the rewarded action several times, it preferentially explores the previously rewarded actions. I will further show that we observed a very similar change in exploratory behavior in humans during the same stimulus-response task.

Development of a Temporal Consistency Loss Function to Improve the Performance of Deep Neural Networks

Sharath Gujamadi

Wed, 21. 4. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Deep neural networks for highly automated driving are trained on a large and diverse dataset and evaluated on a per-frame basis, still often times leading to temporally unstable results. For real-time applications, however, there is a need to also incorporate temporal characteristics based on video frames into the learning process to ensure stable predictions. In this thesis, we explore such a characteristic of the predictions of semantic segmentation networks. We propose a novel unsupervised temporal consistency (TC) loss function to be used in an additional fine-tuning step, penalizing unstable semantic segmentation predictions. We introduce a two-shot training strategy to jointly optimize for both, accuracy of semantic segmentation predictions, and its temporal consistency based on video sequences. We demonstrate that our training strategy helps in improving the temporal consistency of two state-of-the-art semantic segmentation networks on two different road-scenes datasets. We show an absolute 4.25% improvement in the mean temporal consistency (mTC) of the HRNetV2 network and an absolute 2.78% improvement on the DeepLabv3+ network, both evaluated on the Cityscapes dataset, with only a slight decrease in accuracy. When evaluating the same video sequences using a synthetic dataset Sim KI-A, we show improvements in both, accuracy (2.73% mIoU) and temporal consistency (0.73% mTC) for the DeepLabv3+ network. We confirm similar improvements for the HRNetV2 network.

Comparison of Three Attention-Based Artificial Networks with Regard to Similarities and Differences

Bastian Schindler

Mon, 19. 4. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

This report compares the models of three studies. All three models address object localization and are inspired by the brain. Particular emphasis is placed on the attention mechanisms, since these arise from this inspiration. The report examines the similarities and differences of the models at both the structural and the functional level.

A large-scale neurocomputational model of spatial cognition: An update

Micha Burkhardt

Mon, 19. 4. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

Spatial cognition allows a person to process behaviourally relevant features in complex environments and to update this information during processes of eye and body movement. In this presentation, I will present some updates on the spatial cognition project. In an integrated framework, we use neurocomputational models of object recognition (Beuth, 2019), perisaccadic space perception (Bergelt and Hamker, 2019) and spatial memory and imagery (Bicanski and Burgess, 2018) to explore key aspects of spatial cognition in a virtual environment. After a brief introduction to the topic, I will present the progress since my bachelor's thesis and outline the current challenges.

Solving simulated cable insertion tasks with Deep Reinforcement Learning

Markus Erik Engel

Fri, 16. 4. 2021, https://us02web.zoom.us/j/82779214499?pwd=ZWhSRitPdHJzNC9HVkpoWHRWMlJhUT09

The autonomous handling of linear deformable objects in industrial settings is challenging due to the flexible material properties. Solving cable insertion and many other complex tasks exceeds the capabilities of conventional robotic techniques. Deep Reinforcement Learning (DRL) has successfully demonstrated its ability to manipulate linear deformable objects through autonomous interactions with the environment. The success of DRL methods heavily depends on the state information, the reward function and the number of samples, which are often adapted to the respective task. These challenges have so far limited the deployment of DRL methods in real-world scenarios. For a broader application, DRL methods need to be able to learn from naturally obtained information combined with an adequate number of interactions. In this thesis, the model-free DRL algorithms DDPG and SAC are applied to cable insertion tasks, considering the challenges of visual-based and sample-efficient learning. An environment is created that combines simulation software for modeling the cable behavior with DRL methods. Two distinct approaches are considered, both of which include grasping within their actions. These approaches either use a position controller or determine the positional changes continuously. It is shown that cable insertion tasks can be solved by combining SAC, which determines the grasp position, with a position controller. On the other hand, DRL agents with continuous control show non-optimal behaviors and are found to be sensitive to visual-based learning.

Analysis of Internet Traffic - Intrusion Detection Systems

Ralf Brosch

Wed, 3. 2. 2021, https://us02web.zoom.us/j/88027471757

With the continuously increasing usage of the internet and technology, there is nowadays a greater need to prevent malicious access or even attacks from unknown sources. To detect harmful attacks against computer systems or networks, Intrusion Detection Systems (IDSs) help to identify deviating activities. In the past, classical machine learning algorithms were used in IDSs, but recent research has shown that deep learning methods can be applied for better results. The most established data set for intrusion detection research is the NSL-KDD data set, which contains about 150,000 records in total. It turns out that the chosen method of feature reduction plays a major role in the accuracy of the proposed deep learning models. Interesting feature reduction methods include applying a genetic algorithm (GA) or a Convolutional Neural Network (CNN) before classifying the data with a long short-term memory (LSTM) architecture. The deep learning algorithms covered in this seminar are based on LSTM architectures because intrusions can be considered as time-series data. An invasion is not a one-time event, but rather a compounding effect of anomalies in network behavior. As a prospect for further research in IDSs, an idea using a Variational Autoencoder (VAE) as a feature reduction method is suggested at the end of the presentation.

Geometric Deep Learning: Part II - Graphs, Manifolds, and Spatial-based Approaches

Payam Atoofi

Wed, 13. 1. 2021, https://us02web.zoom.us/j/88027471757

In the previous presentation, the principle of spectral-based convolution on non-Euclidean structured data was discussed. Although a brief introduction to spatial-based approaches was provided, they are worth discussing in more detail, as they have been shown to be less computationally expensive than their spectral counterparts. Moreover, the intuition and simplicity that spatial approaches offer have helped these methods gain prominence in recent years. However, regardless of the method, there are still challenges in the field of Geometric Deep Learning, e.g. training Graph Neural Networks (GNNs) with Stochastic Gradient Descent (SGD) using mini-batches. Such issues, which can for instance be observed in node-wise classification tasks, call for a solution when sampling nodes or subgraphs is required. Although a generalized framework for non-Euclidean domains is, as of these presentations, still out of reach, the analogy and even the problem setting can be brought closer together for graphs and manifolds. Spatial-based methods on manifolds follow the same principle of using a diffusion operator to aggregate local information, as is the case for graphs. In this second installment of presentations on GDL, some of the most renowned examples and successes of these models on real-world problems, such as antibiotic discovery, predicting a molecule's aroma, and fake news detection, will be discussed, hopefully as an incentive for further research in this field.

Exploration and Assessment of Methods for Anomaly Detection in Time Series Signals

Ivan Zaytsev

Tue, 5. 1. 2021, https://us02web.zoom.us/j/84988730469

This thesis is devoted to the problem of anomaly detection in time series data. Important applications of time series anomaly detection include healthcare, fraud detection, and system failure detection. This work examines sensor signals used in the automotive industry. Three common types of signals were selected for the study because they have different properties and characteristics. In addition to the signals themselves, various representations of the signals were also investigated. Detecting anomalies in time series has become a pressing problem as car manufacturers try to improve cars for the convenience of drivers: the amount of data received from sensors is increasing and, consequently, there is a strong need for automatic systems that check signals for anomalies. The purpose of this work is to explore existing methods for detecting anomalies in time series and to assess their performance. Unsupervised machine learning algorithms such as LOF, COF, CBLOF, KNN, HBOS, and OCSVM were tested and assessed. The algorithms that showed the best results were implemented in a desktop application for use in offline mode. For online mode, methods such as MA (Moving Average) and LSTM, a type of artificial recurrent neural network, were tested. As a result of the research, a desktop application was developed that detects anomalies in time series both in real time and in offline mode. This work also offers methods for checking the properties of signals. These methods are able to detect an anomalous signal even in cases where the other algorithms cannot. However, using these methods requires minimal knowledge of the signal. Almost all of the proposed methods can be applied in both application modes.
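
As an illustration of how a moving-average detector can work in an online setting, the following sketch flags a sample when it deviates from the mean of the preceding window by more than `k` standard deviations. The window size, the threshold `k`, the function name, and the example signal are all assumptions made for this sketch; the thesis's actual detector and parameters are not described in the abstract.

```python
from statistics import mean, pstdev
from typing import List


def moving_average_anomalies(signal: List[float],
                             window: int = 5,
                             k: float = 3.0) -> List[int]:
    """Return indices of samples far from the trailing moving average."""
    anomalies = []
    for i in range(window, len(signal)):
        history = signal[i - window:i]   # only past samples, so usable online
        mu = mean(history)
        sigma = pstdev(history)
        # A small floor keeps a perfectly constant history from flagging
        # every tiny fluctuation.
        threshold = k * max(sigma, 1e-6)
        if abs(signal[i] - mu) > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical sensor trace with a single spike at index 7.
signal = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 5.0, 1.0, 1.1]
flagged = moving_average_anomalies(signal)
```

One limitation of this simple variant is that a spike inflates the trailing statistics for the following `window` samples, which can mask further anomalies right after it.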

A neuro-computational perspective on habit formation: from rats to Tourette patients

Javier Baladron

Wed, 16. 12. 2020, https://us02web.zoom.us/j/84719165578

Why do we do what we do? Actions can be directed either by their consequences (goal-directed behavior) or by a stimulus-response relationship (habitual behavior). In this talk I will first introduce you to reward devaluation: an experimental paradigm used to measure the development of habitual behavior. Then I will introduce a neuro-computational model of multiple cortico-basal ganglia-thalamo-cortical loops which we have used to simulate two recent devaluation experiments: one with rats running in a T-maze and one that involves Tourette patients. Our model not only reproduces behavioral and neural data in both cases but also predicts that cortico-thalamic shortcuts, which bypass some of the loops, are critical for the development of habits.

Geometric Deep Learning: Part I - Graph Neural Networks

Payam Atoofi

Wed, 2. 12. 2020, https://us02web.zoom.us/j/88027471757

Much of the success of deep learning methods, particularly classical CNNs, is owed to the availability of the data on which these algorithms are built. However, despite the enormous amount of data provided by social networks, chemistry and biochemistry in pharmaceutical research, 3D computer vision, etc., the methods that were previously successful on domains such as images and audio signals could not be immediately adapted to these new domains. The main reason is the inherent structure of these data, which is non-Euclidean, as opposed to Euclidean structured data with a grid-like underlying structure. The field of Geometric Deep Learning (GDL) is dedicated to addressing issues arising in non-Euclidean structured data. Due to the extensive research and endeavors of the GDL community, this field is no longer an underdeveloped territory. Some of the prominent directions in this field are applying embedding techniques to graphs, defining a mathematically sound convolution operation on graphs inspired by Graph Signal Processing (GSP), defining spatial-based convolution, e.g. the message passing approach, and developing algorithms based on graph isomorphism. The challenges and shortcomings of many of these methods are now better understood. One of the immediate results sparked a debate over whether going deep in Graph Neural Networks is necessarily an advantage. The question of a generalized framework that is easily adapted to different tasks, e.g. node-wise classification, graph classification, etc., has not yet been answered and remains application dependent. In this presentation, these ideas are explored and discussed to see how some of them developed into the state of the art.

Stopping planned actions - a neurocomputational basal ganglia model

Oliver Maith

Wed, 25. 11. 2020, https://us02web.zoom.us/j/88027471757

In this presentation, we will first take a short tour through the field of computational modeling of the basal ganglia. Some interesting modeling approaches in the research fields of reinforcement learning, working memory and Parkinson's Disease will be presented and thereby, we will look at the general structure and some of the most popular assumed functions of the basal ganglia. After this more general part of the presentation, we will present our current work about simulating a stop-signal-task with our basal ganglia model. In such stop-signal-tasks, one must suddenly cancel a planned action due to an occurring stop cue. The recently proposed 'pause-then-cancel' model suggests that the subthalamic nucleus (STN) of the basal ganglia provides a rapid stimulus-unspecific 'pause' signal. This signal is followed by a stop-cue-specific 'cancel' signal from recently defined striatum-projecting neurons of the external globus pallidus (GPe) of the basal ganglia, the so-called Arkypallidal neurons. The purpose of our stop-signal-task simulations is to better understand the underlying neuronal processes of this 'pause-then-cancel' theory and the relative contribution of Arkypallidal and STN neurons to stopping. After an extensive review of the structure and connectivity of the GPe, we have completely revised the GPe of our basal ganglia model to include not only the Arkypallidal neurons but also recently described cortex-projecting GPe neurons. We replicate neuronal and behavioral findings of stop-signal-tasks and demonstrate that besides STN and Arkypallidal neurons also cortex-projecting GPe neurons are required for successful stopping. Further, we predict the effects of lesions on stopping performance. 
Our model simulations provide an explanation for some surprising or non-intuitive findings, such as stronger projections of Arkypallidal neurons on the indirect than on the direct pathway of the basal ganglia and the fact that Arkypallidal neurons become active during both stopping of actions and movement initiation.

Deep Learning for 3D Shapes Represented by Meshes

Anahita Iravanizad

Tue, 17. 11. 2020, https://us02web.zoom.us/j/88027471757

After the groundbreaking success of CNNs on images, where models could outperform humans on a variety of tasks, researchers have tried to adapt convolution to different domains, e.g. graphs, meshes, etc. These data, however, pose challenges to the classical CNN approach since they do not have a grid-like underlying structure. To tackle this issue, a range of methods has been developed, from converting the data representation (or reducing dimensionality) to a domain where classical CNNs can be used, to defining convolution operations specifically for such domains, where the filters are able to capture the information of non-Euclidean structured data. Non-Euclidean structured data such as mesh representations of 3D objects, whether scanned or generated, are used in many models for tasks ranging from classification to semantic segmentation. MeshCNN, a network with convolution and pooling operations on the edges of meshes, has shown promising results in both classification and semantic segmentation tasks. MeshNet, on the other hand, has been proposed to classify meshes using the meshes' faces as the unit of input. We propose a MeshNet-based architecture for a semantic segmentation task. To direct the model's learning beyond point-wise segmentation, a weighted loss function is introduced that places more emphasis on faces with larger areas, resulting in an improved area IoU. The effect of zero vs. random replication padding has also been investigated. The model was tested on the COSEG dataset, which contains samples of chairs, vases, and aliens. The results show that the model performs segmentation on par with MeshCNN on two out of three separate categories. Our proposed architecture also performs almost equally well with all three categories combined. This model outperformed PointNet on the COSEG dataset.

Introducing infrared target simulation based on cGAN derived models

Chi Chen

Wed, 11. 11. 2020, https://us02web.zoom.us/j/88027471757

After Ian Goodfellow introduced the GAN (Generative Adversarial Network) architecture, Yann LeCun called it the coolest thing since sliced bread. Interest in GANs has since grown explosively across all domains. The most important application of GANs is to generate natural-looking images in an unsupervised manner. This can address the problem that not enough images are available for supervised learning. GANs can also provide powerful frameworks for unsupervised learning, as many computer scientists and AI engineers have already explored. This seminar will start with the definition of GANs and how they generally work. The main part covers one particular type of GAN that involves the conditional generation of outputs, called conditional GAN (cGAN). This part will derive further possible models that help to generate more images in diverse classes, such as infrared target simulation based on cGAN.

Deep Hebbian Neural Networks, a less artificial approach

Michael Teichmann

Wed, 4. 11. 2020, https://us02web.zoom.us/j/88027471757

In recent years, deep neural networks using supervised learning have gained a lot of interest for modeling brain function. However, their network connectivity and learning have been questioned for their biological plausibility. Also, the increasing number of discovered shortcomings compared to the brain has raised the call for a less artificial intelligence. We present how three core principles from computational neuroscience can be combined into a system of the first visual cortical areas. These are: 1) Hebbian and anti-Hebbian plasticity with homeostatic regulation, to learn the synaptic strengths of excitatory and inhibitory neurons, causing independent neuronal responses, a key aspect of neuronal coding. 2) Intrinsic plasticity to control the operating point of the neurons, enabling all neurons to participate equally in the encoding and causing an informative code in deeper network layers. 3) Experience-dependent structural plasticity to modify connections during network training, making it possible to observe the anatomical footprint of learning and to overcome biased initial definitions. We implemented the core circuit of the pathway from LGN to V2, which consists of nine neuronal layers, implementing excitatory and inhibitory neurons and their recurrent connectivity. We demonstrate three important exemplary aspects, highlighting the value of this model class. 1) The general ability to perform invariant object recognition on MNIST and COIL-100, with results competitive with other unsupervised approaches. 2) The model develops realistic V2 receptive fields, from which we can derive predictions for differences in the sensitivity of the layer and neuron type to naturalistic textures, extending experimental observations. 3) The distribution of synaptic weights with respect to the response correlations of the connected neurons. We extend the common view of inhibitory connectivity as unspecific by showing its specific connection structure, which is difficult to observe experimentally.
We link our findings to the specific role of inhibitory plasticity. Finally, we give an outlook on the next challenges and achievements, highlighting deep biological neural networks as a promising research field to overcome limitations of state-of-the-art deep neural networks and to allow detailed insights into brain function.

Towards general and robust industrial bin-picking: A Deep Reinforcement Learning based approach using simulation to reality transfer

Carl Gäbert

Fri, 2. 10. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

The ability to reliably grasp and lift arbitrary objects is a fundamental skill required for various industrial use cases. To this end, learning-based approaches can generate grasp poses from single sensor observations without requiring explicit object pose information. This thesis aims at learning the skill of grasping arbitrary items from cluttered heaps in simulation. For this, a multi-stage approach is presented in which the graspable areas of the heap are extracted in an initial filtering stage. Next, a hierarchical learning approach is used to select from these candidates and define the final grasp pose using two different policies. By including information about the gripper dimension in the observation, it was possible to train the policies on a distribution of parallel grippers. The models could then be used in a real scenario with unseen objects and a new gripper configuration.

Improving Robustness of Object Detection Models using Compression

Sahar Kakavand

Thu, 24. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Deep neural networks are often not robust to semantically irrelevant changes in the input. In this work, we address the robustness of state-of-the-art deep convolutional neural networks (CNNs) against commonly occurring input distortions such as photometric changes or noise. These changes are often accounted for during training in the form of data augmentation. We make three major contributions. First, we propose a new pruning method called augmented-based filter pruning, which can be used to prune the filters that react dramatically to small changes in the input. Second, we propose robustness training that consists of a new regularization loss called feature-map augmentation (FMA) loss, which can be used during finetuning to make a model robust to several input distortions. Finally, we propose combining pruning and robustness training, which results in a single model that is robust to the augmentation types used during pruning and finetuning. We use our strategy to improve the robustness of an existing state-of-the-art object detection network. In our experiments, we trained the Faster R-CNN object detection network on the KITTI Vision Benchmark Suite dataset. Afterwards, we applied robustness training to the augmented baseline for different augmentation types. For the combination of pruning and robustness training, we pruned 200 filters in each pruning step and finetuned the updated model using the FMA loss and the original loss. In total, we pruned 40% of the filters of our baseline. The experiments show that the accuracy of the network improved on both the augmented and the clean test set.

Voice authentication and Voice-to-Text tool based on Deep Learning in the area of plant maintenance - Industry 4.0

Anh-Duc Dang

Tue, 22. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

As part of the development of future Industry 4.0 solutions at Robert Bosch GmbH, we investigate several speaker recognition and speech recognition models to be used in a plant maintenance use case. Our speech application should first authenticate a user through a voice sample. If the authentication is successful, speech commands from the user are transcribed into text which allows the user to conduct plant maintenance operations with the assistance of a chatbot. Speaker and speech recognition systems are tested on speech samples with background noise since the speech pipeline is planned to be implemented in a production plant. We evaluate two speaker recognition and three speech recognition models on an internal noisy dataset consisting of 12,000 audio files. On the speaker recognition task, Microsoft Azure's Speaker Recognition API achieves an accuracy of 99.7% while an implementation based on the ResNet-34 architecture achieves an accuracy of 93.9%. The results show that speaker recognition models can handle background noise fairly well. On the speech recognition task, we test the models without prior training. Microsoft Azure's Speech-to-Text achieves a 6.9% WER while Mozilla DeepSpeech and the Python SpeechRecognition library (with a PocketSphinx model) considerably struggle with background noise, achieving a 52.5% and 67.1% WER respectively. We compare the speech models on several other metrics like latency, costs, or data security to give an overview of speech models that could be deployed in a production plant.
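
For readers unfamiliar with the metric, word error rate (WER) is commonly computed as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below implements that standard definition. The function name and the example sentences are hypothetical, and the evaluation tooling actually used in this work is not described in the abstract.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference prefix processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(prev[j] + 1,         # deletion of a reference word
                          curr[j - 1] + 1,     # insertion of an extra word
                          prev[j - 1] + cost)  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical maintenance command and a noisy transcription of it:
# one deleted word ("the") and one substituted word ("two" -> "too").
wer = word_error_rate("open the valve on pump two", "open valve on pump too")
```

Because insertions are counted as errors, WER can exceed 100% when the hypothesis is much longer than the reference.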

Recognition of structural road surface conditions to support Highly Automated Driving

Varsha Shiveshwar

Thu, 10. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Road surface detection is important for both the safety of the vehicle and passenger comfort. In this master thesis, traditional computer vision feature extraction techniques and CNN techniques are used to classify different road surfaces from fixed-size image patches with ten classes. The road features are extracted using texture-based feature extractors, shape-based feature extractors, and a combination of both. The four computer vision models used are GLCM, LBP, HoG, and HoG with LBP. The same dataset is also used to train and test a shallow CNN model and a CNN model in which the image is pre-processed with a Gabor filter. Ten-fold cross-validation is used to evaluate the individual models. The behaviour of the individual models on shuffled and unshuffled road sequences is noted. It is observed that although the CNN performs better in terms of accuracy, its inference time is longer than that of the traditional computer vision methods used. Tests with a reduced number of classes achieved better accuracy than the tests with ten classes.

Estimating a Robot's Self-Rotation Based on Gyro Sensor and Camera Data

Julius Riegger

Tue, 8. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/lor-hy4-uug

This bachelor thesis deals with robot orientation, specifically the sub-area of rotation. It investigates to what extent visual information (modeled on the human field of view via a front camera) is suitable for detecting a rotation. This raises the question of whether this method can serve as a feedback signal for an orientation model. In addition, tracking methods for verifying the computed rotations were tested and evaluated.

Supervised learning in Spiking Neural Networks

Weicheng Zhang

Thu, 3. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In recent years, artificial neural networks (ANNs) have become an increasingly frequent topic in the field of machine learning. However, ANNs are fundamentally different from the brain, as they are not biologically plausible, especially in the way information propagates. This naturally led to a new type of neural network: Spiking Neural Networks (SNNs). The concepts of SNNs are inspired by biological neuronal mechanisms that can efficiently process discrete spatio-temporal events (spikes). Because a spike is non-differentiable, it is difficult to use conventional optimization to train SNNs in a supervised fashion. In my presentation, I will introduce the training approaches of Lee, J. H. et al. (2016), Lee, C. et al. (2019), and Zenke, F. et al. (2017) as possible solutions for supervised learning in SNNs.

Vehicle Brake and Indicator Light Status Recognition Using Spatio-temporal Neural Networks

Giridhar Vadakkumuri Parameswaran

Thu, 6. 8. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Detecting the intent of an ado vehicle is a pivotal task in realising self-driving capabilities. Recognising the status of the brake and turn indicator lights of the ado vehicle helps the ego vehicle to plan its trajectory. Drawing inspiration from the success of deep learning in video classification and action recognition tasks, the thesis investigates end-to-end deep learning techniques to solve the problem of taillight status recognition using visual data. We investigate the suitability of various deep learning models such as CNN, LSTM, ConvLSTM, 3D convolutional networks, spatial attention networks, and their combinations for the task. These models are trained and benchmarked on a public dataset from UC Merced. All of these models work on a sequence of images of the ado vehicle and predict its brake and turn indicator light status. Our best method outperforms the state of the art in terms of accuracy on the RGB-UC Merced Vehicle Rear Signal Dataset, demonstrating the effectiveness of attention models and temporal networks for vehicle taillight recognition. We also compile and present two large datasets, the Bosch Boxy taillight dataset and the IAV taillight dataset, which can be used by other researchers for solving this task.

Deep Learning for 3D Shapes Represented by Meshes

Anahita Iravanizad

Tue, 14. 7. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

In 2012, AlexNet won the ImageNet challenge, which ultimately showed that handcrafted filters could not compete with filters learned by a neural network. This idea helped give rise to deep convolutional neural networks and led to many new networks in the following years, up to ResNet in 2015, which outperformed humans in the ImageNet challenge. Such extraordinary performance by these deep networks raised the question of whether similar techniques could be applied in different domains, domains that have only recently become the center of attention, from social networks to 3D computer vision and even drug design in pharmacy. However, defining a convolution operation on such data requires extra effort, as these data do not lie on a regular grid-like structure, unlike images, audio signals, etc. Most of these data (without regular grid-like structure) can be perceived as graphs. Such an abstract representation has helped to tackle the irregularity of their structure in seemingly different domains. One of these domains is 3D computer vision, in which 3D objects can be represented as meshes (a set of vertices, edges, and faces), which themselves can be interpreted as either graphs or manifolds. In this seminar, we first categorize the different representations of 3D shapes as Euclidean or non-Euclidean, and for each category we introduce a few representations and their respective advantages and disadvantages. Then we briefly explore the deep learning achievements on each of these representations, except for the mesh-based networks, which are the core focus of the seminar and for which we describe in detail the architectures of the two most recent mesh networks.

Learning Shape Based Features for Robust Neural Representation

Ranjitha Subramaniam

Thu, 25. 6. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Convolutional Neural Networks (CNNs) have gained tremendous significance over the years with state-of-the-art results in many computer vision tasks like object recognition, object detection, semantic segmentation, etc. This high performance is commonly attributed to the fact that CNNs learn increasingly complex features in deeper layers, a behavior analogous to how humans perceive objects. Nevertheless, recent studies revealed that there are considerable differences between human visual perception and the perception of objects by CNNs. One substantial distinction is that humans predominantly rely on robust shape features to recognize objects, while CNNs are highly biased towards local texture cues. The perceptual differences between CNNs and humans can be reduced by improving the shape bias of CNNs. Recent work by Geirhos et al. showed that augmenting natural images with various styles from paintings makes their texture cues unpredictable and forces the networks to learn more robust features. A CNN trained on such stylized images exhibits a stronger shape bias than a standard network trained on natural images. Besides the enhanced shape bias, such a network also demonstrates improved robustness against common image corruptions such as noise, blur, etc. The improved shape bias of the network is hypothesized to be the reason for its high corruption robustness. With the objective of improving the shape bias of CNNs, this thesis introduces a technique that employs edge maps with explicit shape details. Moreover, the possible texture bias of the network is reduced by a technique called style randomization, which randomizes the statistics of activation maps in feature space. On evaluation, the proposed network shows a higher shape bias.
However, this shape-biased network performs poorly on image corruptions, and its results are no better than those of a standard texture-biased CNN. Hence, a systematic study is carried out to analyze the different characteristics of an image that could influence corruption robustness. These characteristics include the existence of natural image properties, explicit shape details from edge maps, and stylized texture details. While stylization and certain preserved statistics of natural images play a role in improving corruption robustness, no clear correlation is observed between the shape bias of a CNN and its corruption robustness. This study reveals that the strong data augmentation resulting from the stylization of natural images helped to improve the corruption robustness of stylized networks, while their improvement in shape bias emerged only as a byproduct. A further study is conducted to understand the adaptability of a network pretrained on natural images to data from different distributions. It is observed that the network shows improved performance on the target data when only the affine parameters of its normalization layers are finetuned. This indicates that a network trained on natural images also encodes robust representations, but these representations are not leveraged in its affine layers.

Vision-based Traffic Sign Detection and Classification using Machine Learning

Ahmed Mohamed

Thu, 11. 6. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz

Traffic sign detection and classification is a key topic in modern autonomous driving systems. Machine learning methods, especially convolutional neural networks, have achieved significant improvements in computer vision tasks, including traffic sign classification. Traffic sign detection using CNN-based object detectors requires a substantial amount of computational resources and specialised hardware to achieve real-time performance. This thesis presents a shallow CNN that solves the traffic sign classification problem with results comparable to state-of-the-art algorithms, as well as a CNN-based object detector based on YOLOv3 that achieves real-time performance by removing convolutional layers to reduce computation and by re-organising the network structure. Model compression techniques are also applied to the resulting detection and classification models to reduce their size and computation. The final detection model yielded nearly the same results as the original YOLOv3 model while achieving a significant reduction in size and real-time performance at 30 FPS on a general-purpose CPU.

Less Deep Neural Networks for Object Localization

Natasha Szejer

Thu, 30. 4. 2020, https://webroom.hrz.tu-chemnitz.de/gl/mic-cv7-ptw

Object localization is an important task for improving the performance of computer vision methods, with applications in e.g. robotics or object detection. Different approaches to this task exist. Inspired by neuroscience, a model based on the visual cortex has been presented by Beuth and Hamker (2017). This model uses an attention mechanism originating in the Frontal Eye Field (FEF) and Prefrontal Cortex (PFC) to control a Higher Visual Area (HVA). Because the HVA in turn controls the FEF, the process is recurrent and has been shown to be promising for object localization. On the other hand, recent deep learning architectures have evolved that can be trained end-to-end for a specific task and enable optimal specialization to a given dataset. In this research seminar, we investigate several state-of-the-art deep learning architectures for comparison with Beuth and Hamker (2017). Inspired by the architectures presented, we will propose a concept with a structure similar to the model of Beuth and Hamker (2017) for a fair comparison. The findings of this preliminary investigation will serve as a foundation for future evaluation in the scope of a master thesis.

A neuro-computational model of dopamine dependent interval timing using recurrent neural networks

Sebastian Adams

Mon, 24. 2. 2020, Room 131

The reward prediction error (RPE) is an important signal that tells an organism whether something is better or worse than expected. One brain region in particular is associated with this signal: the ventral tegmental area (VTA). It shows high activity when something is better than expected, normal activity when an expected reward occurs, and is inactive at exactly the point in time at which an expected reward is omitted. The VTA has far-reaching connections and relays this signal, for example, to the prefrontal cortex. That the VTA is inactive when a reward is omitted is due to a temporally precise expectation. A main part of this talk will deal with the underlying processes of temporal coding in the brain. In contrast to the RPE just mentioned, the question of how the brain encodes time is highly controversial. One problem is that there is no process in the brain that can be described without time. Because of this problem, there is a wide field of possible approaches. One interesting and still relatively young approach is based on reservoir computing. It rests on the assumption that time is an intrinsic component of neural networks that can be read out from the activity of many neurons (so-called population clocks). This approach is embedded into the model of Vitay and Hamker (2014) of the afferent connections to the VTA and subsequently discussed.
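
The three RPE cases named in the abstract (better than expected, as expected, omitted reward) can be made concrete with a deliberately simple scalar sketch. It assumes the textbook form of the error, received reward minus expected reward, together with a delta-rule update of the expectation. The function names and the learning rate are hypothetical. This is not the model of Vitay and Hamker (2014) and it ignores the timing aspect that the talk focuses on.

```python
def reward_prediction_error(reward: float, expected: float) -> float:
    """Signed difference between the received and the expected reward."""
    return reward - expected


def update_expectation(expected: float, reward: float,
                       learning_rate: float = 0.1) -> float:
    """Move the expectation a small step in the direction of the error."""
    return expected + learning_rate * reward_prediction_error(reward, expected)


# Unexpected reward: positive error (better than expected).
unexpected = reward_prediction_error(reward=1.0, expected=0.0)
# Fully predicted reward: zero error (as expected).
predicted = reward_prediction_error(reward=1.0, expected=1.0)
# Omitted reward: negative error (worse than expected).
omitted = reward_prediction_error(reward=0.0, expected=1.0)
```

Repeatedly applying `update_expectation` with the same reward drives the expectation towards that reward, so the error shrinks as the reward becomes predictable.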

Development of a GUI for the Neurosimulator ANNarchy

Feng Zhou

Mon, 17. 2. 2020, Room 131

Graphical user interfaces (GUIs) are meant to make software easier to use. In the case of neurosimulators, this mainly concerns the definition of neural networks. In my bachelor thesis, I developed a GUI for the neurosimulator ANNarchy. With its help, users should be able to model neural networks and run simulations. In my talk, I will explain my ideas and steps in developing the GUI and demonstrate the resulting GUI. Afterwards, open issues will be discussed and various ideas for future improvements will be shown.

Dense Descriptor Generation for Robotic Manipulation

Vishal Mhasawade

Mon, 17. 2. 2020, Room 131

Robotic systems, whether mobile robots or manipulators, are on the verge of becoming increasingly commonplace. One of the most important reasons for this is the intelligence being imparted to such systems. Visual input is arguably one of the most sought-after ways to make this intelligence more efficient. This raises the question: what is the best visual representation of the robot's world for manipulation? For a robot to manipulate any object (rigid or non-rigid) in front of it, the object's structure plays an important role. In addition, the visual representation has to be applicable to a wide variety of tasks. This work implements Dense Object Nets as a way of creating a visual representation of the robot's world, called dense descriptors. We aim to learn a visual representation of the object that is (i) task-independent and usable as a basic building block for various manipulation tasks, (ii) generally applicable to rigid and non-rigid objects, and (iii) learned with complete self-supervision. In short, by creating the dense descriptor representation, we try to answer the question: what is the current visual state of the robot's world? The ultimate milestone would be to enable the robot to perceive the world as we humans do, and Dense Object Nets can be seen as a small step towards it.

Design and Modelling of Soft Pneumatic Actuator to Support Human's Forearm Motion

Hirenkumar Gadhiya

Thu, 13. 2. 2020, Room 131

Wearable robots offer applicability and usability beyond many other types of technological interface and cover applications such as personal entertainment, customized health, and fitness. They improve the wearer's ability to interact physically with humans and the external environment, and they increase rehabilitation efficiency. Their stiffness and wearability make wearable robots a promising technology. The first important step in designing any exosuit is modelling it, and modelling soft robots is not an easy task. The aim of this research project is therefore to design and model an exosuit that supports human forearm movement. Online programming of the behavior of a soft textile inflatable actuator is proposed. In addition, forward and inverse kinematics are solved to obtain the desired tip position and base frame, respectively. The structure of the exosuit is also discussed; it consists of one pneumatic actuator to control flexion and extension movements, an ESP-32 development board, and a proportional valve developed for precise control of the actuator. Finally, comparisons are made to show that the data obtained from the theoretical model match the data obtained from simulation.

Auswirkungen von Distraktoren auf überwachte und unüberwachte Lernverfahren

Michael Göthel

Mon, 3. 2. 2020, Room 204

This work investigated what effects a disturbing factor, referred to here as a distractor, has on the learning of two networks: a deep convolutional neural network (DCNN) and an unsupervised learning network. The networks used are LeNet-5, introduced by LeCun et al. (1998), as the DCNN, and the network previously presented by Kolankeh (2018) as the unsupervised network. The classifications of the networks were examined using Layer-Wise Relevance Propagation (LRP), as were the activities of the neurons themselves. Earlier results, for example by Lapuschkin et al. (2019), which showed a strong susceptibility of a DCNN to such a distractor, could be reproduced. However, the same properties were also found for the unsupervised network. Various results, obtained both with and without the help of the classifier, provided indications that the distractor influences the unsupervised network in a similar way.

Federated Pruning of Semantic Segmentation Networks Based on Temporal Stability

Yasin Baysidi

Mon, 27. 1. 2020, Room 204

Deep Convolutional Neural Networks (DCNNs) are widely used in autonomous driving applications for perceiving the environment based on camera inputs, particularly for semantic segmentation and object detection. However, the number of trainable parameters in such networks is high and can be reduced without decreasing the overall performance of the network. On the other hand, the performance of these networks is always assessed with conventional methods, such as mean Intersection over Union (mIoU) or the loss, and not with respect to other requirements such as stability or robustness. Based on that, we propose a novel temporal stability evaluation metric and study the impact of removing parts of the trained network that tend to be unstable after training. This master thesis consists of two parts: 1) a novel method, named Temporal Coherence, for defining the temporal stability of semantic segmentation methods on sequential unlabeled data, and 2) a novel pruning method, named Stability based Federated Pruning, which reduces the complexity of the networks with the aim of enhancing temporal stability. In the course of our experiments, two semantic segmentation networks, the Fully Convolutional Network FCN8-VGG16 and the Full Resolution Residual Network (FRRN), are trained on two data sets, Cityscapes [9] and a Volkswagen Group internal data set. Afterwards, they are pruned with two state-of-the-art pruning methods along with our proposed method, and evaluated on Intersection over Union as a supervised metric and on our Temporal Coherence as an unsupervised metric. The experiments show that the overall performance (mIoU) and the Temporal Coherence of the networks improved after pruning up to more than 40 percent of the network parameters. Furthermore, our pruning metric produced competitive results compared to the other state-of-the-art pruning methods in all experiments and outperformed them in some cases.
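
As a point of reference for the supervised metric named above, mean Intersection over Union can be sketched as follows. This is a minimal illustration on flat label lists, not the evaluation code of the thesis; the function name `mean_iou` and the convention of skipping classes that occur in neither prediction nor ground truth are assumptions.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over all classes that occur in prediction or ground truth."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps (intersection) or in either map (union).
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumed convention: ignore classes absent from both maps
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six "pixels", three classes (hypothetical data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In practice the label maps would be flattened image arrays, and the per-class counts are usually accumulated over the whole data set before averaging.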

Cooperative Machine Learning for Autonomous Vehicles

Sebastian Neubert

Thu, 23. 1. 2020, Room 368

Automated driving has been around for more than half a century, and the approaches vary noticeably across the car industry. While some manufacturers and research institutions rely on a combination of multiple sensors such as lidar, radar, sonar, GPS, and cameras, Elon Musk, the CEO of Tesla, is convinced that fully autonomous driving can be achieved primarily by solving vision, inspired by the fact that human beings rely on vision in the first place to make driving decisions. Current state-of-the-art approaches for object detection are entirely based on machine learning techniques. These involve training very complex models on huge amounts of data centralized in large-scale datacenters. Since data in modern applications such as autonomous driving and edge computing is usually generated in a decentralized way, it is natural to consider training the machine learning models in a decentralized manner as well. In this thesis, we examine a distributed learning approach called Federated Learning (FL) by applying it to several scenarios with MNIST as the dataset. In these settings, multiple clients each specialize on a specific digit, and their models are then aggregated into an average model. We provide an in-depth analysis of how this algorithm performs in these scenarios. Additionally, we propose several ways of improving the accuracy to up to 97 % on the test set and argue that the principles of FL are not limited to neural-network-based learning algorithms but can, for instance, also be applied to SVMs.
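
The aggregation step described above, in which client models are combined into an average model, can be sketched roughly as follows. The abstract only states that the models are averaged; weighting each client by its number of training samples is the usual federated averaging convention and is an assumption here, as are the function and variable names.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local data set size."""
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("clients must hold at least one sample in total")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(weights):
            averaged[i] += value * size / total
    return averaged


# Three hypothetical clients with flattened two-parameter models.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 10, 20]  # assumed number of local training samples per client
global_model = federated_average(clients, sizes)
```

In a full training loop the server would send `global_model` back to the clients, each client would train locally for a few epochs, and the aggregation would be repeated.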

A neuro-computational approach for attention-guided vergence control using the iCub

Torsten Follak

Mon, 13. 1. 2020, Room 204

In this thesis, a combined model for attention-guided vergence control is presented. It builds on two prior models. The first is a model for vergence control by Gibaldi et al. (2017), which implements a biologically inspired approach to robotic vergence control. The second is a model for object localization by Beuth (2017), which is inspired by the attention mechanisms in the human brain. Connecting these two models should lead to a new model with an attention guidance mechanism for vergence control. This thesis first presents the underlying models. Next, the adaptations necessary for the model fusion are shown. Finally, the performance of the new model is tested in different settings.

Real time head pose estimation received from 3D sensor using neural networks

Dhanakumaresh Subramani

Thu, 9. 1. 2020, Room 368

Non-verbal human-machine communication can be inferred from tracking the human head pose. Head pose estimation is therefore crucial in person-specific services such as automotive safety. Bayesian filters such as the Kalman filter are among the efficient visual object tracking algorithms. The Kalman filter owes its popularity to its inherent feedback loop, which can predict forthcoming measurements. Nevertheless, it cannot be used widely because of its complex design, and it has to be tailored very specifically to the task. Recent studies on recurrent neural networks (RNNs) suggest that they could be an ideal replacement for Bayesian filters, as the temporal information in an RNN has a significant influence in visual object tracking. The feedback loop in an RNN allows it to store temporal state information over entire event sequences. Additionally, an RNN can perform the functions of a convolutional neural network (CNN) and a Kalman filter. Moreover, notable improvements in CNN architectures are studied, such as the finding that learning multiple related tasks (human head pose estimation, facial landmark localization, and visibility estimation) improves the accuracy of the main task. Hence, in this thesis, a recurrent multi-task network is designed that estimates the head orientation along with facial landmark localization.
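
The predict-and-correct feedback loop of the Kalman filter mentioned above can be illustrated with a scalar filter. This is a minimal sketch under a random-walk state model, applied to a single hypothetical head angle; it is not the tracker evaluated in the thesis, and the noise parameters `q` and `r` are made-up values.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x, p: previous state estimate and its variance
    z:    new measurement
    q, r: process and measurement noise variances (assumed known)
    """
    # Predict: under a random-walk model the state stays, uncertainty grows.
    x_pred = x
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Hypothetical noisy yaw measurements (degrees) around a true value of 10.
x, p = 0.0, 1.0
for z in [10.2, 9.8, 10.1, 9.9, 10.0]:
    x, p = kalman_step(x, p, z, q=1e-4, r=0.25)
```

The design effort the abstract refers to lies in choosing the state model and the noise terms for each task, which is what a learned recurrent model is meant to avoid.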

Modeling goal-directed navigation based on the integration of place cells in the hippocampus and action selection in the striatum

Daniel Lohmeier von Laer

Fri, 13. 12. 2019, Room 132

In my master thesis, two computational models of brain areas involved in navigation, namely a hippocampal place-cell model and a model of the basal ganglia, were investigated, and a link between them was established. The hippocampal formation supposedly serves as a flexible cognitive map required for orientation and navigation. The basal ganglia, in contrast, presumably map stimuli to responses and are thought to be used for action selection and suppression. Hippocampal place cells not only code for an animal's current position but also shift forward at choice points along possible future paths. This information alone is not enough for action selection, as it requires an interface to motor areas, which is assumed to be represented in the nuclei of the basal ganglia. For the modeled link, a mapping of the navigation-relevant information from the place-cell model to the basal ganglia model had to be found. Furthermore, the basal ganglia model had to learn to read sequences and classify them into discrete directional categories. Two experimental setups were used to test the resulting algorithm: a T-maze and a plus-maze. Both received two possible sequences that were to be converted into a left or right decision.

Visual semantic planning using deep successor representations

Anh-Duc Dang

Mon, 9. 12. 2019, Room 204

In this seminar, I will present the paper 'Visual Semantic Planning Using Deep Successor Representations' by Zhu et al. (2017). Visual semantic planning (VSP) is the task of predicting a sequence of actions in a visual dynamic environment that moves an agent from a random initial state to a given goal state. Given a task like 'putting a bowl in the sink', an agent needs to learn, through interaction with a visual environment, about the environment's object affordances and the possible set of actions (as well as all their preconditions and post-effects). First, I will give a brief introduction to important reinforcement learning concepts and then explain in detail the paper's model and Zhu et al.'s approach to solving the VSP problem in the challenging THOR visual environment. The main idea of the model is successor representations, which can be considered a trade-off between model-based and model-free reinforcement learning. The idea of successor representations is not recent (Dayan, 1993) but has attracted a lot of attention in the neuroscience, machine learning, and deep reinforcement learning communities in recent years.
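
To make the idea of successor representations concrete, the tabular form going back to Dayan (1993) can be sketched as follows. This is not the deep model of Zhu et al.; it assumes a small, fully known transition matrix `P` under a fixed policy, and the three-state chain is a made-up example. The successor matrix satisfies M = I + γPM, and state values follow as V = Mr.

```python
def successor_representation(P, gamma, iterations=200):
    """Iterate M <- I + gamma * P * M to obtain expected discounted state visits."""
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(iterations):
        M = [[identity[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
              for j in range(n)]
             for i in range(n)]
    return M


def state_values(M, rewards):
    """Values follow from the SR by a dot product with the reward vector."""
    return [sum(m * r for m, r in zip(row, rewards)) for row in M]


# Hypothetical chain 0 -> 1 -> 2, where state 2 loops onto itself.
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0]]
M = successor_representation(P, gamma=0.5)
V = state_values(M, rewards=[0.0, 0.0, 1.0])
```

Because the rewards enter only through the final dot product, a changed reward vector can be evaluated without relearning `M`, which is the sense in which the representation sits between model-free and model-based methods.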

Wireless Human Body Motion Tracking Using IMU

Keyur-Meghjibhai Hapaliya

Thu, 5. 12. 2019, Room 368

Wearable human motion capture systems are developed for continuous recording and tracking of the human body. Wearable sensor systems allow continuous monitoring of people even in an unknown environment. The inertial measurement unit (IMU) is well established for human body motion tracking because of its compact size and low cost. Hence, this research project presents a wireless network of IMUs for human body motion tracking. Several BNO55-Adafruit sensors are attached to the human body. Each wireless module, made by Espressif Systems, transmits its sensor's data to another module using a wireless communication protocol such as the User Datagram Protocol (UDP) or the Transmission Control Protocol (TCP). The serial data feeds into an animation that visualises the motion of the human body in real time.

Erkennung von nonverbalen Geräuschen mit Machine Learning Ansätzen in Audio-Signalen

Max Rose

Mon, 2. 12. 2019, Room 204

In this bachelor thesis, I investigate the detection of nonverbal sounds in arbitrary audio signals using deep learning, a young field of machine learning. The major challenges lie in collecting sufficient data, preparing it properly, and designing and training the networks. Specifically, I use convolutional neural networks, fully convolutional neural networks, and convolutional recurrent neural networks in an ensemble. To generate sufficient training data, I develop a data augmentation method. To remove noise from the output, I develop an algorithm that exploits the temporal dependency of the audio signal. The result of my work can be used for binary classification of an arbitrary audio signal into 'human voice' and 'nonverbal sound', achieves an accuracy of over 95% on the test dataset, and is trained on German and English speech.

Processing of automotive signals for fault diagnosis using an automated probabilistic Machine Learning Framework

Nishant Gaurav

Thu, 28. 11. 2019, Room 368

The study of automotive signals for fault diagnosis can prove substantial for protecting human life. These automotive signals contain data that can be mined for information retrieval. The mined information can provide specifications that are useful for fault detection and diagnosis. This thesis focuses on finding the parameters and approaches that can be applied to a data set to extract signal traces that lead to a specific event. Signal traces are extracted with the help of different processes that are part of the framework of (Mrowca, 2020). The best model is sought for every process, which in turn requires the best parameters for building those models. The work starts by searching for the best parameters for clustering the signals; the clusters are then used to learn a network from which paths to the causes of the event can be determined. Parameters for learning the network are estimated, and the best structure is used to define the network. Paths in these networks provide specifications, and the likelihood of each path is calculated. The path with the maximum likelihood is chosen as the specification. Three data sets are provided, each containing a physical experience. Specifications for these experiences are mined.

Spatial cognition in a virtual environment: Robustness of a large scale neuro-computational model

Micha Burkhardt

Mon, 25. 11. 2019, Room 204

Spatial cognition is a key aspect of human intelligence. It allows a person to process behaviorally relevant features in a complex environment and to update this information during processes of eye and body movements. To unfold the underlying mechanisms of spatial cognition is not only paramount in the fields of biology and medicine, but also constitutes an important framework for the development of cognitive agents, like robots working in real-world scenarios. In my thesis, a large scale neuro-computational model of spatial cognition is used on a virtual cognitive agent called Felice. The model simulates object recognition and feature-based attention in the ventral stream, supported by spatial attention in the dorsal pathway. Furthermore the visual models are complemented by a model of spatial memory and imagery of medial temporal and retrosplenial areas. Multiple novel features, which were independently developed in other works, were integrated to further improve the performance of the model. This enables the agent to reliably navigate in a familiar environment, perform object localizations and recall positions of objects in space. After an overview of the structure and functionality of the model, an insight into the ongoing evaluation of the features and robustness will be given.

Increasing the robustness of deep neural nets against the physical world adversarial perturbations

Indira Tekkali

Thu, 14. 11. 2019, Room 368

Adversarial examples denote changes to data such as images which are imperceptible (or at least innocuous) to a human but can fool a machine learning model into misclassifying the inputs. Recent work has shown that such adversarial examples can be made sufficiently robust that they remain adversarial (fool a classifier) even when placed as artifacts in the physical world. This may pose a security threat to autonomous systems (e.g. recognition or detection systems) such as autonomous cars. Many approaches for increasing the robustness of models against adversarial examples have been suggested, but with limited success so far. However, the space of physical-world adversarial examples is much more constrained (i.e. smaller) than the general set of adversarial examples. Thus, increasing the robustness against these physical-world adversarial examples might be more viable. The goal of this thesis is to assess which of the existing methods are particularly suited for defending against physical-world adversarial examples. We use the projected gradient descent (PGD) attack together with Expectation Over Transformation (EOT) to generate the physical adversaries. We also tested other attacks, namely FGSM and BIM, but discarded them because they are weaker than PGD: BIM is somewhat better than FGSM, but it is not strong enough to generate strong perturbations and requires more time to generate adversaries. To defend the model against physical adversaries, we selected adversarial training. We ran our simulations on two datasets, CIFAR-10 and ImageNet (200 classes with an image size of 224x224).
Due to the current lack of a standardized testing method, we propose an evaluation methodology. We evaluate the efficiency of physical adversaries by attacking the model without EOT, which yields an adversarial accuracy of 57.47% on CIFAR-10 and 64.67% on ImageNet; when we attack the model with PGD combined with EOT, we obtain 72.64% on CIFAR-10 and 67.84% on ImageNet.
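
The PGD attack named above can be sketched on a toy model. The example below uses a hand-written logistic regression instead of a deep network so that the input gradient is available in closed form; the weights, the input, and the step sizes are made-up values, and the EOT step (averaging the gradient over random transformations of the input) is deliberately left out.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(x, y, w, b):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    tiny = 1e-12  # guards against log(0)
    loss = -(y * math.log(p + tiny) + (1 - y) * math.log(1 - p + tiny))
    grad = [(p - y) * wi for wi in w]
    return loss, grad


def sign(v):
    return (v > 0) - (v < 0)


def pgd_attack(x0, y, w, b, eps, alpha, steps):
    """Iterated signed-gradient ascent, projected onto the L-infinity ball around x0."""
    x = list(x0)
    for _ in range(steps):
        _, grad = loss_and_grad(x, y, w, b)
        x = [xi + alpha * sign(g) for xi, g in zip(x, grad)]
        # Project back into [x0 - eps, x0 + eps] and the valid input range [0, 1].
        x = [min(max(xi, x0i - eps, 0.0), x0i + eps, 1.0)
             for xi, x0i in zip(x, x0)]
    return x


# Hypothetical model and input.
w, b = [2.0, -3.0, 1.0], 0.1
x_clean, label = [0.5, 0.2, 0.8], 1
x_adv = pgd_attack(x_clean, label, w, b, eps=0.1, alpha=0.03, steps=10)
clean_loss, _ = loss_and_grad(x_clean, label, w, b)
adv_loss, _ = loss_and_grad(x_adv, label, w, b)
```

FGSM corresponds to a single step with `alpha = eps`; BIM is the iterated variant without a random start, which is also what this sketch does.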

4D object tracking using 3D and 2D camera systems

Shubham Turai

Mon, 4. 11. 2019, Room 131

4D object tracking is the tracking of objects in the real world by considering the 3D position and movement of the object over time. When using 2D camera systems, such as traditional monocular cameras, it is assumed that all objects move on the ground plane or on a constant horizontal plane. The objects to be tracked are detected using You Only Look Once (YOLO), an object detection algorithm, and these detections are used to track the objects. The performance of the tracking depends mostly on the accuracy of the detection algorithm. Different 2D trackers are shown: the Centroid Tracker, an extension of the Centroid Tracker using the well-known single-object Kernelized Correlation Filter (KCF) tracker, an extension of the Centroid Tracker using the Consensus-based Matching and Tracking (CMT) tracker, Simple Online and Realtime Tracking (SORT), and DeepSORT (a feature-based extension of SORT). SORT is a spatial tracking algorithm that uses Kalman filtering to make predictions and later corrects them with the detections. 3D versions of the Centroid Tracker, the KCF Centroid Tracker, SORT, and DeepSORT are implemented to study how the added spatial dimension improves the reliability of the trackers. The extended 2D algorithms and all the 3D algorithms were devised by the author. An algorithm to transform the 2D detections into 3D detections is explained, and the accuracy of the resulting 3D detections is checked. In this thesis, I study the benefit of 3D detections (in world coordinates) over 2D detections (in the image plane) for tracking objects. I also study the benefit of combined feature and spatial tracking over spatial tracking alone. The 2D YOLO detections are converted to 3D using visual geometry, assuming that the objects move on the ground plane. The 2D bounding boxes are projected to obtain 3D bounding boxes for tracking the objects in 3D space.
The camera position and velocity are known at all times from an Inertial Measurement Unit (IMU) and a gyroscope, so the system provides complete pose information (position, altitude, and time). The existing intrinsic and extrinsic camera transformation matrices allow transforming from image coordinates into 3D coordinates. When using 3D (stereo) camera systems, the position can be calculated as above, but the assumption that the objects are on the ground (a co-planar surface) is no longer required because the stereo camera provides the missing depth information.
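
The ground-plane assumption for monocular cameras can be illustrated with a simple back-projection. The sketch below assumes an ideal pinhole camera whose optical axis is parallel to the ground, with camera coordinates x to the right, y downwards, and z forwards; the intrinsic parameters and the camera height are made-up values, and a real system would also apply the extrinsic rotation obtained from the IMU.

```python
def pixel_to_ground(u, v, fx, fy, cx, cy, cam_height):
    """Intersect the viewing ray of pixel (u, v) with the ground plane.

    Returns the 3D point in camera coordinates (x right, y down, z forward).
    The ground is the plane y = cam_height below the camera centre.
    """
    # Direction of the viewing ray for unit depth (pinhole model).
    dx = (u - cx) / fx
    dy = (v - cy) / fy
    if dy <= 0:
        raise ValueError("pixel lies on or above the horizon; no ground intersection")
    depth = cam_height / dy  # scale at which the ray reaches the ground
    return (depth * dx, cam_height, depth)


# Bottom centre of a hypothetical bounding box in a 1280x720 image.
point = pixel_to_ground(u=740, v=510, fx=1000.0, fy=1000.0,
                        cx=640.0, cy=360.0, cam_height=1.5)
```

Using the bottom edge of a bounding box as the pixel is a common choice because that is where the object touches the ground; with a stereo camera the depth is measured directly and this intersection is not needed.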

Simulating Tourette Syndrome. A Neurocomputational Approach to Pathophysiology.

Carolin Scholl

Wed, 30. 10. 2019, Room 132

The talk starts with an overview of the symptomatology of Tourette syndrome and current hypotheses regarding the pathophysiology of tics, focusing on potentially disturbed signaling in the cortico-basal ganglia-thalamo-cortical loops. Based on the analogy between tics and habits from both a reinforcement-learning and a neurophysiological perspective, a recent behavioral finding is highlighted: unmedicated adult Tourette patients responded towards devalued outcomes at a higher rate than healthy control subjects, which was interpreted as predominantly habitual behavior in the outcome devaluation paradigm (Delorme et al., 2015). This behavioral effect can be replicated using a novel neurocomputational model with multiple, hierarchically organized cortico-basal ganglia-thalamo-cortical loops. First, I will explain the task modelling and introduce the "healthy" model that successfully reproduces the behavior of healthy control subjects in the study. Next, I will present the findings from several sets of experiments, which entailed the systematic variation of model parameters. Some of these "pathological" models indeed show increased rates of response towards devalued outcomes, as observed experimentally. Specifically, the behavioral effect can be reproduced by decreasing striatal disinhibition or enhancing dopaminergic modulation in the model. Regarding dopaminergic modulation, both increasing the difference between tonic and phasic dopamine levels and manipulating the gain of striatal neurons are effective. Further analyses of the computational model reveal that striatal disinhibition and enhanced dopaminergic modulation in the basal ganglia may both create an imbalance between the indirect pathway and the direct pathway, in favor of the latter.

What determines the parallel performance of spiking neural networks?

Helge Ülo Dinkelbach

Mon, 28. 10. 2019, Room 204

The size and complexity of the neural networks investigated in computational neuroscience are increasing, leading to a need for efficient neural simulation tools to support their development. Many neural simulators are available for the development of spiking models, but their performance can differ considerably, as several performance studies have shown (e.g. Vitay et al. 2015, Dinkelbach et al. 2019, Stimberg et al. 2019). In this talk, I want to go into more detail on the (parallel) implementation of core functions required for the simulation of spiking neural networks. For these functions, I will demonstrate the influence of different aspects such as hardware platform, data structures, and vectorization.

Object classification based on high resolution LiDAR

Megha Rana

Mon, 28. 10. 2019, Room 132

Technological revolution and vigorous data growth can be considered the backbone of future autonomous driving. To understand the surrounding scene more accurately, the vehicle has to rely on the indispensable information provided by sensors such as cameras, LiDAR, and radar. Among these, LiDAR sensors are widely adopted and well suited to locating objects with high accuracy and precision; a LiDAR produces a 3-dimensional point cloud of the environment, from which the scene can be detected and classified into different kinds of objects such as cars, cyclists, and pedestrians. Many research works have employed a Velodyne LiDAR with 64 channels for the classification task. In this thesis, a high-resolution LiDAR with a 200-meter detection range and 300 vertical channels is used to deal with the challenge of point cloud sparsity. Velodyne HDL-64E LiDAR data from the Kitti raw data recordings are taken as a reference, to provide a basis for comparison with the high-resolution LiDAR. The classification is implemented on point cloud data with a proposed feature-extraction-based classification method instead of a deep learning approach, as the latter requires a huge amount of data to train a model. For that, traditional supervised machine learning algorithms such as Random Forest, Support Vector Machine, and K-Nearest Neighbors are chosen. Following an appropriate training workflow with data exploration and pre-processing, these algorithms are trained on different data distributions and relevant features. Furthermore, offline and online (city and highway scene) classification results for each algorithm under different categories are evaluated on the basis of the mean accuracy metric.
Regarding the results, Random Forest and Support Vector Machine achieved the best overall mean accuracy (over all 3 feature sets and the 4 classes car, cyclist, motorbike, and pedestrian) of around 87 to 91% for the offline as well as the online highway scene, and 71 to 80% for the online city scene.

Scalable Deep Reinforcement Learning Algorithm for Multi-Agent Traffic Scenarios

Niyathipriya Pasupuleti

Mon, 21. 10. 2019, Room 131

Autonomous cars will soon be a reality on the streets of the modern world. This brings advantages as well as disadvantages for road traffic. This thesis analyzes, by simulation, the behaviour of many such agents in mixed autonomy in particular traffic scenarios. The work includes the development of a simulation environment for these scenarios. Autonomous cars, referred to as agents, are trained using a deep reinforcement learning algorithm and simulated in various traffic scenarios. The behaviour of multiple agents in mixed-autonomy traffic is analyzed. The aim is to stabilize the traffic flow by including trained agents; the traffic flow is also influenced by the parameters and the environment used. Each reinforcement learning agent is trained with a policy and given different rewards. A scalable algorithm for training the agent is determined through the analysis of the rewards and the performance of the RL agent.

Active Learning for Semantic Segmentation of Automotive Images

Geethu Jacob

Thu, 17. 10. 2019, Room 131

Convolutional Neural Networks perform computer vision tasks with high accuracy. However, these networks need an immense amount of data to do so. Collecting such a large amount of data is a difficult task, and labelling the collected data is even more tedious and time-consuming. Active learning is a field of machine learning that tries to select a subset of relevant samples from a pool of unlabelled data such that a model trained on the subset performs as well as one trained on the entire dataset. Active learning thus helps to reduce the amount of labelling, as only the selected samples need to be annotated. Semantic segmentation is a scene understanding task that assigns a class to every pixel in an image, which requires per-pixel labelling for all images. Active learning heuristics can be used to limit the annotation needed for segmentation. Two active learning strategies are evaluated for semantic segmentation: a core-set approach, which treats active learning as a cover set problem, and uncertainty sampling, which selects the samples the model is not confident about. Monte Carlo dropout is used to select the uncertain samples. The results of the two active learning methods are compared to a baseline in which samples are selected at random from the dataset. Further analysis of the samples selected by each strategy and possible future improvements are also presented.
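
Uncertainty sampling with Monte Carlo dropout can be sketched as follows. The sketch assumes that the stochastic forward passes have already been run and that each sample is summarised by one class distribution per pass; for segmentation, the per-pixel scores would additionally have to be aggregated per image. The entropy criterion, the function names, and the data are illustrative assumptions rather than the exact procedure of the thesis.

```python
import math


def predictive_entropy(mc_probs):
    """Entropy of the class distribution averaged over T stochastic passes."""
    passes = len(mc_probs)
    num_classes = len(mc_probs[0])
    mean = [sum(p[c] for p in mc_probs) / passes for c in range(num_classes)]
    return -sum(m * math.log(m) for m in mean if m > 0)


def select_uncertain(pool, k):
    """Return the k sample ids with the highest predictive entropy."""
    scores = {sample_id: predictive_entropy(probs)
              for sample_id, probs in pool.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical softmax outputs of three dropout passes for three unlabelled images.
pool = {
    "img_a": [[0.90, 0.10], [0.95, 0.05], [0.92, 0.08]],
    "img_b": [[0.60, 0.40], [0.40, 0.60], [0.50, 0.50]],
    "img_c": [[0.80, 0.20], [0.30, 0.70], [0.70, 0.30]],
}
to_label = select_uncertain(pool, k=2)
```

The selected ids would then be sent for annotation, added to the training set, and the model retrained before the next selection round.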

Models for Object Recognition and Localisation with visual attention and Spike-timing-dependent plasticity

Tingyou Hao

Thu, 17. 10. 2019, Room 368

Object recognition and localization are two important tasks in computer vision. Attention is used as a spatial pre-selection stage in computer vision. The model proposed by Beuth and Hamker treats attention as a cognitive, holistic control process. The primary visual cortex (V1) extracts the features of the input image, and the higher visual areas (HVA) organize the target object. The activated cells in the prefrontal cortex (PFC) encode the target object's type. A recurrent loop from the higher visual areas to the frontal eye field (FEF) and back to the higher visual areas enhances the position with soft competition. The target object is localized if the position reaches the threshold. This demonstrates that neuro-computational attention models can be applied to realistic computer vision tasks. Based on recent discoveries, a neuro-computational approach, spatial updating of attention across eye movements, proposed by J. Bergelt and F. H. Hamker, illustrates the lingering effect resulting from the late updating of the proprioceptive eye position signal and the remapping resulting from the early corollary discharge signal. The results provide a comprehensive framework for discussing multiple experimental observations occurring around saccades. The third model, an STDP-based (spike-timing-dependent plasticity) spiking deep neural network (SDNN), consists of a temporal-coding layer followed by a cascade of consecutive convolutional and pooling layers. In the first layer, the visual information of the input image is encoded in the temporal order of the spikes. In the convolutional layers, neurons integrate input spikes and emit a spike after reaching their thresholds. The visual features are learned in these convolutional layers with STDP. Pooling layers provide translation invariance and compact the visual information, passing it on to the next layer. Finally, a support vector machine (SVM) classifier determines the category of the input image.
The models and the approach analyze attention in the human brain with unsupervised and supervised learning.
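
The STDP learning mentioned above can be illustrated with the classic pair-based rule, in which a synapse is strengthened when the presynaptic spike precedes the postsynaptic one and weakened otherwise. This is a generic textbook form, not necessarily the exact (possibly simplified) rule used in the SDNN; all amplitudes and time constants are assumed values.

```python
import math


def stdp_delta(t_pre, t_post, a_plus=0.01, a_minus=0.012,
               tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt >= 0:
        # Pre before (or together with) post: potentiation, decaying with the gap.
        return a_plus * math.exp(-dt / tau_plus)
    # Post before pre: depression, decaying with the gap.
    return -a_minus * math.exp(dt / tau_minus)


def apply_stdp(w, t_pre, t_post, w_min=0.0, w_max=1.0):
    """Update a weight and keep it inside [w_min, w_max]."""
    return min(max(w + stdp_delta(t_pre, t_post), w_min), w_max)


# Hypothetical spike pairs.
w_after_causal = apply_stdp(0.5, t_pre=10.0, t_post=15.0)   # strengthened
w_after_acausal = apply_stdp(0.5, t_pre=15.0, t_post=10.0)  # weakened
```

In a convolutional spiking layer the same update would be applied to the shared weights of the neuron that fires first, so that each feature map specialises on a recurring input pattern.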

Vorhersage des Kundenverhaltens mithilfe von maschinellem Lernen

Christian Dehn

Thu, 19. 9. 2019, Room 132

Customer inquiries at an energy company offering electricity and gas tariffs should be processed in as short a time as possible. This reduces costs and at the same time increases customer satisfaction. However, chatbots are not always an alternative for this task. The aim was therefore to find out whether machine learning can be used to detect whether, when, how, and why a customer contacts the company. To this end, the data were examined with Long Short-Term Memory (LSTM), Multilayer Perceptron (MLP), and t-distributed Stochastic Neighbor Embedding (t-SNE). The first two techniques produced predictions of customer contact. However, these were not usable, since validation did not yield a correct result in any case. The data were then analyzed with t-SNE. The resulting embeddings were unordered, as no structures were found in the data sets. All three methods indicated that the available customer data are seemingly randomly arranged or that the information necessary to answer the underlying questions was not available in the data sets. It was therefore not possible to correctly predict customer contact with the available customer and contact data.

Augmented Reality based Deep Learning for Pedestrian Detection

Srinivas Reddy Mudem

Wed, 18. 9. 2019, Room 132

Pedestrian detection is a challenging task in machine learning and the subject of a significant amount of ongoing research. Over the years, many researchers have proposed a wide variety of solutions to enhance the performance of pedestrian detection. However, the majority of these approaches rely on large datasets and demand manually labeled data to train a pedestrian classifier. Collecting and labeling images from the real world is time-consuming and expensive. To overcome these issues, this thesis presents an efficient approach that generates augmented data with automatic labeling. Instead of considering the whole environment for augmentation, it proposes an alternative paradigm that augments only the object of interest. These augmented objects are merged with real images. Creating only the objects of interest instead of the whole environment makes the generation process easy and cost-effective and keeps the data more realistic. This work presents an efficient way of augmenting images with synthetic objects, along with a complete pipeline for automatically generating labeled training data. In addition, training on the augmented dataset is demonstrated using state-of-the-art deep learning networks, and the trained models are tested on different types of real datasets to uncover their flaws. Our experiments illustrate that a model trained on synthetic data generalizes better than one trained on real data with limited annotation. Through a wide range of experiments, we conclude that the right set of parameters for generating the augmented dataset enhances the performance of the network.

Runtime and precision analysis of artificial neural networks in a predictive maintenance environment

Justin Colditz

Wed, 11. 9. 2019, Room 336

The aim of this work is to understand how the various parameters of artificial neural networks (ANNs) and their underlying data influence the duration of learning and the final result. Both the individual degrees of freedom and their interrelations are considered. The underlying data concern machine failures at a tire manufacturer. Several methods are tried out to examine the influence of the parameters. Most of the work, however, consists of testing different values and comparing their results. Based on this comparison, the values are adjusted and tested again to see whether the adjustment has improved the network. The results show that much about ANNs can be improved through testing and trial and error. There is rarely a single method that directly yields an exact value. At best, some parameters can be narrowed down to certain ranges. Within these ranges, further tests can then determine which value is most suitable. Artificial neural networks are far too specific to allow general statements. Every network is different and requires different settings. Furthermore, the data used for training have a far greater influence than expected. It is therefore advisable to study the data and the desired network structure thoroughly before starting training, in order to achieve the best possible results for each problem.

Revealing the impairments of thalamic lesions using Bayesian adaptive direct search for a neurocomputational model of saccadic suppression of displacement

Adrian Kossmann

Mon, 26. 8. 2019, Room 132

In recent years, a series of studies has been published emphasizing the role of the thalamus in visual stability across saccades (Cavanaugh et al., 2016; Ostendorf et al., 2013; Tanaka, 2005). It is believed that visual stability across saccades is established by a corollary discharge signal, which is present in the thalamus, and a proprioceptive eye position signal (Wurtz, 2018). Using a neurocomputational model for perisaccadic vision (Ziesche & Hamker, 2011), the impact of thalamic lesions on corollary discharge and proprioception was investigated. Model parameters were tuned to psychometric functions of subjects with focal thalamic lesions who participated in saccadic suppression of displacement experiments (Ostendorf et al., 2013) using Bayesian adaptive direct search (Acerbi & Ma, 2017). By applying a cluster analysis to the fitted model parameters, a single-case analysis was conducted and parameters impaired by thalamic lesions were revealed. After a brief background overview and an introduction to the methods used, the results will be presented and discussed.

Precise feature gain tuning in visual search tasks - the basal ganglia makes it possible: A neuro-computational study

Oliver Maith

Mon, 26. 8. 2019, Room 132

In visual search tasks a target item should be found among several distractors. A common theory is that when looking for a known target, the visual attention is directed to the features of the target, thus enhancing the response to the target in the visual system. In a study by Navalpakkam and Itti (2007) it was shown that subjects in a visual search task, in which the target was very similar to the distractors, did not learn to direct the attention directly to the features of the target, as this would also enhance the response to the distractors. In this work, the visual search task of Navalpakkam and Itti (2007) was performed with a biologically plausible neural model. For this a model of the visual system (Beuth, 2017) was combined with a model of the basal ganglia (Villagrasa, Baladron, Hamker, 2016). In this combined model, the basal ganglia receive information about the visual input through the inferior temporal cortex (IT) and control the activity of the prefrontal cortex (PFC). Activity in the PFC causes feature-based attention in the visual model. The combined model can select certain items of the visual input through feature-based and spatial attention and learns to direct the feature-based attention to certain features due to reward. The findings of Navalpakkam and Itti (2007) could be replicated with the model. The model thus demonstrates a very interesting, so far poorly investigated function of the basal ganglia, which could also explain other findings about the effects of reward on attention.

Introduction to selected derivative-free optimization algorithms

Sebastian Neubert

Mon, 5. 8. 2019, Room 131

Derivatives, and gradient descent as the optimization method built on them, are ubiquitous across machine learning, especially in neural networks, where backpropagation is used to optimize the network. In many cases, however, computing such gradients is computationally very expensive or not even possible. As an alternative to the state-of-the-art optimization algorithms based on derivatives, I would like to give a brief overview of optimization algorithms that do not depend on any form of derivative and are therefore called derivative-free optimization algorithms. The talk focuses on evolutionary strategies, especially NEAT (NeuroEvolution of Augmenting Topologies), with an outlook on Particle Swarm Optimization as a second example of a derivative-free optimization algorithm.
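
To make the idea of derivative-free optimization concrete, here is a minimal sketch of a (1+1) evolution strategy. It is a much simpler stand-in than NEAT or Particle Swarm Optimization and is not taken from the talk; the function names, the step size `sigma`, and the `sphere` test objective are all illustrative assumptions. The point is only that the optimizer queries objective values and never a gradient.

```python
import random


def one_plus_one_es(objective, start, sigma=0.5, steps=500, seed=0):
    """Minimize `objective` with a (1+1) evolution strategy (illustrative sketch).

    One parent produces one mutated child per step; the child replaces the
    parent only if it is at least as good. No derivative is ever computed.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    best = list(start)
    best_score = objective(best)
    for _ in range(steps):
        # Mutation: add Gaussian noise to every coordinate of the parent.
        candidate = [x + rng.gauss(0.0, sigma) for x in best]
        score = objective(candidate)
        # Elitist selection: keep the child only if it does not make things worse.
        if score <= best_score:
            best, best_score = candidate, score
    return best, best_score


def sphere(x):
    """Hypothetical test objective: sum of squares, minimal at the origin."""
    return sum(v * v for v in x)


start_point = [3.0, -2.0]
best_point, best_value = one_plus_one_es(sphere, start_point)
```

Because selection is elitist, the returned objective value can never be worse than that of the starting point. Practical evolution strategies usually also adapt `sigma` during the run, which this sketch omits.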

Role of the Recurrent Architecture of the Cerebellum: a Computational Study of Forward Models using Recurrent Dynamics and Perturbation Learning

Katharina Schmid

Wed, 31. 7. 2019, Room 132

The cerebellum is generally believed to be involved in motor learning and motor control. Many tasks in this domain require precise timing and coordinated activity of different body parts. Recent computational models have turned to reservoir computing in order to describe how the cerebellum encodes spatiotemporal information and adapts to the requirements of different tasks. The present study extends this work by using perturbation learning as a more biologically plausible learning mechanism. A network consisting of a recurrent reservoir and feedforward output layers is trained to model the forward dynamics of a 2-joint planar arm and to generate adaptively timed responses to different inputs. These simplified versions of tasks that may be attributed to the cerebellum illustrate the network's ability to learn using biologically plausible learning mechanisms.

Acceptance of Occupational E-Mental Health - A cross-sectional study within the Third German Sociomedical Panel of Employees, GSPE-III

Julian Thukral

Tue, 9. 7. 2019, Room 131

Mental disorders have become one of the leading causes of workplace absence and long-term work incapacity. Occupational e-mental health (OEMH) tools such as internet- and mobile-based interventions (IMI) provide a promising addition to regular health care. Yet, acceptance of e-mental health is low. The goal of the study was to identify drivers and barriers of acceptance of OEMH. Cross-sectional data of N = 1829 participants were collected from the first and second wave of the Third German Sociomedical Panel of Employees, GSPE-III, including a self-designed survey. Participants were employees at high risk of early retirement for health reasons. The Unified Theory of Acceptance and Use of Technology (UTAUT) was modified and extended to investigate drivers and barriers of acceptance. 86.3% of the participants (n = 1579) indicated low to moderate acceptance of IMIs, while n = 193 (10.6%) indicated high acceptance (M = 2.07, SD = 1.04, range = 1-5). Path analysis confirmed that the UTAUT predictors performance expectancy (beta = .47, p < .001), effort expectancy (beta = .20, p < .001), and social influence (beta = .26, p < .001) significantly predicted acceptance. The strongest extended predictors of acceptance were health-related internet use (total: beta = .29, p < .001), previous experience with e-mental health (total: beta = .24, p < .001), and gender (total: beta = -.11, p < .001). Acceptance of OEMH in the sample can be considered low. UTAUT predictors were found to significantly predict acceptance in the OEMH setting, and new extended predictors were identified. To improve the implementation of OEMH within regular health care, performance expectancy, effort expectancy, health-related internet use, and experience must be facilitated.

Hierarchical representations of actions in multiple basal ganglia loops part II: Computational benefits and comparison to other models

Javier Baladron

Mon, 8. 7. 2019, Room 131

In this talk I will continue with the analysis of the hierarchical model I presented in my last seminar. I will initially remind you of the principal characteristics of the model and then move to new results which show the computational benefits of the hierarchical organization we are proposing. In our new experiments the model was able to transfer information across tasks. I will then compare our approach to other neuro-computational models which also target the organization of the multiple basal ganglia loops. Finally I will discuss how the model relates to reinforcement learning and how it may be extended.

Chaotic neural networks: FORCE learning and Lyapunov spectra

Janis Goldschmidt

Mon, 17. 6. 2019, Room 132

The FORCE learning algorithm by Sussillo and Abbott renewed interest in reservoir computing and in using dynamical systems for machine learning purposes. This talk will give a short introduction to the FORCE learning algorithm and to the chaotic neural nets it is applied to, and will show how to probe their dynamic structure in the hope of understanding their computational ability.

Detecting driver discomfort during highly automated driving from gaze and driving data using machine learning methods

Roghayyeh Assarzadeh

Wed, 12. 6. 2019, Room 132

With the breakthrough of driving automation, questions of human-machine interaction (HMI), such as comfort during automated driving, are receiving increasing attention. In this context, the research project KomfoPilot at Technische Universität Chemnitz aims to assess discomfort in an automated vehicle based on physiological, environmental, and vehicle parameters from various sensors. The driving simulator generates extensive amounts of data that are valuable for detecting discomfort. Finding an approach that enables reliable detection of discomfort caused by a highly automated vehicle is very complex. Advances in artificial intelligence (AI) lead to fast solutions for highly complex problems. AI has surpassed humans in computing complex tasks such as detecting signs of driver fatigue in the vehicle or detecting discomfort in the automated vehicle. Neural networks offer the possibility to meet these challenges and to detect discomfort using driving data from the simulator. In addition, the specific scenarios for detecting discomfort in the simulator are taken into account. This project has two major challenges: first, finding the best features for detecting discomfort with neural networks; second, finding the best model that can detect discomfort using the identified features. The results of this project were evaluated in two phases. In the first phase, MLP and LSTM models for diagnosing discomfort were presented and their results compared. In the second phase, a new architecture called Cascade, built on the LSTM model, is presented. To demonstrate the efficiency of the new architecture, two submodels are considered: the first for diagnosing comfort and the second for diagnosing discomfort. Finally, the results show that the cascade architecture significantly improves the detection of discomfort compared to the previous phase.

Exploring biologically-inspired mechanisms of humanoid robots to generate complex behaviors dealing with perturbations during walking

Hoa Tran Duy

Mon, 3. 6. 2019, Room 131

Humanoid robots are increasingly expected to operate fluently in interactive, joint human-robot environments to assist people in need, e.g., in home assistance, where robots will have to deal with even rougher terrain and more complicated manipulation tasks under significant uncertainty, in both static and dynamically changing environments. Research on humanoid robots has focused mainly on reliable perception, planning, and control methods for completing challenging tasks. Studies also show that legged robots may inevitably fall over during a task in an unstructured environment, which can result in serious damage to the robot or its environment. The ideal robot in a real-world scenario should, like a human, be able to recover, stand up, and complete the task. In this project, I introduce a new approach to humanoid push recovery that gives robots the ability to recover from external pushes in real-world scenarios. The study involves detecting falls and selecting and performing appropriate actions to prevent damage to both the robot and its surroundings during locomotion. Advances in the study of animal and human movement are combined with research in robotics, artificial intelligence, and neural networks to develop a new paradigm for creating robust and flexible recovery behaviors. This project involves topics such as central pattern generators, biped walking, and unsupervised and reinforcement learning to develop a robot recovery approach inspired by biological systems.

Investigating robustness of End-to-End Learning for self-driving cars using Adversarial attacks

Vivek Bakul Maru

Mon, 27. 5. 2019, Room 131

Deep neural networks have recently achieved high accuracy in many important tasks, most notably image-based tasks. Current state-of-the-art image classifiers perform at human level. Deep learning has also been successfully applied to end-to-end learning architectures for autonomous driving, where a deep neural network maps the pixel values coming from a camera to a specific steering angle of the car. However, these networks are not robust to carefully perturbed image inputs known as adversarial inputs. Such inputs can severely decrease the accuracy and performance of the network, endangering the systems in which these deep learning models are deployed. This thesis focuses on demonstrating the vulnerability of end-to-end architectures for self-driving cars to adversarial attacks. We formalize the space of adversaries against end-to-end learning and introduce a way to generate universal adversarial perturbations for these architectures. We generated universal perturbations using distinct subsets of the dataset and tested them against two different models, i.e., classification and regression, in a white-box setting. To show the transferability and universality of these perturbations, we tested them against a third, constrained model in a black-box setting. Our results show that end-to-end architectures are highly vulnerable to perturbations and that these perturbations can pose a serious threat to the system when deployed in a self-driving car. We also propose a theoretical prevention method for end-to-end architectures against universal perturbations to increase the security of the model.

Successor representations

Julien Vitay

Mon, 20. 5. 2019, Room 131

Successor representations (SR) are a trade-off between model-free and model-based methods in reinforcement learning. They have seen a revival over the past couple of years through the work of Samuel Gershman and colleagues, who propose a neurobiological mapping of this algorithm to the prefrontal cortex, hippocampus and basal ganglia. This seminar will explain the principle of SR, present their main neuroscientific predictions and discuss their plausibility. See https://julien-vitay.net/2019/05/successor-representations
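
As a rough illustration of the principle, the sketch below computes the SR for a tiny, fully known Markov chain and uses it to read off state values. It is not taken from the seminar: the three-state ring, the transition matrix `P`, the discount `gamma`, and the reward vector are hypothetical, and in practice the SR is usually learned from experience rather than computed from a known `P`. The SR matrix `M` holds the expected discounted number of future visits to each state, so values follow as a simple product of `M` with the rewards.

```python
def successor_representation(P, gamma, iterations=200):
    """Compute M = I + gamma * P * M by fixed-point iteration (illustrative sketch).

    M[i][j] is the expected discounted number of visits to state j when
    starting in state i and following the policy that induces P.
    """
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(iterations):
        M = [[identity[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
              for j in range(n)] for i in range(n)]
    return M


def state_values(M, rewards):
    """Values are linear in the SR: V(s) = sum over s' of M[s][s'] * r(s')."""
    return [sum(m * r for m, r in zip(row, rewards)) for row in M]


# Hypothetical example: a deterministic ring 0 -> 1 -> 2 -> 0.
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
M = successor_representation(P, gamma=0.5)
V = state_values(M, rewards=[0.0, 0.0, 1.0])  # reward only in state 2
```

If the reward vector changes, only `state_values` has to be re-evaluated while `M` stays the same, which is one way to see why SR sits between model-free and model-based methods.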

Development and analysis of active SLAM algorithms for mobile robots

Harikrishnan Vijayakumar

Mon, 13. 5. 2019, Room 375

Mobile robots are becoming an essential part of industrial and household services. In this thesis, we tackle the problem of autonomous exploration using active SLAM concepts. Autonomous exploration is the task of creating a model of the environment by the robot without human intervention. The environment model creation capabilities of the robot are highly dependent on sensor and actuator performance. We require an algorithmic approach to handle the noise from the hardware measurements. We need a combined solution for localization, mapping, and motion planning to improve autonomous exploration performance. In its classical formulation, SLAM is passive, which means that the algorithm creates the map and localizes the robot passively by using received motion and sensor observations. It takes no part in deciding where to take measurements next. The quality of the map created by SLAM depends on the correctness of the robot's position estimate while mapping. To further improve the quality of the map, it is necessary to integrate path planning with SLAM so that the algorithm can decide the future motion of the robot based on the current localization and map information, achieving better localization and a higher-quality map. This thesis addresses this problem. Many approaches do not address the uncertainty in motion and sensor measurements. We propose a probabilistic approach that considers the current system uncertainty and plans a path that minimizes the robot's pose and map uncertainty. This thesis first examines the background and other approaches addressing active SLAM. We implement and evaluate state-of-the-art approaches on our own standardized datasets. We propose and implement a new approach that builds on Schirmer's work on path planning in belief space. Path planning in configuration space assumes accurate knowledge of the robot position, and the goal is to find a path from the position of the robot to the destination. In reality, uncertainties in movement and sensing sometimes make it impossible to determine the accurate position of the robot. Planning in belief space maintains a probability distribution over possible states of the robot and takes these uncertainties into account. Path planning in belief space entails planning a path for the robot given the knowledge about the environment and the current belief. We compute paths in belief space by computing two representations of the environment in parallel. The first is a classical geometric map of the environment; the other encodes how well the robot can localize in different parts of the environment. We then apply a search algorithm to the discrete configuration space to find the path to the target goal location with the lowest uncertainty. Finally, we define a utility function for each target goal based on its mean uncertainty and navigation cost to evaluate the best candidate for the next motion. At the end of the work, we discuss the results, strengths, weaknesses, and assumptions in detail and identify areas for future research.
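
The final goal-selection step can be pictured with a small sketch. The abstract only states that the utility depends on a goal's mean uncertainty and its navigation cost; the linear form, the weights `w_uncertainty` and `w_cost`, and the candidate goals below are hypothetical choices for illustration, not the thesis's actual function.

```python
def goal_utility(mean_uncertainty, navigation_cost, w_uncertainty=1.0, w_cost=0.1):
    """Score a candidate goal; higher is better (assumed linear trade-off)."""
    # Both terms are penalties, so the utility is their negated weighted sum.
    return -(w_uncertainty * mean_uncertainty + w_cost * navigation_cost)


def select_next_goal(candidates):
    """Pick the goal with the highest utility.

    `candidates` maps a goal name to (mean_uncertainty, navigation_cost).
    """
    return max(candidates, key=lambda goal: goal_utility(*candidates[goal]))


# Hypothetical candidates: (mean path uncertainty, navigation cost).
candidates = {
    "corridor": (0.80, 2.0),   # cheap to reach, but poorly localizable
    "doorway": (0.20, 5.0),    # farther away, good localization
    "far_room": (0.15, 12.0),  # best localization, expensive to reach
}
next_goal = select_next_goal(candidates)
```

With these made-up numbers, the doorway wins because it balances low uncertainty against moderate travel cost; changing the weights shifts the balance between exploration safety and path length.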

Optimizing topic models on small corpora - A qualitative comparison between LDA and LDA2Vec

Anton Schädlich

Thu, 9. 5. 2019, Room 367

Topic modeling has become a subfield of unsupervised learning since the invention of the Latent Dirichlet Allocation (LDA) algorithm. However, on smaller corpora it sometimes does not perform well in generating sufficiently coherent topics. In order to boost coherence scores, it has been extended in various studies with vector space word embeddings. The LDA2Vec algorithm is one of these symbiotic algorithms that draws context out of the word vectors and the training corpus. This presentation is about the qualitative comparison of the topics and models of optimized LDA and the LDA2Vec algorithm trained on a small corpus of 1800 German-language documents with a comparatively small number of topics. The coherences were measured both with the 'combined coherence framework' and by manual observation.

Multi-GPU simulation of spiking neural networks on the neural simulator ANNarchy

Joseph Gussev

Mon, 6. 5. 2019, Room 367

Because of their highly parallel architecture and availability, GPUs have become widespread in neural simulators. With the goal of outperforming CPU-based clusters with cheaper GPUs, some simulators have started to add multi-GPU support and have shown speedups compared to single-GPU implementations. The thesis behind this presentation aimed to add multi-GPU implementations to ANNarchy's spiking neural network simulation. After a brief background overview, the ideas behind the implemented algorithms will be presented, and their performance and scalability will be evaluated and discussed.

A model car localization using Adaptive Monte Carlo Localization and Extended Kalman Filter

Prafull Mohite

Tue, 23. 4. 2019, Room 336

Localization commonly relies on GPS, but this may not be accurate enough, especially for autonomous driving. In autonomous driving, localization is a key step for deciding what the next step should be. Because it is key to predicting the next step, it should be made more accurate and reliable, which can be done using exteroceptive sensors. Lidar, whose range is sufficiently high with a minimal error rate, is used for depth perception. In this thesis, a very popular technique, the Simultaneous Localization and Mapping (SLAM) approach, is selected. A map is needed for localization, and localization cannot be completed without a map, giving rise to a chicken-and-egg problem. Mapping the environment before localization is common practice, and we have followed it. The aim of the thesis is to test the performance of the proposed system against the state of the art. The proposed system consists of a hybrid localization with three-fold pose correction using a non-Gaussian filter and nonparametric filters. As a result, the location in a Cartesian coordinate system and the orientation are obtained.

Investigation of Model-Based Augmentation of Model-Free Reinforcement Learning Algorithms

Oliver Lange

Mon, 15. 4. 2019, Room 131

Reinforcement learning has been successfully applied to a range of challenging problems and has recently been extended to handle large neural network policies and value functions. However, the sample complexity of model-free algorithms tends to limit their applicability to physical systems. Reducing the sample complexity of deep reinforcement learning algorithms is essential to make them applicable in complex environments. This presentation investigates the performance of a reinforcement learning agent that combines the advantages of both model-free and model-based reinforcement learning algorithms.

Interpreting deep neural network-based models for automotive diagnostics

Ria Armitha

Wed, 10. 4. 2019, Room 336

With the breakthrough of artificial intelligence over the last few decades and extensive improvements in deep learning methodologies, the field of deep learning has gone through major changes. AI has outdone humans in computing complex tasks like object and image recognition, fault detection in vehicles, speech recognition, medical diagnosis, etc. From a bird's-eye view, the models are basically algorithms that try to learn concealed patterns and relationships from the data fed into them without any fixed rules or instructions. Although these models' prediction accuracies may be impressive, the system as a whole is a black box (non-transparent). Hence, explaining the working of a model to the real world poses its own set of challenges. This work deals with interpreting a vehicle fault-detection model. Current fault detection approaches rely on model-based or rule-based systems. With the increasing complexity of vehicles and their subsystems, these approaches will reach their limits in detecting fault root causes in highly connected and complex systems. Furthermore, current vehicles produce rich amounts of data valuable for fault detection that cannot be considered by current approaches. Deep Neural Networks (DNN) offer great capabilities to tackle these challenges and automatically train fault detection models using in-vehicle data. However, fault detection models based on DNNs (here, CNNs and LSTMs) are black boxes, so it is nearly impossible to back-trace their outputs. Therefore, the aim of this work is to identify, implement and evaluate available approaches to interpret decisions made by DNNs applied in vehicle diagnostics. With that, decisions made by the DNN diagnostics model can be better understood to (i) comprehend the model's outputs and thus increase model performance as well as (ii) enhance their acceptability in the vehicle development domain.

Extrinsic Camera Pose Estimation Using Existing Parallel Lines of the Surrounding Area

Layth Hadeed

Mon, 8. 4. 2019, Room 131

Camera pose estimation is an essential process for computer vision systems used by intelligent automotive systems. This work is concerned with the pose estimation of a train camera that is installed behind the front windshield of the train. The pose is estimated by using only the rails and sleepers of the railway track, without any additional calibration objects. The information extracted from the rail lines in the image helped estimate the pitch and yaw by determining the vanishing point in the world X-axis direction. The orthogonality between the rails and sleepers helped estimate the roll from the orientation of the sleepers in the track. The translation parameters are calculated from the rotation matrix and the normal vectors of the projective planes between the camera center and the rails. The results show that a good approximation of the pose can be made by adding an additional step that compensates for the error in the estimated roll.

Weight estimation using sensory soft pneumatic gripper

Hardik Sagani

Mon, 25. 3. 2019, Room 132

Current soft pneumatic grippers can robustly grasp flat and flexible objects with curved surfaces without distorting them, but they cannot estimate object properties through manipulation. On the other hand, it is difficult to actively deform a complex gripper to pick up surface objects. Hence, this research project presents a prototype design of a sensorized soft pneumatic gripper used to estimate the weight of objects. An easy-to-implement soft gripper with four fingers made of soft silicone is proposed. It can be fabricated by sandwiching piezoresistive material between soft silicone rubber and conductive threads. Sixteen pressure sensors and four curvature sensors made of Velostat piezoresistive material have been placed in the hand. The sensor layers were designed using conductive thread as the electrode material. Hence, the sensory gripper can evaluate the weight of grasped objects.

Neuro-computational modeling of the basal ganglia for stop-signal tasks

Iliana Koulouri

Tue, 12. 3. 2019, Room 132

In 2017, Baladron, Nambu, and Hamker developed a biologically motivated neuro-computational model of the basal ganglia (BG) for action selection via the GPe-STN loop during an exploration phase after a change in environmental conditions. The basal ganglia also influence the suppression of actions. It is assumed that a race between go signals and stop signals corresponds to a race between the neural pathways of the BG. In this regard, Schmidt et al. (2013) and Mallet et al. (2016) postulated a two-stage cancellation model comprising a pause mechanism and a cancellation mechanism. The cancellation of an action is said to be realized by a newly discovered cell type, the arkypallidal cells of the globus pallidus. Not all details of their connectivity are known so far. The present work extends the BG model of Baladron et al. (2017) by implementing this postulated two-stage cancellation model, replicates the results of Mallet et al. (2016), and investigates the influence of the strength of the input signals and of their delay on the model. The extended model successfully replicates the results of Mallet et al. (2016). The cortex also appears plausible as the source of excitation of the arkypallidal cells. The model is only weakly affected by changes in the stop rates and in the temporal delay between go and stop signals. However, it proves sensitive to changes in the go rates.

Modelling and controlling of the soft pneumatic actuator for the Elbow assistance

Sidhdharthkumar Vaghani

Tue, 12. 3. 2019, Room 132

For more than a decade, wearable robotics has been a growing trend in rehabilitation and in the assistance of human motion. Wearable robots attract attention because they interact safely with the human and the external environment. Owing to their stiffness and wearability, they are popular for the rehabilitation and assistance of human body joints, e.g., knee rehabilitation or elbow and shoulder assistance. However, controlling soft robots is not an easy task compared to rigid robots. Wearable robots mainly use air pressure to actuate their body. The aim of this thesis is therefore to model and control an exosuit for elbow assistance. The exosuit consists of two pneumatic actuators, one for elbow flexion and the other for extension. One IMU is mounted on top of the exosuit to monitor the relative angle between its two links. The exosuit is modeled separately for the flexion and extension actuators by considering the geometry of the actuators. For precise control of the actuators, a proportional valve has been developed and is used together with a PID controller. The multi-layer multi-pattern central pattern generator (MLMP-CPG) is adapted to generate rhythmic and non-rhythmic behavior of the exosuit. To implement a better control strategy, reflexes are added to the CPG model. At the end of the thesis work, experiments are carried out to validate the PID and reflex-based controllers using the MLMP-CPG. The reflexes combined with the CPG generate adaptive behavior of the exosuit.
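
For readers unfamiliar with PID control, the following is a minimal textbook-style discrete PID sketch. It is not the controller from the thesis: the class name, the gains, the time step `dt`, and the joint-angle numbers are hypothetical, and the pneumatic valve and exosuit dynamics are not modeled at all.

```python
class PID:
    """Discrete PID controller (generic sketch, gains are illustrative)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0      # accumulated error over time
        self.prev_error = None   # no previous error before the first update

    def update(self, setpoint, measurement):
        """Return the control signal for the current setpoint and measurement."""
        error = setpoint - measurement
        self.integral += error * self.dt
        # The derivative term is zero on the first call, when no history exists.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical usage: track a target elbow angle of 1.0 rad.
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.1)
u1 = pid.update(setpoint=1.0, measurement=0.0)  # measured angle from the IMU
u2 = pid.update(setpoint=1.0, measurement=0.5)
```

In a setup like the one described, the measurement would come from the IMU angle and the output would drive the proportional valve; real controllers typically add output saturation and anti-windup, which are left out here.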

Decorrelation and sparsification by homeostatic inhibitory plasticity in cooperation with voltage-based triplet STDP for receptive field formation

René Larisch

Mon, 4. 3. 2019, Room 202

Several recent models have shown how V1 simple-cell receptive fields may emerge from spike-timing dependent plasticity (STDP) mechanisms. These approaches mostly depend on normative learning paradigms, such as maximizing information transmission or optimizing sparseness, or on simplified rate-based learning dynamics. These assumptions make it unclear how inhibition influences neuronal properties (for example orientation selectivity and contrast-invariant tuning curves) and network effects (for example input encoding). We here propose a model of V1 based on a phenomenological excitatory and an inhibitory plasticity rule. Our simulation results show the interplay of excitatory and inhibitory plasticity on the connectivity structure and its influence on input encoding. Further, we demonstrate how the right amount of excitatory and inhibitory input leads to the occurrence of contrast-invariant tuning curves.

Computing with hippocampal sequences

Prof. Dr. Christian Leibold

Fri, 15. 2. 2019, Room 336

Hippocampal place cells are activated in sequence on multiple time scales during active behavior as well as resting states. Reports of prewired hippocampal place cell sequences that were decoded as future behaviors are hard to reconcile with the suspected roles of the hippocampus in the formation of autobiographic episodic memories and spatial learning. Here, I propose a computational model that shows how a set of predefined internal sequences that are linked to only a small number of salient landmarks in a large random maze can be used to construct a spatial map of a previously unknown maze.

A network model of the function and dynamics of hippocampal place-cell sequences in goal-directed behavior

Lorenz Gönner

Fri, 15. 2. 2019, Room 336

Hippocampal place-cell sequences observed during awake immobility often represent previous experience, suggesting a role in memory processes. However, recent reports of goals being overrepresented in sequential activity suggest a role in short-term planning, although a detailed understanding of the origins of hippocampal sequential activity and of its functional role is still lacking. In particular, it is unknown which mechanism could support efficient planning by generating place-cell sequences biased toward known goal locations, in an adaptive and constructive fashion. To address these questions, I propose a spiking network model of spatial learning and sequence generation as interdependent processes. Simulations show that this model explains the generation of never-experienced sequence trajectories in familiar environments and highlights their utility in flexible route planning. In addition, I report the results of a detailed comparison between simulated spike trains and experimental data, at the level of network dynamics. These results demonstrate how sequential spatial representations are shaped by the interaction between local oscillatory dynamics and external inputs.

Adaption of a neuro-computational model of space perception to new physiological data and evaluation with experimental data from human and monkeys

Nikolai Stocks

Wed, 13. 2. 2019, Room 132

A. Ziesche and F.H. Hamker developed a computational model for the simulation of visual perception and perisaccadic mislocalisation [A. Ziesche et al. 2011]. A paper by Joiner et al. published in 2013 demonstrates that monkeys perform differently from humans in Saccadic Suppression of Displacement (SSD) trials under the stimulus blanking condition, which the model does not account for. Furthermore, data by Xu et al. (2012) show that the neural representation of current eye position updates later than the original model assumed. The goal of this thesis is to find adjustments to the parameters of the original model that allow it to accurately simulate SSD experiments for both monkeys and humans using Xu's revised timeframe.

Development of Collision Prevention Module using Reinforcement Learning

Arshad Pasha

Wed, 6. 2. 2019, Room 132

The tremendous advancements in the application of Artificial Intelligence to Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles (AVs) have still not been able to curb the accident rate. Many automotive manufacturers aim to achieve the 'Zero Fatalities' goal with advanced collision prevention modules by the next decade. This thesis work mainly focuses on the implementation of ADAS functionalities such as Autonomous Emergency Braking (AEB), Autonomous Emergency Steering (AES), and Adaptive Cruise Control (ACC) using Reinforcement Learning (RL). RL has attracted researchers' attention in recent years due to its immense potential. Q-learning is one of the most widely used RL algorithms. It has been employed not only to solve simple problems such as the grid world but also to solve complex problems such as playing advanced levels of video games like Doom, Breakout and various other Atari games. This thesis work also uses the Q-learning algorithm. The thesis was carried out at the Research and Development unit of ZF known as Zukunft Mobility GmbH, Kösching, Germany. Its principal objective is to use the tabular Q-learning approach to implement the complex ADAS functionalities mentioned above, to analyze the RL agent's behavior, and to integrate the developed algorithms into the organization's proprietary simulation environment (SimpleSpec). Further, the continuous state-action space has been discretized to fit the tabular Q-learning approach, which avoids the 'Curse of Dimensionality' and 'Matrix Explosion' problems. In the penultimate part, a qualitative and quantitative evaluation of the performance has been carried out. The results obtained after full-scale model testing for low-, medium- and high-speed scenarios have been recorded. In addition, random scenarios were generated to test the scope and capabilities of the trained RL agent.
Further research has been carried out on the impact of the reward function structure on the RL agent, covering sparse and non-sparse reward designs and their influence on reward convergence and the learning process. This segment also includes a comparison between the explicitly distributed non-sparse reward design and the non-sparse inline reward design. The final part summarizes the tasks covered and the goals achieved, and outlines a path for future work.
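
As an illustration of the tabular Q-learning with a discretized state space mentioned in the abstract above, here is a minimal sketch. It is not the implementation from the thesis: the function names, the bin layout, and the values of `alpha` and `gamma` are assumptions made for this example, and no part of the SimpleSpec environment is reproduced.

```python
import random


def discretize(value, low, high, n_bins):
    """Map a continuous value to a bin index in [0, n_bins - 1] (values are clipped to [low, high])."""
    clipped = min(max(value, low), high)
    idx = int((clipped - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)


def q_update(q, state, action, reward, next_state, n_actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update; q is a dict keyed by (state, action)."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(n_actions))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def epsilon_greedy(q, state, n_actions, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q.get((state, a), 0.0))


# Hypothetical usage: a distance in metres becomes one of 10 discrete states.
q_table = {}
state = discretize(55.0, 0.0, 100.0, 10)
action = epsilon_greedy(q_table, state, n_actions=2, epsilon=0.1, rng=random.Random(0))
```

Storing the table as a dictionary keeps only the state-action pairs that were actually visited, which is one simple way to keep the table small after discretization.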

Road scene semantic segmentation using residual factorized convnet and surround view fisheye cameras

Saiffuddin Syed

Wed, 30. 1. 2019, Room 367a

The automotive industry is continuously evolving, especially in the self-driving domain, which creates a demand for new concepts to be developed, implemented and tested. At present the only sensor capable of sensing the immediate surrounding of the vehicle is a camera. This thesis addresses the 360-degree road scene semantic segmentation problem for fisheye cameras. Present vehicles are equipped with distinct types of cameras used for various practical real-time applications, the most common camera model being the wide-angle fisheye camera, which is considered in this thesis. Using this camera brings two major challenges. Firstly, CNN-based semantic segmentation requires a huge amount of pixel-level annotated data, and so far there is no open-source annotated dataset available for wide-angle images. Secondly, a fisheye camera introduces severe distortions and negates the positional invariance offered by a conventional pinhole camera model. To overcome this, training the network on transformed images that are augmented using a fisheye filter is proposed. An approach to integrate blocks which improve the representational power of existing architectures by explicitly modelling interdependencies between channels of convolutional features has been tested. The experiments carried out show the effectiveness of these blocks when augmented data is used. Much of the work presented in the thesis was devoted to a rigorous comparison of the architectures. The evaluation is done on two different kinds of datasets, a real-world dataset and a synthetic dataset. The primary metric used for evaluation was the Intersection-over-Union (IoU). The results showed that a large amount of existing annotated data taken from pinhole cameras can be reused through augmentation, and that a relatively small amount of annotated data from fisheye cameras is required to account for domain shift.
Further, the new architectures presented in this thesis show promising results when applied to augmented data.
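
The Intersection-over-Union metric named in the abstract above can be sketched for small label maps as follows. This is a generic illustration under stated assumptions, not the evaluation code of the thesis: the function names are hypothetical, and label maps are assumed to be nested lists of integer class ids.

```python
def iou_per_class(pred, target, num_classes):
    """Return the IoU of each class for two equally sized label maps.

    A class that occurs in neither map gets None instead of a score.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counts towards both intersection and union of its class.
                inter[p] += 1
                union[p] += 1
            else:
                # Mismatch: the pixel enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    return [inter[c] / union[c] if union[c] else None for c in range(num_classes)]


def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over the classes that actually occur."""
    scores = [s for s in iou_per_class(pred, target, num_classes) if s is not None]
    return sum(scores) / len(scores)


prediction = [[0, 0], [1, 1]]
ground_truth = [[0, 1], [1, 1]]
print(iou_per_class(prediction, ground_truth, 2))  # class 0: 1/2, class 1: 2/3
```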

Categorizing facial emotion expressions with attention-driven convolutional neural networks

Valentin Forch

Mon, 21. 1. 2019, Room 219

The development of so-called deep machine learning techniques has brought new possibilities for the automatic processing of emotion-related information, which can have great benefits for human-computer interaction. Conversely, machine learning can profit from concepts known from human information processing (e.g., visual attention). Situated between human and artificial intelligence, the present thesis had a twofold aim: (a) to employ a classification algorithm for facial expressions of emotions in the form of a deep neural network incorporating a spatial attention mechanism on image data of facial emotion expressions and (b) to compare the output of the algorithm with results from human facial emotion recognition experiments. The results of this thesis show that such an algorithm can achieve state-of-the-art performance in a facial emotion recognition task. With regard to its visual search strategy, some similarities with human saccading behavior emerged when the model's perceptive capabilities were restricted. However, there was only limited evidence for emotion-specific search strategies as can be found in humans.

A network model of the function and dynamics of hippocampal place-cell sequences in goal-directed behavior

Lorenz Gönner

Mon, 21. 1. 2019, Room 219

Hippocampal place-cell sequences observed during awake immobility often represent previous experience, suggesting a role in memory processes. However, recent reports of goals being overrepresented in sequential activity suggest a role in short-term planning, although a detailed understanding of the origins of hippocampal sequential activity and of its functional role is still lacking. In particular, it is unknown which mechanism could support efficient planning by generating place-cell sequences biased toward known goal locations, in an adaptive and constructive fashion. To address these questions, I propose a spiking network model of spatial learning and sequence generation as interdependent processes. Simulations show that this model explains the generation of never-experienced sequence trajectories in familiar environments and highlights their utility in flexible route planning. In addition, I report the results of a detailed comparison between simulated spike trains and experimental data, at the level of network dynamics. These results demonstrate how sequential spatial representations are shaped by the interaction between local oscillatory dynamics and external inputs.

3D reconstruction with consumer depth cameras

Manh Ha Hoang

Wed, 9. 1. 2019, Room 132

In this thesis, we develop a system that generates a 3D model of a single household object using a consumer depth (RGB-D) camera. The system then grabs textures of the object from a high-resolution DSLR camera and applies them to the reconstructed 3D model. Our approach specifically addresses generating a highly accurate 3D shape and recovering a high-quality appearance of the object within a short time interval. The resulting high-quality textured 3D object models can be used for products in online shopping, augmented reality, and further research on 3D machine learning.

Hierarchical representations of actions in multiple basal ganglia loops

Javier Baladron

Wed, 5. 12. 2018, Room 132

I will introduce here three novel concepts, tested and evaluated by means of a neuro-computational model that brings together ideas regarding the hierarchical organization of the basal ganglia and particularly assigns a prominent role to plasticity. I will show how this model reproduces the results of two cognitive tasks used to measure the development of habitual behavior and introduce a model prediction.

Investigating reservoir-based reinforcement learning for robotic control

Oleksandr Nikulkov

Wed, 28. 11. 2018, Room 132

Reservoir Computing is a relatively novel approach for training recurrent neural networks. It is based on generating a random recurrent reservoir as a part of the network and training only the readout of the reservoir. This separation makes the setup easy to implement and opens different directions for further research. Existing methods for learning cognitive tasks often require continuous reward signals, which are not always available in cognitive tasks. However, this disadvantage can be avoided by using supralinear amplification on the trace of node-perturbation weight updates to suppress the relaxation effect, as proposed by Miconi (2017). In this presentation, I will show how such a network can be applied to a robotic control task and investigate the role of the different parameters.

Model Uncertainty estimation for a semantic segmentation network with a real time network deployment analysis on Nvidia Drive PX2 for Autonomous Vehicles

Abhishek Vivekanandan

Mon, 19. 11. 2018, Room 132

Autonomous vehicles require a high degree of perception capabilities in order to perceive the environment and predict objects therein at a high precision in real time. For such cases we use semantic segmentation networks. A major challenge in using semantic segmentation is determining how confident the network is in its prediction, or in other words how trustworthy the classification outcomes are. Integrating uncertainty estimates with semantic segmentation helps us to understand the confidence with which a network predicts its output. Bayesian approaches along with dropout provide the necessary tools in deep learning to extract the uncertainty involved in a model's prediction. In Bayesian Neural Networks, we place a distribution over the weights, giving us a probabilistic interpretation of the classification. For such networks, multiple Monte Carlo samples are needed to generate a reliable posterior distribution from which we can infer uncertainty statistics. The serial nature of this sampling approach restricts its use in a real-time environment. In this work, through in-depth analysis, we show the best places in a neural network to deploy dropout, along with the number of MC samples needed to obtain the best uncertainty estimates. We also exploit the parallel capabilities of GPUs to realize certain neural operations such as convolution and dropout directly on embedded hardware with minimal abstraction. As a result, we propose the necessary changes to the kernel functions needed to implement parallel Monte Carlo dropout sampling to estimate uncertainty in real time. Finally, we provide a brief benchmark comparison of the kernel implementations on a CPU (Intel Xeon processor) and on GPUs (DrivePX2 and Nvidia Geforce 1080Ti).
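
To make the idea of Monte Carlo dropout sampling from the abstract above concrete, the following is a minimal serial sketch for a single linear classifier. It only illustrates the sampling principle; the parallel GPU kernels of the thesis are not reproduced, and the tiny model, the function names, and the default parameters are assumptions made for this example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_with_dropout(x, weights, drop_prob, rng):
    """One stochastic forward pass: each input feature is dropped with probability drop_prob."""
    keep = 1.0 - drop_prob
    # Inverted dropout: kept features are rescaled by 1 / keep.
    masked = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    logits = [sum(w * xi for w, xi in zip(row, masked)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(x, weights, drop_prob=0.5, n_samples=50, seed=0):
    """Average n_samples stochastic passes; return mean class probabilities and predictive entropy."""
    rng = random.Random(seed)
    samples = [forward_with_dropout(x, weights, drop_prob, rng) for _ in range(n_samples)]
    n_classes = len(weights)
    mean = [sum(s[c] for s in samples) / n_samples for c in range(n_classes)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    return mean, entropy


# Hypothetical two-class model with two input features.
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
mean_probs, predictive_entropy = mc_dropout_predict([2.0, 0.0], toy_weights)
```

The predictive entropy of the averaged probabilities is one common uncertainty statistic; the loop over samples is the serial part that a parallel implementation would distribute.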

Disentangling representations of grouped observations in adversarial autoencoders

Felix Pfeiffer

Wed, 14. 11. 2018, Room 131

Classifying the shown emotion or facial action from mere pictures of faces is a challenging task in machine learning, since simple classification requires reliably labeled data, which is hard to get in sufficient quantity. Unsupervised learning methods can at least in part avoid the dependence on such data by finding representations that are meaningful. In my thesis I present an algorithm that teaches an Adversarial Autoencoder how to find representations of data. With clever administration of the training process it is possible to strip information from the representation that would not be beneficial for specific tasks like classification. This process is called disentangling, and the administrative strategy is to find groups of data. I will show the results of some experiments that verify that the algorithm does what it promises and elaborate on where its weaknesses may be, by training an Adversarial Autoencoder on a colorful MNIST dataset and letting it produce disentangled representations that separate style from content.

Interpreting deep neural network-based models for automotive diagnostics

Ria Armitha

Wed, 7. 11. 2018, Room 131

With the breakthrough of Artificial Intelligence over the last few decades and extensive improvements in Deep Learning methodologies, the field of Deep Learning has gone through major changes. AI has outdone humans in complex tasks like object and image recognition, fault detection in vehicles, speech recognition, medical diagnosis etc. From a bird's-eye view, the models are basically algorithms which try to learn concealed patterns and relationships from the data fed into them without any fixed rules or instructions. Although these models' prediction accuracies may be impressive, the system as a whole is a black box (non-transparent). Hence, explaining the working of a model to the real world poses its own set of challenges. This work deals with interpreting a vehicle fault-detection model. Current fault detection approaches rely on model-based or rule-based systems. With the increasing complexity of vehicles and their subsystems, these approaches will reach their limits in detecting fault root causes in highly connected and complex systems. Furthermore, current vehicles produce rich amounts of data valuable for fault detection which cannot be considered by current approaches. Deep Neural Networks (DNNs) offer great capabilities to tackle these challenges and automatically train fault detection models using in-vehicle data. However, fault detection models based on DNNs (here, CNNs and LSTMs) are black boxes, so it is nearly impossible to back-trace their outputs. Therefore, the aim of this work is to identify, implement and evaluate available approaches to interpret decisions made by DNNs applied in vehicle diagnostics. With that, decisions made by the DNN diagnostics model can be better understood in order to (i) comprehend the model's outputs and thus increase model performance and (ii) enhance their acceptability in the vehicle development domain.

Learning the Motor Program of a Central Pattern Generator for Humanoid Robot Drawing

Deepanshu Makkar

Thu, 1. 11. 2018, Room 132

In this research project, we present a framework in which a humanoid robot, NAO, acquires the parameters of a motor program in a task of drawing arcs in Cartesian space. A computational model based on a Central Pattern Generator is used. For the purpose of drawing a scene, geometrical features such as arcs are extracted from images using Computer Vision algorithms. We discuss the algorithm used in the project, which considers only the features important for robot drawing. These arcs can be described as a feature vector. We then discuss how genetic algorithms help in estimating the parameters of the motor representation for a selected feature vector. This understanding of the parameters is used further to generalize the acquired motor representation over the workspace. To obtain a general mapping between the feature vector and the motor program, we propose an approximation function using a multilayer perceptron (MLP). Once the network is trained, we present different scenarios to the robot and it draws the sketches. It is worth noting that our proposed model generalizes the motor features for a set of joint configurations, unlike the traditional way of robot drawing by connecting intermediate points using inverse kinematics.

Cortical routines - from experimental data to neuromorphic brain-like computation

Prof. Dr. Heiko Neumann (Ulm University, Inst. of Neural Information Processing)

Tue, 30. 10. 2018, Room 1/336

A fundamental task of sensory processing is to group feature items that form a perceptual unit, e.g., shapes or objects, and to segregate them from other objects and the background. In the talk a conceptual framework is provided, which explains how perceptual grouping at early as well as higher-level cognitive stages may be implemented in cortex. Different grouping mechanisms are implemented which are attuned to basic features and feature combinations and evaluated along the forward sweep of stimulus processing. More complex combinations of items require integration of contextual information along horizontal and feedback connections to bind neurons in distributed representations via top-down response enhancement. The modulatory influence generated by such flexible dynamic grouping and prediction mechanisms is time-consuming and is primarily sequentially organized. The coordinated action of feedforward, feedback, and lateral processing motivates the view that sensory information, such as visual and auditory features, is efficiently combined and evaluated within a multiscale cognitive blackboard architecture. This architecture provides a framework to explain form and motion detection and integration, higher-order processing of articulated motion, as well as scene segmentation and figure-ground segregation of spatio-temporal inputs which are labelled by enhanced neuronal responses. In addition to the activation dynamics in the model framework, steps are demonstrated how unsupervised learning mechanisms can be incorporated to automatically build early- and mid-level visual representations. Finally, it is demonstrated that the canonical circuit architecture can be mapped onto neuromorphic chip technology facilitating low-energy non-von Neumann computation.

Neural Reflexive Controller for Humanoid Robots Walking

Rishabh Khari

Thu, 25. 10. 2018, Room 131

For nearly three decades, a great amount of research emphasis has been placed on the study of robotic locomotion, where researchers, in particular, have focused on solving the problem of locomotion control for multi-legged humanoid robots. In particular, the task of imitating human walking has been the most challenging one, as bipedal humanoid robots are inherently unstable and tend to topple over. Recently, however, new machine learning algorithms have been applied to replicate sturdy, dexterous and energy-efficient human walking. Interestingly, many researchers have also proposed that locomotion principles, although they run on a centralized mechanism (central pattern generator) in conjunction with sensory feedback, can also run independently on a purely localized sensory-feedback mechanism. Therefore, this thesis aims at designing and evaluating two simple reflex-based neural controllers. The first controller generates a locomotion pattern for the humanoid robot by connecting the sensory feedback pathways of the ground and joint sensors to the motor neuron outputs of the leg joints. The second controller makes use of Hebb's learning rule by first deriving locomotion patterns from the MLMP-CPG controller while simultaneously observing the sensory feedback, and finally generating motor neuron outputs associatively. In the end, this thesis also proposes a fast switching principle in which the output to the motor neurons is swiftly transferred from the MLMP-CPG to the associative reflex controller after a certain interval. This is implemented to observe the adaptive behavior present in centralized locomotor systems.

Improving autoregressive deep generative models for natural speech synthesis

Ferin Thunduparambil Philipose

Wed, 24. 10. 2018, Room 132

Speech synthesis or Text-To-Speech (TTS) synthesis is a domain that has been of research interest for several decades. A workable TTS system essentially generates speech from textual input. The quality of the synthesized speech is gauged by how similar it sounds to the human voice and how easily it can be understood. A fully end-to-end neural Text-To-Speech system has been set up and improved upon with the help of the WaveNet and Tacotron deep generative models. The Tacotron network acts as a feature prediction network that outputs log-mel spectrograms, which are in turn used by WaveNet as local conditioning features. Audio quality was improved by the log-mel local conditioning and the fine-tuning of hyper-parameters such as mini-batch size and learning rate. Computational effort was reduced by compressing the WaveNet network architecture.

Fatigue detection using RNN and transfer learning

Azmi Ara

Wed, 24. 10. 2018, Room 132

Driving a car is a risky activity which requires full attention. Any distraction can lead to dangerous consequences, such as accidents. Many factors are involved while driving, such as fatigue, drowsiness and distractions. Drowsiness is a state between alertness and sleep. For this reason, it is important to detect drowsiness in advance, which will help in protecting people from accidents. The research guides us to an implicit and efficient approach to detecting the different levels of drowsiness. Every driver has different driving patterns, so the developed system should be able to adapt to changes in the driver's behavior. The aim of this thesis is to contribute to the study of detecting drivers' drowsiness levels while driving through different approaches which integrate two types of sensory data to improve detection performance.

Car localization in known environments

Prafull Mohite

Tue, 2. 10. 2018, Room 131

Localization in a broader sense is a very wide topic. At present, basic localization takes place with the help of a GPS sensor, which lacks the accuracy that is important for autonomous driving. To overcome this problem, different environmental sensors are used (typically sonar, Lidar and camera). The Lidar sensor, being very accurate in depth perception, is used here. In this thesis, a Simultaneous Localization And Mapping (SLAM) approach is selected. As the name suggests, localization and mapping form a chicken-and-egg problem; to solve it, we create a map of the environment before performing localization. Gmapping is selected for mapping, and Adaptive Monte Carlo Localization (AMCL) for localization within the map. AMCL is basically a particle filter. Given a map of the environment, the algorithm estimates the position and orientation of the car as it moves and senses the environment.
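
The particle-filter principle behind the localization step described above can be sketched in one dimension. This is a plain Monte Carlo localization toy, not AMCL itself: the adaptive sample size of AMCL is left out, and the single landmark, the noise levels, and all function names are assumptions made for this example.

```python
import math
import random


def gaussian_likelihood(observed, expected, sigma):
    """Unnormalized Gaussian likelihood of an observed range given the expected range."""
    return math.exp(-0.5 * ((observed - expected) / sigma) ** 2)


def mcl_step(particles, control, measured_range, landmark, motion_sigma, sensor_sigma, rng):
    """One predict / weight / resample cycle of a 1D particle filter."""
    # Predict: apply the odometry control with additive motion noise.
    moved = [p + control + rng.gauss(0.0, motion_sigma) for p in particles]
    # Weight: compare the measured range with the range each particle would expect.
    weights = [gaussian_likelihood(measured_range, landmark - p, sensor_sigma) for p in moved]
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)  # fall back to uniform weights if all likelihoods underflow
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def run_demo(n_particles=500, n_steps=20, seed=1):
    """Track a car that drives 1 unit per step towards a landmark at a known map position."""
    rng = random.Random(seed)
    landmark = 50.0
    true_pos = 0.0
    particles = [rng.uniform(0.0, 40.0) for _ in range(n_particles)]
    for _ in range(n_steps):
        true_pos += 1.0
        measured_range = landmark - true_pos  # noise-free range reading, for simplicity
        particles = mcl_step(particles, 1.0, measured_range, landmark, 0.2, 0.5, rng)
    estimate = sum(particles) / len(particles)
    return estimate, true_pos, len(particles)
```

Calling `run_demo()` returns the mean of the particle set as the position estimate together with the true position, so the two can be compared directly.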

Training approaches on semantic segmentation using transfer learning, dataset quality assessment and intelligent data augmentation

Mohamed Riyazudeen Puliadi Baghdad

Mon, 24. 9. 2018, Room 131

Data sparsity is one of the key problems that the automotive industry faces today. One way to overcome it is to use synthetic data generated by graphics engines or virtual world generators, which can be leveraged to train neural networks and accomplish tasks such as autonomous driving. The features learned from synthetic data yield better performance when combined with a suitable training approach and some real data. The number of images in the synthetic dataset and its similarity to the real-world dataset play a major role in transferring the learned features effectively across domains. This similarity in the distribution of the datasets was achieved through different approaches, the most effective one being the Joint Adaptation Network approach. In addition, smart data augmentation could boost the performance achieved. Intelligent data augmentation was achieved using conditional Generative Adversarial Networks and a color augmentation technique. With the findings of this research work, a possible solution for tackling the data sparsity problem was achieved.

Image anonymization using GANs

Thangapavithraa Balaji

Mon, 24. 9. 2018, Room 131

Millions of images are being collected every day for applications that enable scene understanding, decision making, resource allocation and policing to ease human life. Most of these applications do not require the identity of the people in the images. There is increasing concern that these systems invade the privacy of the users and the public. On the one hand, cameras and robots can assist a lot in everyday life, but on the other hand, the privacy of the user or the public should not be compromised. In this master thesis, a novel approach was implemented to anonymize faces in datasets, which enables privacy protection for the individuals in the datasets. The Generative Adversarial Network (GAN) approach was extended and the loss function was formulated in a combined fashion. The performance of conventional image anonymization techniques like blurring, cropping and pixelating was compared against GAN-generated images using autonomous driving applications like object detection and semantic segmentation.

Investigating Model-based Reinforcement Learning Algorithms for Continuous Robotic Control

Frank Witscher

Wed, 19. 9. 2018, Room 368

Although model-free deep reinforcement learning can solve an ever-growing range of tasks, the respective algorithms suffer from great inefficiency with regard to the amount of data they require. Model-based reinforcement learning, which learns a dynamics model of the environment, promises a remedy. Recent research combines model-free algorithms with model-based approaches in order to exploit the strengths of both branches of reinforcement learning. In my defense, I give an introduction to model-based reinforcement learning and an overview of the possible uses of dynamics models as found in recent publications. We focus on environments with continuous action spaces, as encountered in robotics. The Temporal Difference Model is one such hybrid of model-free learning and model-based control. It is presented and evaluated in detail.

Sensor simulation and Depth map prediction on Automotive Fisheye camera using automotive deep learning

Deepika Gangaiah Prema

Wed, 12. 9. 2018, Room 131

The aim is to create a synthetic 3D environment in the Unity framework that makes it possible to obtain a supervised dataset by simulating different sensors, such as lidar and fisheye cameras, in the simulation environment. This dataset will be used to develop, test and validate different machine learning algorithms for automotive use cases. The big advantage of the simulation environment is the possibility to generate data from different sensors which are still under development and whose final hardware is not yet available. Another advantage is the known ground truth of the simulation environment. This is much cheaper than equipping a vehicle with those sensors, recording lots of data and having humans label the ground truth manually. The 3D environment shall include urban and highway driving scenarios with balanced object categories like vehicles, pedestrians, trucks, terrain and street or free space to cover all levels of autonomous driving. The simulation of a fisheye camera as well as a next-generation lidar will be carried out in the thesis on the same Unity 3D framework, and the generated images and point cloud data are used to generate different datasets. The final goal is to use these for training different models and to test them in a real environment. Qualitative tests are carried out by benchmarking the datasets with the aid of different algorithms. The aim of this thesis is to study the different approaches with which CNNs could be used for depth estimation from a single fisheye camera image (180 degree FoV) for autonomous driving.

Humanoid robot learns walking by human demonstration

Juncheng Hu

Tue, 14. 8. 2018, Room 131

In this thesis, a method for making a humanoid robot walk is developed using Q-learning based on MLMP-CPG and wrist sensors. Machine learning has shown promise in many fields, including robotics, where supervised learning algorithms are the ones most often applied. However, supervised learning methods such as neural networks need a massive amount of training data, which is sometimes not available in real situations. Reinforcement learning requires less data, but it needs many attempts in its environment to arrive at a strategy. A humanoid robot cannot be allowed too many wrong attempts, because a fall may damage its joints. This thesis proposes a method in which the robot learns to walk with the help of a human, which avoids accidental falls.
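
As a point of reference for the Q-learning mentioned in the abstract above, the following is a minimal tabular sketch. It is not the MLMP-CPG-based controller of the thesis: the number of states and actions, the learning rate `alpha`, the discount factor `gamma` and the toy reward are all illustrative assumptions.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the TD target."""
    # TD target: immediate reward plus discounted value of the best next action.
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q

# Hypothetical toy problem: 3 states, 2 actions, all values start at zero.
q_table = [[0.0, 0.0] for _ in range(3)]
q_update(q_table, state=0, action=1, reward=1.0, next_state=2)
```

Each update only needs one observed transition, which is why the number of (possibly risky) trials, rather than the size of a labeled dataset, becomes the limiting factor on a real robot.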

Digital Twin Based Robot Control via IoT Cloud

Tauseef Al-Noor

Tue, 14. 8. 2018, Room 131

Digital Twin (DT) technology is a recent key technology for Industry 4.0-based monitoring and control of industrial manufacturing and production. A lot of research and development is happening on DT-based robot control. Monitoring and controlling a robot from a remote location is a complex process. In this research work, I have developed a prototype for controlling a robot using DT and cloud computing. Different technologies and techniques related to Digital Twin have been researched and analyzed to prepare an optimal solution based on this prototype. In this work, the latency of different types of machine-to-machine (M2M) communication protocols is observed. Different network protocols such as AMQP, MQTT, and HTTP show a lot of latency variation in end-to-end data transfer. Furthermore, different external factors have an impact on persistent communication. For example, the data processing and throughput of a cloud computing service such as Azure are not constant-time. A robot control mechanism expects a minimal, constant response time for quality of service. In this research, the main focus was to minimize communication latencies for a remote robot controller in cloud-based communication. Finally, an average quality of service in the range of 2-5 seconds for persistent robot communication has been achieved, depending on the setup.

Vision-based Mobile Robotics Obstacle Avoidance with Deep Reinforcement Learning

Zishan Ahmed

Wed, 8. 8. 2018, Room 131

Obstacle avoidance is a fundamental and challenging problem for autonomous navigation of mobile robots. In this thesis, the problem of obstacle avoidance in simple 3D environments where the robot has to rely solely on a single monocular camera is considered. Inspired by the recent advances of deep reinforcement learning (DRL) in Atari games and in understanding highly complex situations in Go, the obstacle avoidance problem is tackled in this thesis with a data-driven end-to-end deep learning approach. An approach which takes raw images as input and generates control commands as output is presented. Discrete and continuous control commands are compared. Furthermore, a method to predict depth images from monocular RGB images using conditional Generative Adversarial Networks (cGAN) is presented, and the increase in learning performance achieved by additionally fusing predicted depth images with monocular images is demonstrated.

Deep Convolutional Generative Adversarial Networks (DCGAN)

Indira Tekkali

Tue, 24. 7. 2018, Room 132

Generative Adversarial Networks (GAN) have made great progress in recent years. Most of the established recognition methods are supervised and depend strongly on image labels. However, obtaining a large number of image labels is expensive and time-consuming. In this project, we investigate DCGAN, an unsupervised representation learning method. We base our work on the previous paper by Radford et al. and aim to replicate their results. When training our model on different datasets such as MNIST, CIFAR-10 and a vehicle dataset, we are able to replicate some results, e.g. smooth transitions.

Using Transfer Learning for Improving Navigation Capabilities of Common Cleaning Robot

Hardik Rathod

Tue, 10. 7. 2018, Room 131

Many robotic vacuum cleaners fail during the cleaning task because they get stuck under furniture, in cords or on other objects on the floor. Once such a situation occurs, the robot is hardly able to free itself. One possible cause of this behavior is insufficient information about the environment the robot enters. In unstructured environments, recognition of objects has proven to be highly challenging. By analyzing the environment before the cleaning operation starts, the robot becomes aware of the objects around it, especially those that might be harmful to navigation. Methods from machine learning have been investigated and tested, as they give impressive results in object detection tasks. Taking adequate actions according to the objects in the environment helps to avoid or reduce the chance of the robot getting stuck under objects, and eventually it reduces the effort required of customers. The insight from this analysis has been incorporated into the locomotion behavior of a dummy robot.

Vergence control on humanoid robots

Torsten Follak

Mon, 9. 7. 2018, Room 131

For orientation in 3D space, a good depth estimate is needed. This estimate is obtained through effective stereoscopic vision, in which the disparity between the images of both eyes is used to derive the 3D structure. It is therefore important that both eyes fixate on the same point. This fixation is managed by vergence control. Different approaches exist to implement and use vergence control in robotics. In this talk, three of them are shown. A short overview of the first two is given, while the third one is presented in detail.
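
The link between vergence and depth can be illustrated with elementary geometry. The sketch below is not one of the three approaches from the talk; it assumes symmetric fixation on a point straight ahead of the midpoint between the two cameras, and the baseline value in the usage lines is hypothetical. Under that assumption, half the baseline, the fixation distance and half the vergence angle form a right triangle.

```python
import math

def fixation_distance(baseline_m, vergence_rad):
    """Distance to a fixation point straight ahead, given the vergence angle."""
    if vergence_rad <= 0:
        raise ValueError("vergence angle must be positive (parallel axes never meet)")
    # tan(vergence / 2) = (baseline / 2) / distance
    return baseline_m / (2.0 * math.tan(vergence_rad / 2.0))

def vergence_angle(baseline_m, distance_m):
    """Inverse relation: vergence angle needed to fixate at a given distance."""
    return 2.0 * math.atan(baseline_m / (2.0 * distance_m))

# Hypothetical robot head with a 10 cm baseline fixating a point 50 cm away.
angle = vergence_angle(0.10, 0.50)
distance = fixation_distance(0.10, angle)
```

The closer the fixation point, the larger the required vergence angle, so a vergence controller that keeps both cameras on the same point implicitly carries a depth estimate.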

Docker for machine learning

Alexander J. Knipping and Sebastian Biermann

Tue, 3. 7. 2018, Room 131

Handling software dependencies for research and/or production environments often comes with a certain amount of complexity. Libraries like TensorFlow or PyTorch don't always behave in the same way across several major version releases, especially in combination with various other third-party libraries, different Python versions and CUDA toolkits. Several solutions such as anaconda, virtualenv or pyenv have emerged from the Python community, but managing those with regard to reproducibility and portability often feels clumsy and leads to unexpected errors, especially for system administrators. In this presentation we will evaluate whether Docker containers can be a more efficient way to encapsulate project code with its dependencies, to build once and ship anywhere. For demonstration we have used Docker to train a machine learning model able to recognize 194 birds by their calls, through a variation of an existing VGG-based model trained on Google's Audioset, using their feature extractor for our own classes. Training was then performed on over 80 000 audio files of ten to twenty seconds length on nucleus. We will demonstrate how we have used Docker in our workflow, from developing the model and training it on the nucleus node to deploying the model into a production environment for users to query it. The aim of our project is to provide both users and system administrators with an overview of how Docker works, what its benefits and costs are and whether it is a viable option in typical machine learning workflows and environments.

Humanoid robots learn to recover perturbation during swing motion in frontal plane: mapping pushing force readings into appropriate behaviors

Ibrahim Amer

Tue, 19. 6. 2018, Room 131

This thesis presents a learning method to tune recovery actions for a humanoid robot during swinging movement based on a central pattern generator. A continuous state space of the robot is learned through a self-organizing map. A disturbance detection technique is proposed based on robot states and sub-states. A predefined space of recovery actions is used in this thesis; these actions are composed of non-rhythmic patterns. A hill-climbing algorithm and a neural network have been used to tune the parameters of the non-rhythmic patterns to obtain optimal values. A NAO humanoid robot was able to recover from disturbances with an adaptive reaction based on the disturbance amplitude. All experiments were done in Webots simulation.
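
To make the hill-climbing step more concrete, here is a minimal sketch of a stochastic hill climber over a small parameter vector. It is a generic textbook variant, not the tuning procedure of the thesis: `toy_cost` is a hypothetical stand-in for a measure of recovery quality, and the step size, iteration count and seed are arbitrary assumptions.

```python
import random

def hill_climb(cost, start, step=0.1, iterations=200, seed=0):
    """Keep a random perturbation of the parameters only if it lowers the cost."""
    rng = random.Random(seed)
    best = list(start)
    best_cost = cost(best)
    for _ in range(iterations):
        candidate = [p + rng.uniform(-step, step) for p in best]
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost

# Hypothetical cost: squared distance from an arbitrary "ideal" parameter pair.
def toy_cost(params):
    return (params[0] - 1.0) ** 2 + (params[1] + 0.5) ** 2

tuned, tuned_cost = hill_climb(toy_cost, start=[0.0, 0.0])
```

Because a candidate is accepted only when it improves the cost, the result can never be worse than the starting parameters, although it may settle in a local optimum.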

Humanoid robot grasping in 3D space by learning an inverse model of a central pattern generator

Yuxiang Pan

Tue, 19. 6. 2018, Room 131

Grasping is one of the most important functions of humanoid robots. However, an inverse kinematics model for the robot arm is required to reach an object in the workspace. This model can be described mathematically using the exact robot parameters, or it can be learned without prior knowledge of these parameters. The latter has the advantage that the learning algorithm can be generalized to other robots. In this thesis, we propose a method to learn the inverse kinematics model of the NAO humanoid robot using a multilayer perceptron (MLP) neural network. Robot actions are generated through the multi-layered multi-pattern central pattern generator (CPG) model. The camera captures the information about the object provided by the ArUco markers, the MLP model provides the desired arm configuration to reach the object, and then the CPG parameters are calculated to move the arm from its current position to the goal position. The proposed model has been tested in simulation and on the real robot, where a soft sensory robotic gripper was used to interact with a human subject (tactile servoing). Grasping was done using both the learned inverse model and the sensory feedback.

Scene Understanding on a Humanoid Robotic Platform Using Recurrent Neural Networks

Saransh Vora

Wed, 13. 6. 2018, Room 131

Near-perfect levels of performance have been reached for object recognition using convolutional neural networks. The ability to describe the content and organization of a complex visual scene is called scene understanding. In this thesis, the deterministic attention model has been used with backpropagation, with two different pre-trained CNN encoder models along with an RNN as a decoder to generate captions. The trained attention model is then used on a humanoid robot to describe the scene. This represents a first step towards robotic scene understanding. The robot can not only associate words with images, but it can also point at the locations of the features that are attended to and locate them in space.

Transferring deep Reinforcement Learning policies from simulations to real-world trajectory planning

Vinayakumar Murganoor

Tue, 5. 6. 2018, Room 131

Machine learning has progressed a lot recently, but most applications and demonstrations are done in simulated environments, especially for continuous control tasks. For continuous control tasks, reinforcement learning algorithms have been shown to produce good policies. In this project, the problem of trajectory planning is solved using reinforcement learning algorithms: an agent trained in simulation moves an RC car in the real world from any given point A to point B, with no training in the real world itself. The project also identifies the minimum set of parameters that influence the agent's behavior in the real world and lists the problems and solutions found during the transfer of the policy from simulation to the real world.

Investigating dynamics of Generative Adversarial Networks (GANs)

Vivek Bakul Maru

Tue, 29. 5. 2018, Room 131

Generative Adversarial Networks (GANs) are a very recent and promising approach in generative modeling. GANs solve problems by unsupervised learning using deep neural networks. They work on an adversarial principle, in which two different neural networks compete with each other to get better. This research project aims to understand the underlying mechanism of GANs. GANs have proved to have an edge over existing generative models such as variational autoencoders and autoregressive models, but they are known to suffer from instability during training. The implementation work in this project focuses on investigating the issues regarding the training of GANs and the convergence properties of the model. Apart from the vanilla GAN, this project also covers an extension of the regular GAN using convolutional neural networks, called Deep Convolutional GAN, and one of the very recently proposed approaches, called Wasserstein GAN. Analyzing the course of the loss functions gives insight into the convergence properties. Training these models on multiple datasets made it possible to compare and observe the learning of both networks in a GAN.

Design and Fabrication of Complex Sensory Structure of Curving and Pressure Sensors for Soft Robotic Hand

Vishal Ghadiya

Wed, 23. 5. 2018, Room 131

This research project presents the prototype design of a complex sensory structure for a soft hand. It can be easily adapted to soft materials like silicone. A superposition of four piezoresistive pressure sensors and one curving sensor was arranged on the inner face of each finger. This research focuses on the design of flexible pressure and curving sensors, in particular on the response of the force-sensitive-resistor-based pressure sensor. Thanks to the multi-layered design of the sensors, the structure was able to measure the curvature of the finger and the amount of tactile pressure applied to the hand by the grasped object. Sixteen pressure sensors and four curving sensors with Velostat as the piezoresistive layer were designed with a variety of electrode materials, i.e. conductive thread with and without conductive fabric. The multilayer structure of pressure and curving sensors can be placed on the inner face of the soft hand and makes it easy to evaluate properties of the object such as its size and stiffness.

Investigating the Role of Attention in Recurrent Models of Deep Reinforcement Learning

Danny Hofmann

Wed, 9. 5. 2018, Room 1/131

Many problems require learning a strategy directly from the output data of the environment, such as pixel images of a simulation or camera images of a robot arm. The theory of RAM and glimpse attention (Mnih, Heess and Graves 2014) is discussed; in addition, the Asynchronous Attentional Advantage Actor-Critic (A4C) algorithm implemented by Jung 2017 was made compatible with continuous action spaces. The resulting input and output processing was tested with a simple simulated robot arm. It is discussed how different configurations of the simulation show which information matters for the attention mechanism and how the agent manages to find a solution. The results allow a comparison with other deep reinforcement learning approaches. Among others, the results are compared with the DDPG variants developed by Lötzsch 2017. A4C did not find a solution faster than the DDPG variants, but A4C makes it possible to learn end-to-end. Accordingly, A4C learned directly from image data, whereas the implementations of Lötzsch 2017 required an abstraction of the environment.

A Framework for Adaptive Sensory-motor Coordination of a Humanoid-Robot

Karim Ahmed

Wed, 2. 5. 2018, Room 1/131

This project addresses sensory-motor coordination on humanoid robots and the ability to generate coordination for new motor skills. The goal is to modulate the sensory-motor connections so that they independently generate the same motor coordination that is demonstrated using the Central Pattern Generator (CPG). We propose two neural networks: one network extracts the coordination features of a robot limb moved by the CPG, and the other creates a similar coordinated movement on the other limb, which is moved by simple sensory-motor connections (without the CPG). Thanks to the proposed model, different coordination behaviors were presented at the sensory-motor level. A coordinated rhythmic movement generated by the CPG in the first stage can be reproduced by a simple sensory-motor network alone (a reflexive controller) in the next stage, where no CPG network is involved.

Knowledge Extraction from Heterogeneous Measurement Data for Datasets Retrieval

Gharbi Ala Eddine

Mon, 30. 4. 2018, Room 1/367

Due to the exponential increase in the amount of data produced on a daily basis, innovative solutions are being developed to tackle the problem of properly managing and organizing it with recent technologies such as Artificial Intelligence (AI) frameworks and Big Data Management (BDM) tools. The need for such solutions arises from the fact that our everyday interactions are moving towards pure digitalization. Therefore, leading information technology companies strive to come up with ideas to handle this bulky amount of data in the most efficient and concise way. The challenge with this huge amount of data is not only to organize it properly but also to make use of it. That is, deriving knowledge from unstructured data is as important as structuring it in an effective way. Throughout this thesis, knowledge derivation techniques are applied to the data available at IAV GmbH. The data consists of data files used to test the implemented software components, hence the importance of its proper organization. This Master thesis investigates and develops a prototypical solution for the organization of datasets as well as a concept implementation for extracting additional information from data files. Different problems and solutions related to Knowledge Discovery (KD) and data organization are presented and discussed in detail. Furthermore, an overview of the frameworks and algorithms used in Data Mining (DM) is given as well.

Gait Transition between Simple and Complex Locomotion in Humanoid Robots

Sidhdharthkumar Vaghani

Mon, 23. 4. 2018, Room 1/367

This project presents the gait transition between rhythmic and non-rhythmic behavior during walking of the humanoid robot NAO. In biological studies, two kinds of locomotion behavior were observed in cats walking on flat terrain and walking on a ladder (simple and complex locomotion). In this work, both locomotion behaviors were produced on the robot using the multi-layered multi-pattern central pattern generator model (Nassour et al.). We generate the rhythmic behavior from the non-rhythmic one based on the frequency of interaction between the robot's feet and the ground surface during complex locomotion. While complex locomotion requires a sequence of descending control signals to drive each robot step, simple locomotion requires only a triggering signal to generate the periodic movement. The overall system behavior fits the biological findings on cat locomotion (Marlinski et al.).

On the role of cortex-basal ganglia interactions for category learning: A neuro-computational approach

Francesc Villagrasa Escudero

Mon, 16. 4. 2018, Room 1/367a

Both the basal ganglia (BG) and the prefrontal cortex (PFC) are involved in category learning. However, their interaction in category learning remains unclear. A recent theory proposes that the basal ganglia, via the thalamus, slowly teach the PFC to acquire category representations. To further study this theory, we propose here a novel neuro-computational model of category learning which performs a category learning task (carried out by monkeys in the past to also study this theory). By reproducing key physiological data of this experiment, our simulations provide evidence that further supports the hypotheses held by the theory. Interestingly, we show that the fast learner (BG), with a performance of 80%, correctly teaches a slow learner (PFC) up to 100% performance. Furthermore, new hypotheses obtained from our model have been confirmed by analyzing previous experimental data.

Improving an Attention Model with Deep Learning and Data Augmentation

Philip Hannemann

Thu, 22. 3. 2018, Room 1/309

The attention model developed by Frederik Beuth and Fred H. Hamker was tested using the Coil dataset, and improvements were implemented. By adapting the CNN integrated into the attention model and optimizing the structure of the layers used, it became possible to raise the previous recognition rate on a realistic background from 42% to 60%. Besides these CNN modifications, a further focus of the work was the simplification of the database used and repeated test and training runs using deep learning methodology. For this purpose, the background contained in the Coil images was removed on a trial basis, and the Coil objects were used with different backgrounds for the learning process.

Investigating the Effects of Feedback in a Model for Digit Recognition

Miriam Müller

Wed, 14. 3. 2018, Room 1/368

The majority of information flow in the brain takes place via recurrent connections. To investigate the effects of feedback connections in an artificial neural network, such connections were added to a model for digit recognition in a two-layer and a three-layer variant. The third layer is category-specific. In addition, the unsupervised learning algorithm was extended by an external attentional influence. These changes are compared with the original two-layer feedforward model. Furthermore, this talk presents the effects of the feedback connections on a two-layer and a three-layer model.

Continuous Deep Q-Learning with Model-based Acceleration

Oliver Lange

Tue, 20. 2. 2018, Room 1/273

Model-free reinforcement learning has been successfully applied to a range of challenging problems and has recently been extended to handle large neural network policies and value functions. However, the sample complexity of model-free algorithms tends to limit their applicability to physical systems. To reduce the sample complexity of deep reinforcement learning for continuous control tasks, a continuous variant of the Q-learning algorithm was introduced. This talk will be about the algorithm called normalized advantage functions (NAF) and the incorporation of a learned model to accelerate model-free reinforcement learning.
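
The core idea of NAF can be sketched in a few lines: the Q-function is written as a state value plus an advantage that is quadratic in the action, so the greedy action is available in closed form. The sketch below evaluates this decomposition for fixed numbers. In the actual algorithm the state value, the action mean `mu` and the lower-triangular matrix `l_matrix` are outputs of a neural network; here they are hypothetical hand-picked values, and no learning is shown.

```python
def naf_q_value(state_value, action, mu, l_matrix):
    """Q(x, u) = V(x) - 0.5 * (u - mu)^T P (u - mu), with P = L L^T."""
    diff = [a - m for a, m in zip(action, mu)]
    n = len(diff)
    # (u - mu)^T L L^T (u - mu) equals the squared norm of L^T (u - mu).
    lt_diff = [sum(l_matrix[i][j] * diff[i] for i in range(n)) for j in range(n)]
    quadratic = sum(v * v for v in lt_diff)
    return state_value - 0.5 * quadratic

# Hypothetical values for one state with a two-dimensional action.
v = 2.0
mu = [0.5, -0.5]
l_matrix = [[1.0, 0.0],
            [0.5, 2.0]]  # lower triangular, so P = L L^T is positive semi-definite

q_at_mu = naf_q_value(v, mu, mu, l_matrix)            # advantage is zero at u = mu
q_off_mu = naf_q_value(v, [1.5, -0.5], mu, l_matrix)  # any other action scores lower
```

Since the advantage term is never positive and vanishes at `u = mu`, the maximizing action is simply `mu`, which is what makes Q-learning tractable in continuous action spaces.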

Inhibition and loss of information in unsupervised feature extraction

Arash Kermani

Wed, 14. 2. 2018, Room 1/273

In this talk inhibition as a means for inducing competition in unsupervised learning will be discussed. The focus will be on the role of inhibition in overcoming the loss of information and the loss of information caused by inhibition itself. A learning mechanism for learning inhibitory weights will be introduced and a hypothesis for explaining the function of VIP dis-inhibitory cells in the brain will be proposed.

Development of a Self-Organizing Model of the Lateral Intraparietal Cortex

Alex Schwarz

Tue, 30. 1. 2018, Room 1/336

Visual perception of objects in space depends on a stable representation of the world. Such a robust world-centered depiction would be disrupted with every eye movement. The Lateral Intraparietal Cortex (LIP) in the human brain has been suggested to solve this problem by combining the retino-centric representations of objects from the visual input with proprioceptive information (PC) and corollary discharge signals (CD). Thereby it enables a steady positioning of objects in a head-centered frame of reference. Bergelt and Hamker (2016) built a model of the LIP that included four-dimensional maps of the LIP. In this master thesis, a modification of this model is presented that introduces Hebbian and anti-Hebbian learning to achieve a connection pattern that shows similar behavior without the need for a fixed, predefined weighting pattern. This makes a reduction of dimensions possible, as well as a higher biological plausibility. The model shows the influence of both signals, PC and CD, on the representation of stimuli over the time of a saccade.
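
For readers unfamiliar with the two learning principles named above, the following sketch shows them in their plainest textbook form: a Hebbian update strengthens a weight when pre- and postsynaptic activity coincide, and an anti-Hebbian update weakens it. This is not the learning rule of the thesis, which is not specified in the abstract; the learning rate and activity values are illustrative assumptions.

```python
def hebbian_step(weight, pre, post, rate=0.01, anti=False):
    """Plain Hebbian update, or its sign-flipped anti-Hebbian variant."""
    sign = -1.0 if anti else 1.0
    return weight + sign * rate * pre * post

# Hypothetical activities: both neurons are active at the same time.
w_hebb = hebbian_step(0.5, pre=1.0, post=2.0)             # weight grows
w_anti = hebbian_step(0.5, pre=1.0, post=2.0, anti=True)  # weight shrinks
```

Rules of this kind use only locally available activity, which is one reason learned connectivity is often considered more biologically plausible than a hand-set weight pattern.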

Using Recurrent Neural Networks for Macroeconomic Forecasting

Thomas Oechsle

Mon, 29. 1. 2018, Room 1/367a

Despite the widespread application of Recurrent Neural Networks (RNNs) in various fields such as speech recognition, stock market prediction and handwriting recognition, they have so far played virtually no role in the field of macroeconomic forecasting (i.e. in the prediction of variables such as GDP, inflation or unemployment). At the moment, not a single central bank in the world uses RNNs for macroeconomic forecasts. The purpose of the talk is to highlight the potential of RNNs in improving macroeconomic forecasting. First, the history of and current knowledge on RNNs are reviewed. Then, the performance of RNNs in forecasting US GDP figures is described and compared to that of the Survey of Professional Forecasters (SPF).

Implementation and Evaluation of a Place-Cell Model for Self-Localization in a Robotics Environment

Toni Freitag

Mon, 22. 1. 2018, Room 1/205

Orientation in space is an essential function for living beings and mobile robots. Living beings use the speed and direction of their movement for this purpose. Samu et al. (2009) presented a neural network model to improve orientation in space. It uses grid cells and place cells, which have been studied in rats. Will this model also prove itself in a real environment? The neural network was trained in a real environment. In a subsequent test run, an attempt was made to determine the current position of the mobile robot in the arena. This talk presents the underlying model and the different variants of position estimation.

Inhibition and loss of information in unsupervised feature extraction

Arash Kermani

Wed, 17. 1. 2018, Room 1/367a

In this talk inhibition as a means for inducing competition in unsupervised learning will be discussed. The focus will be on the role of inhibition in overcoming the loss of information and the loss of information caused by inhibition itself. A learning mechanism for learning inhibitory weights will be introduced and a hypothesis for explaining the function of VIP dis-inhibitory cells in the brain will be proposed.

The connections within the visual cortex including the Pulvinar

Michael Göthel

Wed, 20. 12. 2017, Room 1/367a

How are the regions in the brain connected to each other? This question has been examined many times by different scientists to figure out how information is transmitted between the areas V1 and V2. This presentation will summarize some of these studies to generate a model which covers all layers in V1 and V2 and includes the pulvinar.

Neuro-computational model for spatial updating of attention during eye movements

Julia Bergelt

Mon, 27. 11. 2017, Room 1/367a

During natural vision, scene perception depends on accurate targeting of attention, anticipation of the physical consequences of motor actions, and the ability to continuously integrate visual inputs with stored representations. For example, when there is an impending eye movement, the visual system anticipates where the target will be next and, for this, attention updates to the new location. Recently, two different types of perisaccadic spatial attention shifts were discovered. One study shows that attention lingers after the saccade at the (irrelevant) retinotopic position, that is, the focus of attention shifts with the eyes and is updated to its original position only after the eyes have landed. Another study shows that shortly before saccade onset, spatial attention is remapped to a position opposite to the saccade direction, thus anticipating the eye movement. In this presentation, it will be shown how a published computational model for perisaccadic space perception, which accounts for several other visual phenomena, can be used to explain the two different types of spatial updating of attention.

Decorrelation and sparsification by homeostatic inhibitory plasticity in cooperation with voltage-based triplet STDP for receptive field formation

René Larisch

Mon, 20. 11. 2017, Room 1/367a

In recent years, many models that use spike-timing-dependent plasticity to learn V1 simple cells from natural scene input have been published. However, these models have several limitations, as they rely on theoretical approaches about the characteristics of the V1 simple cells instead of biologically plausible neuronal mechanisms. We implemented a spiking neural network with excitatory and recurrently connected inhibitory neurons, receiving inputs from Poisson neurons. For excitatory synapses we use a phenomenological voltage-based triplet STDP rule from Clopath et al. (2010). The inhibitory synapses are learned with a symmetric inhibitory STDP rule from Vogels et al. (2011), inspired by observations of the auditory cortex. We show that the cooperation of both phenomenologically motivated learning rules leads to the emergence of a wide variety of known neuron properties. We focus on the role of inhibition in these neuron properties. To evaluate how inhibition influences the learning behavior, we compare model implementations without inhibition and with different levels of inhibition. Moreover, to separate the impact on neuronal activity from the impact on the learned weights, we deactivated inhibition after learning. We found that stronger inhibition sharpened the neuron tuning curves and decorrelated neuronal activities. Furthermore, we see that inhibition can improve the input encoding quality and coding efficiency.

Automatic Landmark Identification from a High-Precision 3D Point Cloud for the Self-Localization of Autonomous Vehicles Using a Biologically Inspired Algorithm

Sebastian Adams

Mon, 6. 11. 2017, Room 1/367a

Autonomous valet parking is a current area of research towards the autonomous automobile. Precise self-localization is a challenge in parking garages, since GPS is often unavailable and the environment is both monotonous and highly dynamic. Humans can orient themselves successfully in a parking garage. Among other things, they use distinctive points or objects as so-called landmarks for orientation. These landmarks are now to be identified in a high-precision point cloud so that autonomous automobiles can later use them for self-localization while driving through the parking garage. A point cloud is the result of multiple measurements with a laser scanner. The resulting points are described by their x, y and z components in space as well as intensity values. One advantage of using point clouds for landmark identification is the precise 3D representation of the identified distinctive objects. Two biologically inspired algorithms were used to identify landmarks. The Biological Model of Multiscale Keypoint Detection (Terzic et al., 2015) uses cells modeled on neurons located in the visual cortex and thereby improves the detection performance and robustness of keypoints. The second algorithm used was the Saliency Attentive Model (Cornia et al., 2016), a neural network that predicts the probability of which regions in an image humans would direct their attention to. Subsequently, the uniqueness of the salient points was validated, and it was evaluated whether the number of identified landmarks is theoretically sufficient for successful self-localization.

Adapting a Neuron Model for Visual Perception to New Physiological Data Measured in Monkeys

Vincent Marian Plikat

Wed, 27. 9. 2017, Room 1/367

In 2011, A. Ziesche and F.H. Hamker developed a biologically motivated neuron model of visual perception. This model locates perception in the Lateral Intraparietal Area (LIP). It successfully models the Saccadic Suppression of Displacement (SSD) experiment, in combination with both blanking and masking. However, the model is challenged by new physiological findings measured in monkeys (Xu et al., 2012). Monkeys, though, do not behave exactly like humans in the SSD experiment: no blanking effect is found in them (Joiner et al., 2013). My work deals with adapting the model to the new physiological data and finding variants that can explain the differences between humans and monkeys.

Veränderung der Wahrscheinlichkeiten von explorierten Aktionen durch die Plastizität der STN - GPe Verbindung
(Changing the probabilities of explored actions through the plasticity of the STN-GPe connection)

Oliver Maith

Mon, 4. 9. 2017, Room 1/336

Many theories and models of the basal ganglia have shown that this system can promote rewarded actions and inhibit others through the activity of different pathways (indirect, direct, hyperdirect) between cortex and thalamus. The function of the connection between the subthalamic nucleus (STN) and the external globus pallidus (GPe) is still relatively unclear in this context. A previous study used a computational model of the basal ganglia to show that the connection between STN and GPe can cause certain alternative actions to be preferred during an exploration phase, so that not all possible actions are tried out at random. The function of the STN-GPe connection could thus be to learn which actions have already been successful in a given situation and which have not. During exploration, the more promising actions would then be preferred. The presentation will focus on how this was realized by implementing a learning rule between STN and GPe and on how the basal ganglia model of the previous study first had to be adapted for this purpose.

Classification of Aircraft Cabin Configuration Packages using Machine Learning

Sreenath Ottapurakkal Kallada

Fri, 18. 8. 2017, Room 1/336

Aircraft manufacturers are constantly trying to enhance the comfort of passengers in the cabin. However, increasing comfort reduces the maximum number of passengers an aircraft can carry, which gradually affects the overall profit of the airlines. Designing a good aircraft cabin requires careful analysis of the various monuments placed inside the cabin and their corresponding features. These monuments range from the crew rest compartment to the galleys where the food is stored. Each of these monuments has exclusive features which can be used as a guiding metric to revise the cabin architecture. Such revisions of cabin layouts may occur frequently during the development process, in which many marketing layouts are pitched to prospective customers but not all of them make it to the final production line. This master thesis analyzes the A350 program from Airbus; the cabin layout data is retrieved from various XML sheets using Python parsers and merged into a processable format. Various feature selection methods belonging to the filter, wrapper and embedded families are used to analyze the important features of the monuments and cabin layouts. Finally, three tree classifiers (single decision tree classifier, ensemble tree classifier, boosted tree classifier) are fitted on the datasets to classify the cabin layouts and monuments. The performance of the classifiers and the analysis of the relationship between the predictors and the target variable are recorded.

Automating Scientific Work in Optimization

Prof. Dr. Thomas Weise

Fri, 21. 7. 2017, Room 1/336

In the fields of heuristic optimization and machine learning, experimentation is the way to assess the performance of an algorithm setup and the hardness of problems. Good experimentation is complicated. Most algorithms in the domain are anytime algorithms, meaning they can improve their approximation quality over time. This means that one algorithm may initially perform better than another one but converge to worse solutions in the end. Instead of single final results, the whole runtime behavior of algorithms needs to be compared (and runtime may be measured in multiple ways). We do not just want to know which algorithm performs best and which problem is the hardest - a researcher wants to know why. In this paper, we introduce a methodology which can 1) automatically model the progress of algorithm setups on different problem instances based on data collected in experiments, 2) use these models to discover clusters of algorithm (and problem instance) behaviors, and 3) propose reasons why a certain algorithm setup (problem instance) belongs to a certain algorithm (instance) behavior cluster. These high-level conclusions are presented in the form of decision trees relating algorithm parameters (instance features) to cluster ids. We emphasize the duality of analyzing algorithm setups and problem instances. Our methodology is implemented as open source software and applied in two case studies. Besides its basic application to raw experimental data, yielding clusters and explanations of quantitative algorithm behavior, our methodology also allows for qualitative conclusions by feeding it with data that is normalized with problem features or algorithm parameters. It can also be applied recursively, e.g., to further investigate the behavior of the algorithms in the cluster with the best-performing setups on the problem instances belonging to the cluster of hardest instances. Both use cases are investigated in the case studies.

Erweiterung der S2000 Sensorplattform für die Verwendung auf einer mobilen Plattform.
(Extension of the S2000 sensor platform for use on a mobile platform)

Shadi Saleh

Thu, 13. 7. 2017, Room 1/367

Stereo cameras are important sensors that provide good object recognition with color perception and very high resolution. They have been used in a wide range of automotive application prototypes, especially in advanced driver assistance systems (ADAS). In this thesis, different methods based only on a stereo vision system are proposed in order to estimate the ego-motion within a dynamic environment and to extract information about the moving objects and the stationary background. This is realized using the Intenta S2000 sensor, a new generation of intelligent cameras that generates a 3D point cloud at each time stamp. However, constructing the static object map and the dynamic object list is a challenging task, since the observed motion results from a combination of the movement of the camera and the movement of the objects. The first proposed solution is based on dense optical flow and illustrates the possibility of estimating the ego-motion and separating the moving objects from the background. In the second proposed solution, the ego-motion within a dynamic environment is estimated using methods for solving a 3D-to-3D point registration problem. The iterative closest point (ICP) method is used to solve this problem by minimizing a geometric difference function between two consecutive 3D point clouds. Then the difference between the two aligned clouds is computed; the result represents the subset of the 3D points that belong to the objects in the environment. The most effective method for estimating multiple moving and static objects is presented in the third proposed solution. The ego-motion is estimated with the ICP method, the post-processed 3D point cloud is then projected onto a 2D horizontal grid, and additional features of each grid cell make it possible to classify the cells as free, occluded or border. The cell status can be used later to construct the static object map and the dynamic object list.
Determining noise regions is an essential step, which can be used later to generate a 3D point cloud with a low level of noise. This is done using unsupervised segmentation methods.
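
As an illustration of the 3D-to-3D registration step mentioned in this abstract, the following is a minimal ICP sketch in Python with NumPy. It is not the implementation from the thesis: the brute-force nearest-neighbor search, the SVD-based (Kabsch) rigid fit, the function names and the synthetic grid data are all assumptions chosen to keep the example short.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~ dst_i (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_c - R @ src_c

def icp(src, dst, iterations=10):
    """Align src to dst by alternating closest-point matching and rigid fitting."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iterations):
        # Brute-force closest point in dst for every point in cur (hypothetical choice).
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matches = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(cur, matches)
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, cur

# Synthetic example: a 4x4x4 grid moved by a small, known ego-motion.
axes = [np.arange(4.0)] * 3
dst = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3) - 1.5
theta = np.deg2rad(2.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.1, -0.05, 0.02])
src = dst @ R_true.T + t_true

R_est, t_est, aligned = icp(src, dst)
alignment_error = np.linalg.norm(aligned - dst, axis=1).mean()
```

Points that remain far from their closest match after alignment would be candidates for moving objects, which corresponds to the difference step described above; real sensor data would additionally need outlier rejection and a faster neighbor search.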

Multi-GPU Simulation of Spiking Neural Networks

Joseph Gussev

Tue, 16. 5. 2017, Room 1/367

With the increasing importance of parallel programming in the context of achieving more computation power, general purpose computation on graphics processing units (GPGPU) has become more relevant for scientific computing. An example would be the simulation of spiking neural networks. During this talk, an introduction to GPU programming as well as the basics of spiking network simulation on GPUs is given. Afterwards, the concepts of Multi-GPU spiking network simulation used by the neural network simulators HRLSim and NCS will be explained.

Sound Source Detection and Localizing Active Person during Conversation

Ranjithkumar Jayaraman

Tue, 11. 4. 2017, Room 1/367

Robots are increasingly being used for interaction with people. Finding the correct sound source is very important for engaging in conversation between people or for working alongside humans. This project aims to determine the direction of a sound and to identify the person who is its source, using the NAO robot. Sound direction can be determined easily with two microphones in free space, but when they are placed in the uneven spherical head of a robot, the problem becomes complex because of diffraction and scattering of sound waves at the surfaces of the robot head. Linear and logistic regression algorithms were used to localize the direction of the sound source. OpenCV libraries were used to find human faces in the sound direction. Finally, we associate the sound source direction with the detected human faces. This algorithm promises to make robots more interactive while listening to a conversation or working together with human co-workers.

Modelling attention in the neurons of area MT

Christine Ripken

Wed, 15. 3. 2017, Room 1/208

In order to model attention for motion-detecting neurons, responses of the Model of Neuronal Responses in Visual Area MT (middle temporal area) of Simoncelli and Heeger were processed by the Mechanistic Cortical Microcircuit of Attention of Beuth and Hamker. The Mechanistic Cortical Microcircuit of Attention has replicated a great range of data sets while implementing only four basic neural mechanisms: amplification, spatial pooling, divisive normalization and suppression. The Model of Neuronal Responses in Visual Area MT replicates recordings from area MT very well. The aim was to replicate responses of a representative MT neuron, whose data was collected by Maunsell and Lee. The defense will give an overview of the models and briefly discuss the results, which do not yet replicate the physiological data to a satisfying extent.

Untersuchung der Auswirkungen von Feedback in einem Modell des visuellen Systems
(Investigating the effects of feedback in a model of the visual system)

Daniel Johannes Lohmeier-von Laer

Wed, 15. 3. 2017, Room 1/208

The visual system is not limited to feedforward and lateral processing; there are also feedback connections from areas higher in the hierarchy to earlier cortical areas. The role of these feedback connections is not yet sufficiently understood and has been studied less than the feedforward hierarchy. In a model comprising the primary and secondary visual cortex, a feedback connection from V2 to V1 was added. Since the effect of the feedback depends on the activity of the V2 neurons, this activity is additionally controlled by simulating attention in the V2 layer. This talk presents how strong the feedback stream is relative to the feedforward stream, what it does, and how the simulation of attention affects the receptive fields of the neurons.

Inhibition decorrelates neuronal activities in a network learning V1 simple-cells by voltage-based STDP and homeostatic inhibitory plasticity

Rene Larisch

Tue, 14. 3. 2017, Room 1/208

In the past years, many models that learn V1 simple-cells from natural scene input using spike-timing dependent plasticity have been published. However, these models have several limitations, as they rely on theoretical approaches about the characteristics of the V1 simple-cells instead of biologically plausible neuronal mechanisms. We implemented a spiking neural network with excitatory and recurrently connected inhibitory neurons, receiving inputs from Poisson neurons. The excitatory synapses are learned with a phenomenological voltage-based triplet STDP rule from Clopath et al. (2010), and the inhibitory synapses are learned with a symmetric inhibitory STDP rule from Vogels et al. (2011). To analyze the effect of inhibition during learning, we compare model implementations without inhibition and with different levels of inhibition. Furthermore, to study the impact on the neuronal activity, we compare a model learned with inhibition to itself without inhibition. We see that inhibition leads to better differentiated receptive fields, with decreased correlation between the neurons' activities and improved robustness to small input differences. Furthermore, we can reproduce observations from other computational V1 simple-cell models and from physiological experiments.

Fast simulation of neural networks using OpenMPI

Florian Polster

Tue, 14. 3. 2017, Room 1/208

In order to make the simulation of very large neural networks possible, ANNarchy has been extended by an MPI implementation. ANNarchy uses Cython to transfer the simulation setup from the Python application to the generated C++ simulator; this is unsuitable in a distributed context and was therefore replaced by a combination of ZeroMQ (for transmission) and Protobuf (for serialization). For the distributed simulation, a neural network is partitioned into subgraphs by splitting populations and projections by post-synaptic neurons.
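
The partitioning idea in this abstract can be pictured with a small, purely illustrative Python sketch. It does not reproduce ANNarchy's actual data structures or its MPI code: the function name, the `(pre, post, weight)` tuple format and the contiguous block assignment of neurons to ranks are assumptions made for the example.

```python
def partition_by_post(synapses, n_post, n_ranks):
    """Assign each (pre, post, weight) synapse to the rank that owns its post-synaptic neuron.

    Post-synaptic neurons are split into contiguous blocks, one block per rank
    (a hypothetical assignment scheme).
    """
    chunk = -(-n_post // n_ranks)  # ceiling division: neurons per rank
    parts = [[] for _ in range(n_ranks)]
    for pre, post, weight in synapses:
        parts[post // chunk].append((pre, post, weight))
    return parts

# Six post-synaptic neurons distributed over three ranks (two neurons each).
synapses = [(0, 0, 0.1), (1, 0, 0.2), (0, 3, 0.3), (2, 5, 0.4)]
parts = partition_by_post(synapses, n_post=6, n_ranks=3)
```

With this kind of split, each rank holds all incoming synapses of its own neurons, so weighted sums can be computed locally and only pre-synaptic activity has to be exchanged between ranks.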

Deep Reinforcement Learning in robotics

Winfried Lötzsch

Mon, 27. 2. 2017, Room 1/367

For creating intelligently and autonomously operating robotics systems, Reinforcement Learning has long been the method of choice. Combining the Reinforcement Learning principle with a deep neural network not only improves the performance of these algorithms on commonly known robotics tasks, but also enables more complex tasks to be performed. I will show ways to implement this combination for both simulated and real robotics systems. One approach is to train the deep network to directly output motor commands for the robot [https://arxiv.org/pdf/1504.00702v5.pdf], but it can also be used to predict the success of a given action [https://arxiv.org/pdf/1603.02199v4.pdf]. Furthermore, recent work has shown that using multiple learner instances in parallel improves training performance [https://arxiv.org/pdf/1602.01783v2.pdf].

Mechanisms for the Cortex-Basal Ganglia interactions in category learning via a computational model

Francesc Villagrasa Escudero

Wed, 1. 2. 2017, Room 1/208

We present a novel neuro-computational model for learning category knowledge via the interaction between the basal ganglia and the prefrontal cortex. According to this model, the basal ganglia train the acquisition of category knowledge in cortico-cortical projections. In particular, our model shows that the fast learning in the basal ganglia alone is not optimal for correctly classifying a large number of stimuli, whereas the combination of fast and slow learning in the prefrontal cortex produces stable representations that can perfectly categorize even a larger number of exemplars. Our model also adds novel predictions to be empirically tested. The basal ganglia are classically known for being involved in stimulus-response (S-R) learning. However, our results show that the striatum does not encode representations of full stimuli but of features. Therefore, we propose that the striatum learns feature-response (F-R) associations related to reward. In addition, our model predicts that the thalamic activation correlates with the final category decision throughout the whole experiment, unlike the striatum, which is only strongly involved early in learning, and the prefrontal cortex, which is only highly engaged later in learning.

Korrektur perspektivischer Verzerrungen in der virtuellen Realität für eine blickinvariante Objektlokalisation
(Correcting perspective distortions in virtual reality for view-invariant object localization)

Frank Witscher

Wed, 18. 1. 2017, Room 1/367

Perspective variance has been a problem for object recognition in artificial intelligence for decades. In a virtual experimental environment, this problem is compounded by the standard rendering method, which displays objects in a distorted way. To counteract the growing variety of appearances of an object, the rendering process is modified and the conventional projection is replaced by a spherical one. This minimizes perspective distortion, and the practicability of the new projection is additionally presented.

Evolving from 2D and 3D Morphable Models to mixed Models

Felix Pfeiffer

Wed, 14. 12. 2016, Room 1/336

One major subject of computer vision and artificial intelligence is the tracking of deformable objects. A common approach is the use of Point Distribution Models, which can hold the parameters of the object in question. Based on the famous image alignment algorithm developed by Lucas and Kanade, Baker and Matthews came up with improvements that included the alignment of not only position, scale and rotation but also the object's basic shape. Together with a set of clever and quite efficient algorithms, it gained the name Active Appearance Model (AAM). A usual application would be the tracking of human faces, which can be useful for, e.g., reading emotions. The common approach is to take a simple camera to get pictures (or whole series of pictures) and apply the AAM only in that 2D space. Since the input data is only 2-dimensional, it seems quite intuitive that this model has a number of limits in modeling our in fact 3-dimensional world. In the 5-part paper series 'Lucas-Kanade 20 Years On: A Unifying Framework', Baker and Matthews stated that the model is easily applicable to images in 3D space, such as those obtained from MRI or CT. The model modified for the third dimension is then called the 3D Morphable Model (3DMM). So the model itself is able to work in 3D space, provided you have a 3D image as input, which is hard to obtain. To get rid of the AAM's disadvantages regarding modeling power, it would be neat if the best of the 2 models could be combined into a mixed model, the 2.5D Morphable Model: using 2D input from a simple camera, but having the model reside in the third dimension. Baker and Matthews do exactly that in the 5th part of the paper series already mentioned. Sadly, they found that *none* of the algorithms that worked so well in AAMs and 3DMMs alone are applicable. But luckily there are already solutions to it ...
The seminar will have 2 major parts: The first one covers the image alignment algorithm, its techniques and its evolution into the AAM without diving too deep into mathematics. After showing that it is easy to lift the AAM into the 3D realm, I will present what has been thought about the development of the 2.5D model, why it does not work as intended, and two solutions that show how it can be done anyway.

Perspectives of Object Recognition and Visual Attention

Johannes Jung

Wed, 9. 11. 2016, Room 1/336

Visual attention is a diversified field of research, incorporating neuroscience, psychology and computer science. I would like to talk about different perspectives of visual attention and its influence on object recognition, based on the ideas of various attention systems. Traditionally saliency map models, emerging from psychological theories, are very important in this domain. However, recent attention models for object recognition often strongly benefit from machine learning ideas, such as deep learning, but partially neglect biological plausibility.

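Modellierung einer Face-to-Face-Interaktion zwischen Mensch und Agent hinsichtlich der Übertragung emotionaler Werte durch Mimik
(Modeling a face-to-face interaction between human and agent with regard to the transmission of emotional values through facial expressions)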

Sarah Paul

Mon, 29. 8. 2016, Room 1/336

The aim of this work was to create, within a game, a nonverbal interaction between human and computer based on the transmission and recognition of facial expressions. The talk first states the motivation and goal of the work and then presents the resulting human-machine interaction by means of its architecture. The individual components required for this communication are discussed briefly. The first is the machine perception of human faces and specific landmark points using the Active Appearance Model. The second is the facial expression recognition model, which recognizes Action Units of the Facial Action Coding System on the basis of the perceived landmark points. The interpretation of the facial expressions and the reaction to them take place within a game in which the player communicates with the computer via facial expressions and can make a decision in a social dilemma. The course of the game is explained in more detail during the talk. Finally, the quality of the individual components and of the human-computer interaction as a whole is discussed, and starting points for possible further investigation and development are given.

Predictive Place-cell Sequences for Goal-finding emerge from Goal Memory and the Cognitive Map: A Model

Valentin Forch

Mon, 4. 7. 2016, Room 1/336

Individuals need a neural mechanism enabling them to navigate effectively in changing environments. The model of predictive place-cell sequences for goal-finding is put forward to explain some neural activities found in rats which were observed in the context of navigation. A characteristic phenomenon is the occurrence of so-called sharp wave ripples in the place-cells of the hippocampus when rats are orienting themselves in a new environment. These place-cell sequences were shown to be predictive of the future paths of these animals. The model also draws on the concept of the cognitive map, which should allow individuals to form a representation of their environment. I will introduce the theoretical framework of the cognitive map, present some specific findings relevant for the creation of the model, the model itself, and discuss some implications for future research.

Emotional attention and other amygdala functionalities

René Richter

Wed, 22. 6. 2016, Room 1/346

Emotional stimuli attract attention so that the brain can focus its processing resources. But how these stimuli acquire their emotional value and how they can influence attention processes is still an open issue. Evidence suggests that this association might be learned through conditioning in the basolateral amygdala (BLA), which also sends feedback connections to the visual cortex (a possible top-down attention mechanism). The learning of the association, on the other hand, is strongly modulated by dopamine, and the exact timing of its distribution is crucial. Accordingly, a rate-coded, biologically realistic neuro-computational model constructed from 3 combined functional models (Visual Cortex, Amygdala, Basal Ganglia Timing Circuit) will be presented. Moreover, the amygdala needs a specific range of functions to work well in this context, and a detailed explanation of these and additional functions will round off the talk.

RoboCup 2016: An overview

John Nassour

Mon, 13. 6. 2016, Room 1/336

I will present an overview of the days at RoboCup in Leipzig (30 June to 4 July). Different types of competition take place: RoboCup Soccer, RoboCup Industrial, RoboCup Rescue, RoboCup@Home, and RoboCup Junior. I will then present the robotics side of our group's contribution to the stand 'Robots in Saxony'.

Spatial Synaptic Growth and Removal for Learning Individual Receptive Field Structures

Michael Teichmann

Mon, 6. 6. 2016, Room 1/336

A challenge in creating neural models of the visual system is the appropriate definition of the connectivity. Structural plasticity can overcome the lack of a priori knowledge about the connectivity of each individual neuron. We present a new computational model which exploits the spatial configuration of connections for the formation of synapses and demonstrate its functioning and robustness.

Learning Object Representations for Modeling Attention in Real World Scenes

Alex Schwarz

Mon, 9. 5. 2016, Room 1/336

Many models of visual attention exist, but only a few have been demonstrated on real-world scenes. The model of Beuth and Hamker (2015, NCNC) is one of these models. It will be shown how this model can be adapted to work on real-world scenes using a temporal continuity paradigm. The influence of a high post-synaptic threshold and a normalization mechanism, which drive the learned weights toward background invariance, will be shown.

A computational model of Cortex Basal Ganglia interactions involved in category learning

Francesc Villagrasa Escudero

Tue, 19. 4. 2016, Room 1/336

We present a detailed neuro-computational model of Cortex-Basal Ganglia interactions that can account for a specific category-learning paradigm, the prototype distortion task. It proposes a novel principle of brain computation in which the basal ganglia read out cortical activity and, by their output, determine cortico-cortical connections to learn category-specific knowledge. In order to better link computational neuroscience with cognitive function, our model reproduces a physiological experiment and its main results. Finally, this model supports a recent hypothesis: the striatum 'trains' a slow learning mechanism in the Prefrontal Cortex to acquire category representations. Moreover, it provides insight into how the striatum encodes stimulus information.

Reinforcement Learning with Object Recognition in a Virtual Environment

Joseph Gussev

Wed, 23. 3. 2016, Room 1/367

Instead of depending on detailed external information from a complex environment, the reinforcement learning agent in this Bachelor thesis can extract information from visual stimuli. It uses Q-Lambda-Learning and a recently developed attention-driven object localization model in a virtual environment. The implemented system is presented and the learning results are analyzed.

Attentional deep networks

Priyanka Arora

Tue, 8. 3. 2016, Room 208

We all know what attention is, and we all know how attention is used in the field of biology. As part of this research, a deep dive was conducted into what attention focuses on and how it has been used in the field of machine learning. This seminar deals with how images or data are identified using models built by taking spatial attention and feature-based attention into consideration. A takeaway of the talk will be how these two attention-based models differ in the way they work.

A Computational Model of Neurons in Visual Area MT

Tobias Höppner

Thu, 25. 2. 2016, To be announced

The model for motion estimation from Simoncelli and Heeger (1998) will be explained in detail. Neurons in the Middle Temporal (MT) area of the brain are selective for velocity (both direction and speed). For a neural representation of velocity information in MT, it is necessary to encode local image velocities. The model performing these steps consists of two similar stages. Each stage computes a sum of inputs, followed by rectification and divisive normalization. The mathematical underpinnings of the model will be explained, accompanied by simulation results.
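
The three operations named for each stage (weighted sum, rectification, divisive normalization) can be sketched generically in Python with NumPy. This is only a toy illustration, not the Simoncelli and Heeger model itself: the weights, the input values, the half-squaring form of the rectification and the semi-saturation constant `sigma` are assumptions made for the example.

```python
import numpy as np

def stage(inputs, weights, sigma=0.1):
    """One generic stage: weighted sum, half-squaring, divisive normalization."""
    linear = weights @ inputs                    # weighted sum of the inputs
    rectified = np.maximum(linear, 0.0) ** 2     # half-squaring (assumed rectifier)
    # Each response is divided by the pooled activity of all units plus a constant.
    return rectified / (rectified.sum() + sigma ** 2)

# Three hypothetical units reading from two inputs.
weights = np.array([[1.0, -1.0],
                    [-1.0, 1.0],
                    [0.5, 0.5]])
inputs = np.array([1.0, 0.2])
responses = stage(inputs, weights)
```

Because every unit is divided by the same pooled activity, the ratio between the responses stays the same when the input is scaled, while the absolute responses saturate for strong inputs.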

Modelling / simulation of the exhaust path of a diesel TDI common rail engine with Emission standard EU 6, taking into account the real-time capability on a HIL simulator.

Sowmya Alluru

Thu, 25. 2. 2016, Room 1/336

Different modeling strategies, such as zero-dimensional and empirical methods, are analyzed for the diesel engine exhaust system; models are created for state variables such as temperature, pressure and mass flow in the exhaust pipes between the exhaust components. The models are evaluated by comparing the predicted results with actual measurements. In addition, the real-time capability of the models on the Hardware-in-the-Loop (HIL) simulator is estimated based on run-time metrics of the prediction functions, obtained from C++ and MATLAB Simulink implementations of the algorithms. Neural networks show higher accuracy and succeed in capturing the most variance in the inputs compared to classical analytical methods (zero-D), SVM-based kernel methods and simple polynomial regression methods.

Revealing the impairments of thalamic lesions using automatic parameter tuning for a neurocomputational model of saccadic suppression of displacement

Mariella Dreißig

Wed, 24. 2. 2016, Room 1/336

Several studies suggest that the thalamus is crucial for maintaining the perception of visual stability across saccades. In this thesis, the role of the thalamus in visual processing was investigated more closely. With the help of a neurocomputational model, data from patients with focal thalamic lesions and from healthy controls were replicated by adjusting the model parameters in an automated fitting procedure. By mapping the model's performance onto the subjects' performance in the saccadic suppression of displacement task, the impact of thalamic impairments on the perception of visual stability could be revealed.

Kombination von STDP- und iSTDP-Lernregeln zum Lernen von rezeptiven Feldern in einem Modell des primären visuellen Kortex unter Berücksichtigung von lateraler Exzitation
(Combination of STDP and iSTDP rules to learn receptive fields in a model of the primary visual cortex taking account of lateral excitation)

René Larisch

Wed, 10. 2. 2016, Room 1/336

To understand the basic mechanisms of the neuronal processing, it is necessary to use models which are biologically plausible and show efficient performance in network simulations. Spike-Timing-Dependent-Plasticity rules seem to be a good choice. In this thesis an STDP model, where the weight development is based on the membrane potential, is used to learn receptive fields of V1. It was modified to be more biologically plausible, while the ability to learn receptive structures should be preserved. Furthermore, the ability to learn excitatory lateral connections and their effect on the neuronal activity were studied.

Inference statistics in neuroscience: their application and interpretation

Henning Schroll

Thu, 17. 12. 2015, Room 1/367

I will give an introduction to parametric and non-parametric statistics as commonly used in neuroscientific reports. Core concepts of different statistical approaches will be portrayed; the issue of choice among measures will be tackled. Finally, I will highlight how (not) to interpret various statistical results.

Improving the robustness of Active Appearance Models for facial expression recognition

Lucas Keller

Wed, 16. 12. 2015, Room 1/336

Active Appearance Models are a popular method to detect facial expressions in an image or video. While there exists a very efficient algorithm to fit an AAM to an image, it still has some deficits in terms of robustness. This presentation discusses these problems and presents solutions implemented for my thesis.

Classification of motion of a humanoid robot while walking

Saqib Sheikh

Wed, 2. 12. 2015, Room 1/336

The goal of the master thesis was to find a classification technique that is accurate, reliable and easily implemented. This thesis suggests three techniques that were fully implemented, with their results documented. The techniques used for classification are the dynamic time warping algorithm, fast Fourier transform features with the random forest algorithm, and fast Fourier transform features with the decision tree algorithm.
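
To make the first of the three techniques concrete, here is a minimal sketch of the classic dynamic time warping distance between two one-dimensional sequences. It is the textbook dynamic-programming formulation, not the code from the thesis; the absolute difference as local cost and the example sequences are assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] repeats a match with b[j-1]
                                     cost[i][j - 1],      # b[j-1] repeats a match with a[i-1]
                                     cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]

# The same shape at different speeds has distance zero.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
# A constant offset of 1 over two samples accumulates a cost of 2.
offset = dtw_distance([0, 0], [1, 1])
```

A nearest-neighbor classifier could then assign a recorded motion sequence the label of the training sequence with the smallest DTW distance; whether the thesis used DTW in this way is not stated in the abstract.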

A unified divisive-inhibition model explaining attentional shift and shrinkage of spatial receptive fields in middle temporal area

Alex Schwarz

Wed, 25. 11. 2015, Room 1/336

Attention reshapes receptive fields in area MT based on the locations of attended and unattended stimuli. There are several models explaining attentional shift, shrinkage, expansion, excitation and suppression individually. A divisive-inhibition model will be presented that unifies all those attentional effects in one biologically plausible model, relying on data of Anton-Erxleben et al. (2009) and Womelsdorf et al. (2008).

Simulation of a virtual agent for finding ergonomically optimal work sequences

Sergej Schneider

Wed, 11. 11. 2015, Room 1/273

Whether a reinforcement learning agent finds a solution to a problem, and how good that solution is, depends strongly on the chosen parameters. In my thesis I tested the effects of various parameters on the learning behavior of a Q-learning agent and a Q(λ)-learning agent in the Smart Virtual Worker framework. Both temporal and ergonomic criteria were examined in a transport scenario.
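
To make the parameters mentioned above concrete, here is a minimal sketch of the standard tabular Q-learning update with an epsilon-greedy policy. It is not the Smart Virtual Worker code: the names `q_update` and `epsilon_greedy`, the state and action labels and the default values of `alpha`, `gamma` and `epsilon` are assumptions for illustration, and the eligibility traces of Q(λ) are left out.

```python
import random
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    alpha: learning rate, gamma: discount factor (illustrative defaults).
    """
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def epsilon_greedy(Q, s, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda b: Q[(s, b)])


# Hypothetical two-action example with one hand-set Q-value.
actions = ["left", "right"]
Q = defaultdict(float)
Q[(1, "right")] = 1.0
q_update(Q, s=0, a="right", r=0.5, s_next=1, actions=actions,
         alpha=0.5, gamma=0.9)
greedy = epsilon_greedy(Q, 1, actions, epsilon=0.0, rng=random.Random(0))
```

A larger `alpha` makes each reward change the estimate faster but less stably, a larger `gamma` weights future rewards more, and `epsilon` trades exploration against exploitation.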

Development of a virtual experimental environment for the experimental investigation of spatial orientation and visual attention

Sascha Jüngel

Wed, 28. 10. 2015, Room 1/208A

A visual system receives so much sensory data that a clever focus on the essentials is necessary. This presentation deals with the creation of a virtual environment with various scenarios (including a memory game) so that this not yet fully understood process of attention can be studied better in the future. To this end, a virtual agent performs various tasks that focus on attention-based observation of multiple objects, object recognition, and remembering a position in space.

Deep Learning - Beating the ImageNet Challenge

Haiping Chen

Wed, 14. 10. 2015, Room 1/336

Deep learning is now a popular family of models in machine learning with great capability in visual recognition tasks. A competition called the “Large Scale Visual Recognition Challenge” takes place every year to find the best approach of the year for solving the given recognition tasks. This presentation will introduce the concept of deep learning models and give an overview of the “Large Scale Visual Recognition Challenge”. A model called “GoogLeNet” will be analyzed to find out why and how it became the champion of the “Large Scale Visual Recognition Challenge 2014”.

Intrinsically Motivated Learning of Hierarchical Collections of Skills

Marcel Richter

Tue, 8. 9. 2015, Room 1/336

Humans and other animals often engage in activities for their own sakes rather than as steps toward solving practical problems. Psychologists call these intrinsically motivated behaviors. What we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical problems as they arise. A way to achieve such intrinsically motivated behaviors in a machine learning framework will be presented.

Learning temporally dynamic receptive fields in a model of the primary visual cortex

Michael Göthel

Tue, 25. 8. 2015, Room 1/368

An existing model of the primary visual cortex was extended to investigate the possibilities of learning dynamic receptive fields (RFs). For this purpose, the lateral geniculate nucleus (LGN) neurons were equipped with a spatial and temporal RF. In addition, the inhibition of the neurons was implemented through inhibitory interneurons. The changes to the model were evaluated and examined for their effectiveness.

Autonomous development of disparity tuning and vergence control for the iCub

Patrick Köhler

Tue, 18. 8. 2015, Room 1/336

Robustness against negative external and internal influences is a desirable feature in AI-driven robotic systems. This presentation explores an intrinsically motivated approach to a self-calibrating binocular vision system based on the iCub robot. The core principles of the efficient coding hypothesis and reinforcement learning and their implementation in the model are explained, and the test results are presented.

Approximation of Kernel Support Vector Machine using Shallow Convolutional Neural Networks
(Master's defense)

Jekin Trivedi

Thu, 30. 7. 2015, Room 1/336

This thesis investigates the problem of efficiently approximating kernel support vector machines at test time. These classification algorithms have a high computational complexity, especially when executed repeatedly. In addition, kernel methods are often hard to understand, given the unknown distribution of the data in feature space. The proposed remedy is to mimic such classifiers with shallow neural networks or, in the case of image data, shallow convolutional neural networks. This is done by approximating the classification function of the support vector machine. In addition, the shallow (convolutional) neural networks give easier insight into the functioning of the classification algorithm than kernel-based methods. We present compelling results on the MNIST, CIFAR-10 and STL-10 datasets.

ANNarchy 4.5: what's new?

Julien Vitay

Tue, 23. 6. 2015, Room 1/336

After a small reminder of the main ideas behind the neural simulator ANNarchy (Artificial Neural Networks architect), the new features introduced in ANNarchy 4.5 will be presented: monitoring, structural plasticity, reporting, multiple networks, parallel simulations, early stopping...

A recurrent multilayer model with Hebbian learning and intrinsic plasticity leads to invariant object recognition and biologically plausible receptive fields.

Michael Teichmann

Tue, 9. 6. 2015, Room 1/336

We developed a model of V1 and V2 based on anatomical evidence of the layered architecture, using excitatory and inhibitory neurons where the connectivity to each neuron is learned in parallel. We address learning by three different mechanisms of plasticity: intrinsic plasticity, Hebbian learning with homeostatic regulations, and structural plasticity.

Long Short Term Memory Networks - a solution for learning from input data showing significant time lags

Simon Kern

Tue, 2. 6. 2015, Room 1/336

The talk covers the structure and functioning of LSTM networks. I will compare this type of network with MLPs and try to explain why LSTMs are better suited to temporal problems. I will explain the details using an example and conclude with a discussion of solved problems, possibilities and limits of LSTM networks. The talk will be held in English.

Efficient learning of large imbalanced training datasets for support vector machines

Surajit Dutta

Tue, 19. 5. 2015, Room 1/336

This Master thesis concentrates mainly on supervised learning. We are given an ADAS dataset (provided by Continental AG) containing a number of video recordings. In the context of this thesis we are only interested in the pose of a pedestrian, i.e. whether the pedestrian is facing to the front or to the back (frontal/posterior positioning), or to the left or to the right (lateral positioning). Provided with this information, this thesis investigates the task of learning well-performing and well-generalizing detectors for both subclasses. In the provided dataset, the lateral class has significantly lower cardinality than the frontal/posterior class. Such an imbalanced training dataset usually has a negative impact on detector learning when the detector is supposed to perform similarly on both classes; hence we focus on learning detectors that overcome the imbalance in the training dataset.

Exploring biologically-inspired mechanisms that allow humanoid robots to generate complex behaviors for dealing with perturbations while walking

Tran Duy Hoa

Tue, 21. 4. 2015, Room 1/368

Humans react to perturbations while walking by performing a sequence of movements. These reaction movements help to restore the stability of the body by adjusting posture. If a fall is unavoidable, reaction movements may help a human to fall safely rather than badly. Inspired by motor sequence learning in the primate brain, this proposal aims to explore new mechanisms that help humanoid robots avoid falling when dealing with perturbations while walking. This will be done through dynamic sensory-motor interaction with the environment, and with a human teacher if needed. Sequences of reaction movements are acquired, maintained, and executed through a motor sequence learning architecture inspired by the role of the brain's cognitive loops in motor and sequence learning. The basal ganglia will act as the regulator that selects and performs context-appropriate sequences of reaction movements. Dopamine neurons will provide the reinforcement learning mechanism for acquiring sequences of reaction movements. The movements of the humanoid robot's legs are driven by a multi-layered multi-pattern central pattern generator (CPG), which generates sequences of patterns under the regulation of descending control signals from the motor sequence learning architecture and ascending control signals from sensory feedback. The proposal also refers to dynamic field theory (DFT), which models sequences of movements mathematically as the evolution of neuronal populations under the continuous-time interaction between internal and external forces. The humanoid robot (NAO) equipped with this mechanism is expected to learn to perform sequences of reaction movements to restore the stability of its body and avoid falling when dealing with perturbations while walking.

Evolution of an artificial neural network for controlling an agent in a simulated environment.

Andy Sittig

Mon, 23. 3. 2015, Room 1/336

The talk gives an introduction to the basics of the neuroevolution of an agent in a simple 2D action game. It covers the game environment and its perception, as well as the resulting configuration of the artificial neural network and the evolutionary algorithm. Finally, the results of the process are discussed and critically examined.

Facial feature detection

Lucas Keller

Wed, 18. 3. 2015, Room 1/336

Facial feature detection is an important branch of research in computer vision and can be used for, e.g., face recognition, human-computer interaction or the classification of facial expressions. Although it is an easy task for a human to detect facial features, it is still difficult for a computer. This presentation gives an overview of some methods for extracting these features from images and further explains and demonstrates the use of Active Appearance Models.

Creation of a module for online performance measurement on parallel hardware.

Leander Herr

Wed, 4. 3. 2015, Room 1/367

Increasingly complex computations place growing demands on hardware. This complexity is often met with the use of parallel hardware, both multi-core CPUs and GPUs. For efficient computation, various environment parameters must be set (e.g. the number of threads). However, these parameters can only be analyzed with considerable effort. To support this process, profiling tools (TAU, NVVP or Vampir) are often used for offline analysis. The goal of the thesis was to develop an API for the online timing of parallel code.

A neurocomputational systems-level model of affective visual attention.

Rene Richter

Wed, 18. 2. 2015, Room 1/336

In order to simulate emotional attention, a biologically realistic system-level model has been developed that simulates how emotions could influence our visual attention system during the presentation of objects and facial features. The system-level model is composed of two different models: a visual attention model that simulates attention on the visual processing pathway, and an amygdala model that represents the emotional influence on this pathway.

Fast approximation of deep neural networks.

Jekin Trivedi (With Continental AG)

Wed, 11. 2. 2015, Room 1/208a

Deep convolutional neural networks have become the state of the art for many computer vision applications. During my internship we have focussed on methods of this field, such as sparse (convolutional) autoencoders and denoising autoencoders. The talk will focus on these topics while giving an outlook on the research I will undertake in my master thesis, which involves the approximation of deep models with shallower and faster models.

Revealing the impairments of thalamic lesions using a neurocomputational model of saccadic suppression of displacement.

Christina Axt

Thu, 5. 2. 2015, Room 1/375

The impression of a stable world is mostly taken for granted in everyday life. We perceive our environment as a unified, continuous panorama, always present and seamless in its permanence, although our eyes move the entire time to sample it. The underlying neural mechanisms that ensure this visual stability, however, do not run as seamlessly as our perception. In this bachelor thesis, the main focus lay on revealing the impairments of patients with thalamic lesions, and on identifying the sources affecting their perception of the visual environment. Furthermore, it was examined whether there were similarities amongst the patients that cause similar impairments.

Hippocampal place-cell sequences support flexible decisions in a model of interactions between context memory and the cognitive map. / Internship final presentation by Konstantin Willeke.

Lorenz Goenner / Konstantin Willeke

Wed, 28. 1. 2015, Room 1/336

Hippocampal place-cell sequences observed during awake immobility have been found to represent either previous experience, compatible with a role in memory processes, or future behavior, as required during planning. However, a unified account for the sequential organization of activity is still lacking. Using computational methods, we show that sequences corresponding to novel paths towards familiar locations can be generated based on interactions between two types of representations: First, a stable map-like representation of space, represented by a continuous attractor model of hippocampal area CA3. Second, context-dependent goal memory, given by context-specific activity in entorhinal cortex (EC) and reward-modulated plasticity at EC synapses onto dentate granule cells. The model contributes (1) an account of goal-anticipating place cell sequences in open-field mazes, (2) an explanation for the role of sharp wave-ripple activity in spatial working memory, and (3) a prediction for the involvement of spatial learning in the development of place-cell sequences.

In his research internship, Konstantin Willeke has investigated a potential extension of the model to include learning the environmental topology in the recurrent connections between place cells. He will present additional simulation results.

Development of a parallel genetic algorithm for machine scheduling.

Martin Wegner

Wed, 21. 1. 2015, To be announced

Genetic algorithms are regarded as powerful solution methods for practically relevant planning problems in production and logistics. The goal of the thesis is to develop a genetic algorithm with a regional model and to compare the performance of this concept with other solution methods. The special feature of this form of parallelization is the use of several linked populations.

A mathematical theory of action with application to microeconomics.

Radomir Pestow

Wed, 21. 1. 2015, Room 1/336

An agent-based modeling tool and conceptual framework are presented with which dynamic processes, in particular material, psychological and social processes, can be described exactly. This framework is then applied, by way of example, to two economic models.

Using a convolutional neural network for cancer detection - the computational model

Arash Kermani

Wed, 17. 12. 2014, Room 1/336

In this talk, the paper 'Mitosis Detection in Breast Cancer Histology Images with Deep Neural Networks' (by Dan C. Cireșan, Alessandro Giusti, Luca M. Gambardella, Juergen Schmidhuber) will be discussed. This second part will focus on the computational model.

Reinforcement Learning in Multi Agent Systems

Joseph Gussev

Wed, 10. 12. 2014, Room 1/336

The task of a Q-learning agent is to learn which actions it should take to receive the maximum of a certain numerical reward. This presentation gives an overview of how agents in multi-agent systems can learn and work together with, or compete against, other agents in the same system. A solution is also presented that focuses on the 'credit assignment problem' occurring when agents work together on a joint task.

Using a convolutional neural network for cancer detection

Arash Kermani

Wed, 26. 11. 2014, Room 1/336

In this talk, the paper 'Mitosis Detection in Breast Cancer Histology Images with Deep Neural Networks' (by Dan C. Cireșan, Alessandro Giusti, Luca M. Gambardella, Juergen Schmidhuber) will be discussed. The focus will be on the deep learning method used for cancer detection.

A unified system-level model of visual attention and object substitution masking

Frederik Beuth

Wed, 5. 11. 2014, Room 1/336

The phenomena of visual attention (Hamker, 2005, Cerebral Cortex) and object substitution masking (OSM; DiLollo and Enns, 2000) are supposed to rely on different processes. However, we will show that OSM can be accounted for by well-known attentional mechanisms within a unified model.

SVM balancing and object detection

Dr. Patrick Ott (Continental AG) and Surajit Dutta

Mon, 27. 10. 2014, Room 1/336

Dr. Patrick Ott will present an overview of the research conducted at Continental AG on object detection. Surajit Dutta will then present the topic of his master thesis on balancing training sets for support vector machines and its application to the detection of rare events.

Implementation of bar learning based on a triplet STDP Model by Clopath et al.

Rene Larisch

Wed, 22. 10. 2014, Room 1/336

Models of spike-timing-dependent plasticity (STDP) are based on the temporal offset between the spikes of the post- and presynaptic neurons. We implemented an STDP model with triplet characteristics, proposed by Clopath, Büsing, Vasilaki & Gerstner (2010), in a network with 20 excitatory and 20 inhibitory neurons to learn bars and to study the model dynamics, such as the homeostatic mechanism and the effect of the triplet property.

Role of competition in robustness against loss of information in feature detectors.

Arash Kermani

Tue, 26. 8. 2014, Room 1/336

In this talk different methods of competition among units of feature detectors will be explained. As a new criterion for effectiveness of competition, robustness of classification under loss of information will be discussed.

Motion detection and receptive field dynamics in early vision processes.

Tobias Höppner

Thu, 19. 6. 2014, Room 1/336

Motion detection and processing is a striking feature of the visual system. Although higher visual areas play an important role in motion processing, the basic detection already begins in retinal ganglion cells. A model based on simple ODEs will be presented and compared to existing approaches.

Energy and execution time models for an efficient execution of scientific simulations

Jens Lang

Thu, 5. 6. 2014, Room 1/336

The energy efficiency of computation is gaining importance in scientific computing, as energy consumption has a significant impact on the cost of operating large clusters. The thesis presented in this talk uses model-based autotuning to adapt the execution of scientific simulations to the underlying hardware and thus make it more efficient. The term 'efficiency' here covers both execution time and energy.

The role of the subthalamic nucleus globus pallidus loop in habit formation.

Javier Baladron Pezoa

Thu, 15. 5. 2014, Room 1/336

The role of the feedback loop between the subthalamic nucleus and the external section of the globus pallidus in the generation of abnormal oscillations in Parkinson's disease has been widely studied both theoretically and experimentally but its role during the learning of stimulus-action association is unknown.
In this presentation I will describe an extension of our spiking model of the basal ganglia that provides new theoretical insights about the function of the subthalamic nucleus during habit formation. This new approach was developed by including the connections between the STN and the GPe that were not present in previous models.
The new network predicts that the STN plays an important role during reversal learning by maintaining the inhibition of alternative actions that was learned by the D2 cells in the striatum.

Development of a parallel genetic algorithm for machine scheduling (diploma thesis topic presentation)

Martin Wegner

Thu, 8. 5. 2014, Room 1/336

Genetic algorithms are regarded as powerful solution methods for practically relevant planning problems in production and logistics. The goal of the thesis is to develop a genetic algorithm with an island model and to compare the performance of this concept with other solution methods. The special feature of this form of parallelization is the use of several linked populations.
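
The idea of several linked populations can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis algorithm: it optimizes the OneMax bit-counting objective instead of a machine scheduling problem, runs the islands sequentially rather than in parallel, and the names `evolve_island`, `migrate` and `run`, the ring topology and all parameter values are hypothetical.

```python
import random


def fitness(bits):
    """Toy objective (OneMax): number of ones in the bit string."""
    return sum(bits)


def evolve_island(pop, rng, mutation_rate=0.05):
    """One generation on one island: elitism, binary tournaments,
    one-point crossover and bit-flip mutation."""
    children = [max(pop, key=fitness)[:]]  # keep a copy of the best
    while len(children) < len(pop):
        p1 = max(rng.sample(pop, 2), key=fitness)
        p2 = max(rng.sample(pop, 2), key=fitness)
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
        children.append(child)
    return children


def migrate(islands):
    """Ring migration: the best individual of each island replaces
    the worst individual of the next island."""
    bests = [max(pop, key=fitness)[:] for pop in islands]
    for i, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda k: fitness(pop[k]))
        pop[worst] = bests[(i - 1) % len(islands)]


def run(n_islands=4, pop_size=10, n_bits=20, generations=30,
        interval=5, seed=0):
    """Evolve all islands and migrate every `interval` generations."""
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for g in range(1, generations + 1):
        islands = [evolve_island(pop, rng) for pop in islands]
        if g % interval == 0:
            migrate(islands)
    return max(fitness(ind) for pop in islands for ind in pop)


best = run()
```

In a parallel implementation each island would typically run on its own core or node, with only the migrants exchanged between them, which keeps communication low.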

A Possible Role of the Basal Ganglia in the Spatial-To-Temporal Transformation in Saccadic Eye Movements: A Computational Model

Abbas Al Ali

Wed, 16. 4. 2014, Room 1/368a

A new model system consisting of a model of the superior colliculus (SC), a model of the basal ganglia, and a saccade generator will be presented. The superior colliculus is known to be involved in controlling saccades. Saccade vectors encoded as activity on the SC motor map are transformed into a temporal code, which results in the stereotyped dynamic saccadic behaviour. Based on the proposed model, a colliculo-thalamo-basalganglio-collicular oculomotor loop may play a role in generating the temporal profiles of the SC neurons, pointing to a role of the basal ganglia in controlling saccade dynamics.

Implementation and optimization of a trace learning model on a Coil100 database (Bachelor's defense).

Daniel Buchholz

Thu, 10. 4. 2014, Room 1/336

This talk briefly outlines the basics of a trace learning model and its application to the Coil100 database. It then shows how, with a suitable classification, recognition rates of well over 90% can be achieved and how the effectiveness of the model can be demonstrated. Finally, further possibilities and problems of this model are presented.

The influence of emotions on reinforcement processes in a virtual reality

Winfried Lötzsch

Tue, 25. 2. 2014, Room 1/336

Autonomous agents and their control mechanisms can be influenced in their decision processes by the environment or by internal states. Based on the "Besondere Lernleistung" on this topic, the conceptual extension of an existing model is described. The graphical simulation of the agent and its environment is also a focus.

Implementation of state abstraction mechanisms for reinforcement learning agents (BA defense)

Fabian Bolte

Wed, 12. 2. 2014, Room 1/368

In the project 'Smart Virtual Worker', a tool for the simulation and evaluation of work processes is being developed. Creating a simulation requires a great deal of work or expert knowledge. To reduce the time needed to create a simulation, the virtual worker is to be equipped with the ability to learn action sequences autonomously. The talk presents a current agent variant that was extended with state abstraction mechanisms as part of the Bachelor's thesis.

Dopamine ramps up?

Julien Vitay and Helge Dinkelbach

Tue, 14. 1. 2014, Room 1/336

A recent experiment (Howe et al., Nature 2013) showed that the dopamine concentration in the striatum increases linearly during simple T-maze experiments. We will collectively discuss the functional implications of this finding and present a preliminary explanation by Samuel Gershman using quadratic functions of the distance to the goal in a modified TD algorithm. We will end with a discussion of the proposed new features of ANNarchy 4.0.

Development of a software module for implementing cortical magnification (Bachelor's defense)

Andreas Heinzig

Tue, 17. 12. 2013, Room 1/336

This Bachelor's thesis deals with the implementation of a software module that realizes cortical magnification. Cortical magnification is an effect that occurs during the transmission and processing of image information in the brain. The exact task comprises the development and testing of a software module that is written in C++ and implements a parameterizable cortical magnification. It should transform 1-channel and N-channel images, individually and as image sequences, using one or more suitable interpolation methods. It should be possible to call it as a standalone program that accepts binary files as input images and writes the output images as binary files again. However, the module should also work in combination with a virtual environment. It should be able to perform a bidirectional transformation between visual and cortical space, and it should also be possible to transform parametric, geometric 2D figures (points, lines, circles, ...).
This was accomplished with two demo programs. The first is a standalone program that reads a binary file, transforms the image and writes it out again as a binary file. The image is written to a binary file using Matlab, and the output image is read back from the output file and displayed. Second, the module was integrated as a class into the VR demo (virtual reality).
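
A bidirectional mapping between visual and cortical space, as described above, can be illustrated with the widely used complex-logarithm (monopole) model w = k·log(z + a). This is only a sketch in Python rather than the C++ module from the thesis, and the abstract does not say which mapping the thesis uses: the function names and the parameter values `k` and `a` are assumptions chosen for illustration, and image interpolation is omitted.

```python
import cmath


def visual_to_cortical(x, y, k=15.0, a=0.7):
    """Map a visual-field point (x, y) to cortical coordinates (u, v)
    with the complex-log model w = k * log(z + a).
    k and a are illustrative parameters, not values from the thesis."""
    w = k * cmath.log(complex(x, y) + a)
    return w.real, w.imag


def cortical_to_visual(u, v, k=15.0, a=0.7):
    """Inverse mapping: z = exp(w / k) - a."""
    z = cmath.exp(complex(u, v) / k) - a
    return z.real, z.imag


def magnification(x, y, k=15.0, a=0.7):
    """Local magnification |dw/dz| = k / |z + a|; it is largest near the
    centre of the visual field and falls off with eccentricity."""
    return k / abs(complex(x, y) + a)


# Round trip for a hypothetical point at 5 degrees right, 2 degrees up.
u, v = visual_to_cortical(5.0, 2.0)
x_back, y_back = cortical_to_visual(u, v)
```

Transforming a parametric figure such as a line or circle then amounts to applying `visual_to_cortical` to sampled points of the figure.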

Timing and expectation of reward: a model of the afferents to VTA

Julien Vitay

Tue, 10. 12. 2013, Room 1/368a

As reflected by the firing patterns of dopaminergic neurons in the ventral tegmental area, temporal expectation is an important component of Pavlovian conditioning. Predicting when a reward should be delivered after the onset of a predicting cue makes it possible both to reduce the associated surprise and avoid over-learning, and to estimate when one should be disappointed by the omission of an expected reward. Several models of the dopaminergic system during conditioning exist, but the substrate of temporal learning is rather unclear. We propose a neuro-computational model of the afferent network to the ventral tegmental area, including the lateral hypothalamus, the pedunculopontine nucleus, the amygdala, the ventromedial prefrontal cortex, the ventral basal ganglia (including the nucleus accumbens and the ventral pallidum), as well as the lateral habenula and the rostromedial tegmental nucleus. Based on a plausible connectivity and realistic learning rules, this neuro-computational model reproduces several experimental observations, such as the progressive cancellation of dopaminergic bursts at reward delivery, the appearance of bursts at the onset of reward-predicting cues or the influence of reward magnitude on activity in the amygdala and ventral tegmental area. While associative learning occurs primarily in the amygdala, learning of the temporal relationship between the cue and the associated reward is implemented as a dopamine-modulated coincidence detection mechanism in the nucleus accumbens.

A spiking neural network based on the basal ganglia functional anatomy

Javier Baladron Pezoa

Tue, 3. 12. 2013, Room 1/336

In this talk I will present a new spiking neural network whose connectivity is defined following anatomical descriptions of the different cortico-thalamic pathways. The network is capable of learning action-response associations using dopamine-modulated spike-timing-dependent plasticity. The functionality of each pathway is defined following the paper by Schroll et al. (2013), and I will emphasize the differences between this approach and previous spiking models of the basal ganglia.

Locomotive Brain Model for Humanoid Robots

Dr. John Nassour

Tue, 12. 11. 2013, Room 1/336

Advances in Neuro-robotics

Dr. Andrea Soltoggio

Tue, 5. 11. 2013, Room 1/368a

Tourette Syndrome - State of the current research and the generation of hypotheses with a computer model of the basal ganglia.

Karoline Griesbach

Tue, 5. 11. 2013, Room 1/336

Tourette syndrome (TS) is a developmental neurological disorder. Research discusses different causes of TS, for example aberrant patterns of neuronal activity, for instance in neurons of the striatum, or abnormalities of neurotransmitters such as dopamine. In my Bachelor thesis I summarize the state of research and generate hypotheses with the help of a computer model based on the model by Hamker, Schroll & Vitay (2013). The goal is to show the characteristics of TS, especially the tics, using a task-switching paradigm that the model has to solve.

Scalable item recommendation in big data and search indexes (diploma thesis defense)

Tolleiv Nietsch

Tue, 29. 10. 2013, Room 1/336

The integration of search technologies with machine learning technology provides various ways to personalize search results and to improve search quality for the users. In my diploma thesis, two possible ways to reach this goal are described along with the requirements for scalability. The thesis compares a web-service-based solution with an integrated matrix factorization approach and describes the architectural requirements for building both systems with open-source tools such as Apache Mahout and Apache Solr. In this presentation, I will summarize the work from my diploma thesis and present the results I found and the methods I used.

Category learning by using a visual motor policy including eye movements

Robert Blank

Wed, 17. 7. 2013, Room 1/336

The basal ganglia (BG) play an important role in cognitive processes such as decision making and eye movements. In this work, the control of saccades is investigated. A model has to solve an abstract visual categorization task and additionally has to learn how to acquire visual information. An external process executes saccades randomly, so the model can gather reliable information through saccades to diagnostic features. Thus, the model has to learn to wait for these saccades, which is possible through the interplay of the direct, indirect and hyperdirect pathways of the basal ganglia.

Intrinsically Motivated Action-Outcome Learning and Goal-Based Action Recall

Orcun Oruc

Tue, 16. 7. 2013, Room 1/336

I will present the article 'Intrinsically motivated action-outcome learning and goal-based action recall: A system-level bio-constrained computational model' by Baldassarre, Redgrave, Gurney and colleagues (2013). This model investigates the influence of intrinsic motivations (based on novelty and curiosity) on the learning of a rewarded exploration task (extrinsic motivation). It is composed of three cortico-basal ganglia loops that separately process arm movements, saccade generation and outcome evaluation but are coordinated by dopaminergic modulation.

Intrinsic plasticity

Lucas Keller

Tue, 9. 7. 2013, Room 1/336

Inhibition through inhibitory interneurons as replacement for lateral inhibition

Michael Göthel

Tue, 2. 7. 2013, Room 1/336

The lateral inhibition between two or more excitatory neurons is to be replaced by a new layer of inhibitory neurons that produces the same network behavior as lateral inhibition. Using the results of experiments with two types of example networks, I will discuss whether this is even possible and how to deal with problems such as choosing the right limit for the weights (alpha value) in the network.

Sparse coding and non-negative matrix factorization

Dr. Steinmüller

Tue, 25. 6. 2013, Room 1/336

The talk covers the topics 'Sparse solutions of systems of equations', 'Sparse modeling of signals and images' and 'Non-negative matrix factorization as the generalization of PCA and ICA'.
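
As an illustration of the last topic, the following is a minimal sketch of non-negative matrix factorization using the multiplicative update rules of Lee and Seung. It is not taken from the talk; the function name `nmf`, the toy matrix `V` and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (m x n) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps   # random non-negative initialisation
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep all entries non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(V, W, H):
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Toy data (assumption): a non-negative matrix of exact rank 2.
data_rng = np.random.default_rng(1)
V = data_rng.random((6, 2)) @ data_rng.random((2, 8))

W, H = nmf(V, rank=2)
rel_err = relative_error(V, W, H)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Unlike PCA, both factors stay non-negative, so each column of `W` can be read as an additive part of the data.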

Biological foundations for computational models of structural plasticity

Maxwell Shinn

Tue, 18. 6. 2013, Room 1/B006

Structural plasticity is the ability of a neural network to dynamically modify its connection patterns by changing the physical structure of the neurons. In this presentation, I discuss what biological experiments have taught us about structural plasticity, and how these results can be used to build more accurate computational models of plastic neural networks.

Animation and simulation of musculoskeletal models - examples from AnyBody

H. Niemann

Tue, 11. 6. 2013, Room 1/336

Each human being has over 200 bones, over 300 joints and more than 600 muscles (Gottlob, 2009). Today they can be modelled only under certain assumptions and simplifications. One goal of musculoskeletal models is visualisation/animation that gives a vivid impression of how muscles work (including dysfunction, as in Parkinson's disease). The talk presents an application-oriented introduction to musculoskeletal models using the AnyBody Modelling System. Examples of animation and simulation are shown for movements of the lower extremities. An integrated discussion addresses the current possibilities and limits of musculoskeletal models. The majority of researchers use inverse kinematics to provide the inputs of musculoskeletal models. Alternatives are still dreams of the future, but first steps are shown by introducing motoneurons as inputs. In current research, a full replacement for inverse kinematics is not yet available.

ANNarchy 3.1 - feature discussion meeting

Helge Dinkelbach

Tue, 21. 5. 2013, Room 1/336

It is planned to introduce a new interface and new features in the neuronal network simulator, ANNarchy. This meeting is intended to present the current version of ANNarchy and to discuss ideas of the next version (ANNarchy 3.1).

Modeling the SSD task of a patient with lesioned thalamus

Julia Schuster

Tue, 14. 5. 2013, Room 1/336

While observing the environment, our eyes move to different points of attention several times each second. Nevertheless, we perceive the visual world as stable, and as a result small displacements of visual targets cannot be detected well during such eye movements, a phenomenon called saccadic suppression of displacement (SSD). Recently, Ostendorf et al. presented data of a patient with a right thalamic lesion showing a bias towards perceived backward displacements for rightward saccades in the SSD task. To better understand the nature of the behavioral impairment following the thalamic lesion, we applied a computational model developed by Ziesche and Hamker to simulate the patient. In this presentation, I will introduce the model used as well as the results of the simulation.

Learning Categories by an interaction between fast and slow plasticity

Francesc V. Escudero

Tue, 7. 5. 2013, Room 1/336

I will explain how we think category learning happens in the brain. The aim of the presentation is to introduce the background of the project I am about to start working on.

A computational model of hippocampal forward replay activity at decision points

Lorenz Goenner

Tue, 30. 4. 2013, Room 1/336

During rodent navigation, hippocampal place cells are activated in a sequence corresponding to the animal's trajectory. A recent topic of interest is neural activity corresponding to the replay of sequences of place cells during sleep and awake resting, and its possible link to learning and memory. I will present a model for the generation of forward replay activity in a T-maze. It becomes evident that correct replay requires the disambiguation of overlapping sequences.

Combining scalable Item-Recommendation and Search Indexes

Tolleiv Nietsch

Tue, 16. 4. 2013, Room 1/336

Running internet services today often comes with the requirement to make large amounts of data available to a large number of users. Flexible search systems that make information easy to find are one step towards that. Collaborative recommendation algorithms could be used to enrich and personalise search results and to increase the benefit for individual users. The presentation will give an overview of some of the theoretical foundations, cover the work done so far in the related diploma thesis and explain the chosen system setup along with the related open-source software tools.
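
To make the idea of collaborative recommendation concrete, here is a minimal sketch of item-based collaborative filtering with cosine similarity. It is not the system described in the thesis; the rating matrix `R` and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def item_similarity(R):
    """Cosine similarity between the item columns of a user x item rating matrix."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0                 # avoid division by zero for unrated items
    S = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(S, 0.0)                # an item should not recommend itself
    return S

def recommend(R, user, k=2):
    """Indices of the k items the user has not rated yet, best score first."""
    scores = item_similarity(R) @ R[user]   # similarity-weighted sum of the user's ratings
    scores[R[user] > 0] = -np.inf           # exclude items the user already rated
    return np.argsort(scores)[::-1][:k]

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
R = np.array([[5, 4, 0, 0],
              [4, 5, 0, 1],
              [0, 0, 5, 4],
              [5, 0, 0, 0]], dtype=float)

top = recommend(R, user=3, k=2)
print("recommended items for user 3:", top)
```

In a search setting, such scores could be blended with the text-relevance score of each hit to personalise the ranking; how the two are weighted is a design choice not specified in the abstract.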

Numerical analysis of large scale neural networks using mean field techniques

Javier Baladron Pezoa (NeuroMathComp Project Team, INRIA)

Tue, 12. 2. 2013, Room 1/336

In the first part of this talk I will introduce a new mean field reduction for noisy networks of conductance-based model neurons. This approach allows us to describe the dynamics of a network by a McKean-Vlasov-Fokker-Planck equation. In the second part of this talk I will show several simulations whose objective was to study the behavior of extremely large networks (done with a GPU cluster).

Integration of cognitive models for the development of a virtual agent

Michael Schreier

Mon, 28. 1. 2013, Room 1/336

The talk presents the neural implementation of a cognitive agent in a virtual reality. As part of an internship, neural models of several brain areas (visual cortex, frontal eye field, basal ganglia) were integrated to simulate a cognitive agent. The aim is to provide a brain simulation for future research.

A neurobiologically founded, computational model for visual stability across eye movements

Arnold Ziesche

Thu, 3. 1. 2013, Room 1/336

ANNarchy 3.0. Presentation of the user-interface.

Julien Vitay and Helge Dinkelbach

Mon, 10. 12. 2012, Room 1/208A

ANNarchy 3.0 is the new version of our neural simulator. The core of the computations is written in C++, optionally parallelized using OpenMP. Its structure has changed quite a lot since ANNarchy 1.3 or 2.0, so newcomers as well as experienced users may equally benefit from the presentation.
The major novelty is that the main interface is now in Python, thanks to the Boost.Python library. It makes it easy to define the structure of a network, visualize its activity and find the correct parameters through a high-level scripting language. The basics of this interface will be presented.

Cortico-basal ganglia loop and movement disorders

Atsushi Nambu (Division of System Neurophysiology, National Institute for Physiological Sciences, Japan)

Mon, 3. 12. 2012, Room 1/336

Modeling of stop-signal tasks using a computational model of the basal ganglia.

Christian Ebner

Mon, 12. 11. 2012, Room 1/336

Selecting appropriate actions and inhibiting undesired ones are fundamental functions of the basal ganglia (BG). We examine whether an existing computational model of the BG can learn to inhibit premature decisions in a suitable task. The possible influence of the hyperdirect pathway will be illustrated.

Spike timing dependent plasticity and Dopamine in the Basal Ganglia.

Simon Vogt

Wed, 7. 11. 2012, Room 1/336

Our brain's basal ganglia have historically been implicated in a wide range of brain functions, including sensory and motor processing, emotions, drug addiction, decision making, and many more. They are also the main area involved in neurological diseases like Parkinson's disease or Huntington's chorea. Current clinical treatments tend to only delay symptoms for a limited time, pharmaceutically or through surgery, and seem to have followed a trial-and-error approach to understanding the basal ganglia. In order to give a better explanation for neurological diseases and the many side effects that today's clinical treatments cause, we should try to understand the basal ganglia's network dynamics, learning paradigms, and single-spike timing features from an information processing perspective.
In my talk, I will re-examine the basis for how high-level reinforcement learning is often assumed to be implemented within the spiking networks of the basal ganglia's Striatum, and show how lower-level dopaminergic modulation of synaptic transmission may guide higher-level network learning while also displaying instant responses in neuronal spiking activity.
By improving our understanding of the neural spike code of the striatum and other basal ganglia nuclei to a depth where we can truly decode multisite electrophysiological recordings of spiking neurons, we will be able to devise better, more informed treatments of typical basal ganglia disorders in the future.

The guidance of vision while learning categories - A computational model of the Basal Ganglia and Reinforcement Learning

Robert Blank

Mon, 5. 11. 2012, Room 1/336

The basal ganglia (BG) play a very important role in cognitive processes like decision making or body movement control. A biologically plausible model of the BG solving an abstract visual categorization task will be presented in this work. The model learns to classify four visual input properties into two categories, while only two of the properties, called diagnostic features, are important. The results show that the model is not able to differentiate between diagnostic features and unimportant ones. In the end, it just memorizes the presented input and shows only weak generalization abilities.

Fear conditioning in the Amygdala

René Richter

Thu, 5. 7. 2012, Room 1/367

How do we connect emotions with objects and situations? We will try to answer this question for the special case of fear and take a look at a computational model of the amygdala. We will also take a closer look at one part of the amygdala, the basal amygdala, where context conditioning most likely takes place.

Learning of V1-simple cell receptive fields using spike-timing dependent plasticity

René Larisch

Thu, 7. 6. 2012, Room 1/336

Spike-timing dependent plasticity, or STDP for short, represents an interesting option in the study of neuronal networks. This presentation will give an overview of three different approaches to realising an STDP network. The focus is on the main features and mechanisms, and on how these models realise the development of receptive fields.
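
For readers unfamiliar with STDP, the following sketch shows the textbook pair-based STDP window with exponential decay. It is not necessarily one of the three approaches covered in the talk; the amplitudes and time constants are illustrative assumptions.

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one spike pair, with dt = t_post - t_pre in milliseconds."""
    if dt > 0:    # presynaptic spike before postsynaptic spike: potentiation
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:    # postsynaptic spike before presynaptic spike: depression
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0

def total_dw(pre_spikes, post_spikes):
    """Sum the pairwise contributions over all pre/post spike pairs."""
    return sum(stdp_dw(t_post - t_pre)
               for t_pre in pre_spikes
               for t_post in post_spikes)

# Hypothetical spike times in ms: the postsynaptic neuron fires shortly after each input spike.
causal = total_dw(pre_spikes=[10.0, 50.0], post_spikes=[15.0, 55.0])
print(f"net weight change: {causal:+.5f}")
```

The closer two spikes are in time, the larger the change; the sign depends only on their order.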

Comparison of GPU (graphics processing unit) and CPU implementations for neuronal models.

Helge Ülo Dinkelbach

Thu, 24. 5. 2012, Room 1/336

In the past years, the computational potential of multiprocessors (CPUs) and graphics cards (GPUs) has increased. On the other hand, it is getting more and more complicated to evaluate which technologies are usable for a certain computational problem. The available computation frameworks (OpenCL and CUDA for GPUs; OpenMP for CPUs) are not always easy to handle in all use cases. The objective of this talk is to show how the potential of modern parallel hardware could be used for the computation of neuronal models. Additionally, some limitations will be discussed.

Categorization - How humans learn to understand their environment and to select appropriate actions.

Frederik Beuth

Thu, 10. 5. 2012, Room 1/336

Dividing objects into categories is one of the remarkable human abilities, but its neuronal basis and its impressively fast execution are still poorly understood. I will introduce the neuronal foundations of all three components involved in visual-based categorization: 1) visual information, 2) categories, 3) actions, and how they are linked together. In the main part, I will give an overview of how humans learn to understand their environment and to select appropriate actions. Humans rely on at least three different systems for learning 1) motor skills, 2) concepts and 3) hypotheses. In combination, these systems result in very powerful learning for selecting the correct response in a specific situation. From a computer science perspective, this approach could be used for object recognition.

The Hippocampus - a brain structure for spatial and episodic memory

Lorenz Gönner

Thu, 3. 5. 2012, Room 1/336

Even after decades of research, the hippocampus continues to inspire researchers with a wealth of phenomena. As an introduction to the field of hippocampus research, I will provide a panoramic view of past and present topics in both behavioral and computational neuroscience. A focus will be on the role of the hippocampus in the learning of goal-directed behavior.

Basal ganglia pathways: Functions and Parkinsonian dysfunctions

Henning Schroll

Thu, 26. 4. 2012, Room 1/336

I will review the functional contributions of basal ganglia pathways to reinforcement learning and address their dysfunctions in Parkinson's disease. Using computational modeling, I will attempt to integrate functions and dysfunctions in a single analytical framework.

Motion detection and receptive field dynamics in early vision processes

Tobias Höppner

Thu, 19. 4. 2012, Room 1/336

Temporal changes in receptive field structures are a field of growing interest. The early vision processing stages are not only concerned with spatial decorrelation of visual representations but also process the temporal information conveyed by the retinal signal. Therefore changing receptive field structures as seen in physiological experiments are investigated. Notably biphasic responses are considered to form lagged and nonlagged neuronal responses which in turn are a promising cause for spatio-temporal receptive field dynamics.

An Extensible and Generic Framework With Application to Video Analysis

Marc Ritter

Thu, 12. 4. 2012, Room 1/336

This presentation gives insights into the outcomes of the project sachsMedia in the field of metadata extraction by video analysis while introducing a holistic, unified and generic research framework that is capable of providing arbitrary application dependent multi-threaded custom processing chains for workflows in the area of image processing.

Runtime-optimal partitioning of neural networks (bachelor's thesis defense).

Falko Thomale

Wed, 8. 2. 2012, Room 1/368a

This bachelor thesis investigates the parallel execution of the neural network simulator ANNarchy with the help of the OpenMP API, taking advantage of the NUMA architecture of modern computers. The NUMA architecture allows faster memory access when the simulation memory is split into multiple partitions and each partition runs on a separate NUMA node. The modifications are tested with random neural networks and with two networks of practical importance, and the test results of the memory placement improvements are shown and discussed.

Computational Modelling of the Oculomotor System.

Abbas Al Ali

Thu, 19. 1. 2012, Room 1/336

Previous work of our group has produced a series of models describing the crucial role of the basal ganglia (BG) in learning rewarded tasks. The BG have been shown to be a central part of many cortico-BG-thalamo-cortical loops, such as the visual working memory loop and the motor loop, within which dopamine is the reward-related learning modulator. The BG are also known to play a role in controlling purposive rapid eye movements (saccades). Saccades are driven by the superior colliculus (SC) in the brain stem, which receives connections from many visually related cortical areas as well as from the BG. We want to transfer the knowledge gained from previous models and integrate it into a distributed oculomotor system that contains a BG model, the frontal eye field, the SC and some other cortical areas in order to investigate the emergent behaviour in learning rewarded reactive and goal-guided saccadic eye movement tasks.

Fast GPGPU-based simulation of neural networks (master's thesis defense)

Helge Ülo Dinkelbach

Thu, 15. 12. 2011, Room 1/336

Simulations in computational neuroscience have long runtimes but are at the same time well suited to parallelization. Modern graphics cards have a very large number of compute cores available for executing parallel programs. The presentation describes how the neural simulator developed at the professorship was accelerated, using CUDA and object orientation as approaches.

Computational Model for Learning Features in Area V2

Norbert Freier

Thu, 8. 12. 2011, Room 1/336

Computer vision is often used for object recognition, but it cannot handle every situation. The brain of primates outperforms every available method. Future algorithms may reproduce the techniques of the brain. It is therefore necessary to know how the visual system works. As a result of this Studienarbeit (student research project), a computational model of area V2 will be presented and compared to cell recordings and other related work.

Runtime-optimal partitioning of neural networks.

Falko Thomale

Thu, 24. 11. 2011, Room 1/336

Simulating large neural networks requires a lot of computing power. The neural simulator ANNarchy, developed at the professorship, already uses OpenMP for parallelized execution on multicore systems. In this talk I will take a closer look at this approach and present a possible improvement of the parallelization, in which the simulated neural network is optimally distributed across the processing units.

NNSpace - A generic infrastructure for neural networks

Winfried Lötzsch

Thu, 17. 11. 2011, Room 1/336

Using modern neural networks efficiently often requires complex construction processes based on several algorithms and network structures. Until now, automated generation of such networks without programming effort was possible only in certain special cases. By making it possible to combine neural networks, the NNSpace infrastructure allows many use cases to be assembled from standard building blocks. Moreover, such a combination could increase the performance and general applicability of neural networks. The talk presents the infrastructure using an example and explains how it works internally.

Development of a cognitive-emotional interaction model of the amygdala and the hypothalamus.

Martina Truschzinski

Thu, 27. 10. 2011, Room 336

Emotions are important components of human consciousness and contribute substantially to the efficiency and performance of the brain. They influence cognitive processes such as perception, the ability to learn and subjective evaluations, as well as the response mechanisms generated on the basis of these cognitive processes. Starting from the MOTIVATOR model (Dranias, Grossberg and Bullock, 2008), a cognitive-emotional model based on new neuroanatomical and physiological findings is presented. The focus is on the interaction between the brain areas of the amygdala and the hypothalamus.

Alterations in basal ganglia pathways impair stimulus-response learning in Parkinson's Disease: A computational model.

Henning Schroll

Thu, 13. 10. 2011, Room 336

I present a computational model of how basal ganglia pathways contribute to stimulus-response learning. When Parkinsonian lesions are introduced into this model, an imbalance in the activities of the basal ganglia pathways arises and causes learning deficits typical of Parkinsonian patients.

Hierarchical Reinforcement Learning

Vincent Küszter

Wed, 29. 6. 2011, Room 1/B309a

Machine learning, which includes reinforcement learning, is an important branch of artificial intelligence and robotics. Classical reinforcement learning, as used for decision making in software and hardware agents, has several drawbacks, however: it scales poorly and generalizes poorly. An approach by Matthew M. Botvinick, Yael Niv and Andrew C. Barto tries to solve these problems with the help of hierarchical policy structures. This talk presents the method and its neural foundations.

Support Vector Machines - An Introduction

Stefan Koch

Wed, 22. 6. 2011, Room 1/B309a

Developing artificial intelligent systems requires powerful classifiers. In the 1990s, Vladimir Vapnik and Alexey Chervonenkis presented studies of the statistical properties of learning algorithms. Support vector machines (SVMs) were developed on the basis of their work. They represent a new generation of statistical classifiers that claim to achieve high performance in real-world applications. Because they are easy to apply to many problems, they have become increasingly popular in recent years. The research seminar gives an introduction to the topic of support vector machines.
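
As a small illustration of the classifier family discussed in this talk, the sketch below trains a linear soft-margin SVM by subgradient descent on the regularised hinge loss. It is a simplified stand-in for the quadratic-programming solvers normally used and is not taken from the talk; the toy data, function names and hyperparameters are assumptions.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss; labels y must be -1 or +1."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                      # samples inside the margin or misclassified
        grad_w = lam * w - (y[viol][:, None] * X[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    return np.where(X @ w + b >= 0, 1, -1)

# Toy data (assumption): two well-separated Gaussian clusters in 2D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.5, (20, 2)), rng.normal(-2.0, 0.5, (20, 2))])
y = np.array([1] * 20 + [-1] * 20)

w, b = train_linear_svm(X, y)
accuracy = (predict(X, w, b) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Only the samples that violate the margin contribute to the gradient, which mirrors the role of support vectors in the exact solution.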

Searching for information in memory results in eye movements to the location where the information was acquired.

Agnes Scholz

Wed, 15. 6. 2011, Room 1/B309a

When retrieving information from memory, people look back at the location where the information was acquired, even if the sought information is no longer present there (Ferreira, Apel & Henderson, 2008, Trends in Cognitive Science, 12(11), 405). Two experiments investigated this gaze phenomenon during the recall of previously heard attribute values of fictitious objects and during hypothesis testing. Tracking eye movements during recall and diagnostic reasoning makes it possible to observe memory-based processes of information search. The role of visuospatial attention processes in explaining these findings is discussed.

Object recognition using the generalized Hough transform based on the parallel Hough transform

Abbas Al Ali

Wed, 8. 6. 2011, Room 1/B309a

The HT (Hough transform) is a widely used method in object recognition that serves to detect parametric curves in digital images. The GHT (generalized HT) is one of many variations of the HT and is used to detect arbitrary curves, with gradient information of the edges in a grayscale image voting for the presence of a curve. The so-called PHT (parallel HT), recently developed at the Fraunhofer Institute for Digital Media Technology IDMT, is a real-time system that locally detects predefined patterns, such as line segments or circular arcs, in an image. The talk briefly explains the PHT for line segments and presents a GHT implementation that uses the PHT system as a feature extractor to obtain shape models in the form of R-tables (reference tables) and to recognize objects. Tests on synthesized images show that ideal classification properties (a recognition rate of 100% with a false-positive rate of 0%) are achievable.
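
The voting principle behind all of these variants can be sketched with the classical Hough transform for straight lines. The example below is neither the GHT nor the PHT system from the talk; the function name, the image size and the test points are assumptions made for illustration.

```python
import numpy as np

def hough_lines(points, shape, n_theta=180):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta)."""
    height, width = shape
    diag = int(np.ceil(np.hypot(height, width)))            # largest possible |rho|
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=int)      # rows: rho + diag, columns: theta
    cols = np.arange(n_theta)
    for x, y in points:
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, cols] += 1                          # one vote per angle
    return acc, thetas, diag

# Hypothetical edge points: a horizontal line y = 5 in a 20 x 20 image.
points = [(x, 5) for x in range(20)]
acc, thetas, diag = hough_lines(points, shape=(20, 20))

votes = acc[diag + 5, 90]   # cell for rho = 5, theta = 90 degrees
print("votes for the line y = 5:", votes)
```

Every edge point votes for all lines passing through it, so collinear points pile up in one accumulator cell. The GHT replaces the analytic line equation with a lookup in an R-table.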

Learning invariance from natural images inspired by observations in the primary visual cortex

Michael Teichmann

Tue, 26. 4. 2011, Room 1/336

The human visual system has the remarkable ability to recognize objects invariant of their position, rotation and scale. In part, this is likely achieved gradually from early to late areas of visual perception. For the problem of learning invariances, a set of Hebbian learning rules based on calcium dynamics and homeostatic regulation of single neurons is presented. The performance of the learning rules is verified within a model of the primary visual cortex that learns so-called complex cells from a sequence of static images. As a result, the responses of the learned complex cells are largely invariant to phase and position.

The contribution of basal-ganglia to working memory and action selection

Fred Hamker

Tue, 19. 4. 2011, 4/203 (Wilhelm-Raabe-Str.)

Psychology research colloquium - Seminar: Current topics in cognitive science

Development of an interface for simulating a cognitive agent with active environment interaction in a virtual reality

Xiaomin Ye

Mon, 17. 1. 2011, Room 1/336

One way to carry out interactive learning in the field of computational neuroscience is to represent the human body and its brain by an agent and the agent's surrounding world by a virtual reality (VR) environment. This diploma thesis defense presents a basic implementation of this research approach using the VR environment Unity3D. The agent has the following sensors: two eyes (two virtual cameras) and skin (collision sensors). It can also perform simple actions such as walking, grasping an object and eye/head movements.
The work focuses on the capabilities of Unity3D and on programming the interfaces between the VR environment and the agent. In addition, the work is demonstrated with a reinforcement learning agent that can find its way through a labyrinth.
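
How a reinforcement learning agent can learn a route to a goal is sketched below with tabular Q-learning on a small grid. This is not the Unity3D agent from the thesis; the 3 x 3 grid without inner walls, the reward scheme and all parameter values are assumptions chosen to keep the example short.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def step(state, action, size, goal):
    """Move inside a size x size grid; bumping into the border leaves the state unchanged."""
    row = min(max(state[0] + action[0], 0), size - 1)
    col = min(max(state[1] + action[1], 0), size - 1)
    nxt = (row, col)
    return nxt, (1.0 if nxt == goal else 0.0), nxt == goal

def train(size=3, goal=(2, 2), episodes=300, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * 4 for r in range(size) for c in range(size)}
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(50):                                   # cap the episode length
            if rng.random() < eps:
                a = rng.randrange(4)                          # explore
            else:
                best = max(Q[s])
                a = rng.choice([i for i in range(4) if Q[s][i] == best])  # exploit, random tie-break
            s2, reward, done = step(s, ACTIONS[a], size, goal)
            target = reward if done else reward + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])             # Q-learning update
            s = s2
            if done:
                break
    return Q

def greedy_path(Q, size=3, goal=(2, 2), max_steps=20):
    """Follow the highest-valued action from the start until the goal is reached."""
    s, path = (0, 0), [(0, 0)]
    while s != goal and len(path) <= max_steps:
        a = Q[s].index(max(Q[s]))
        s, _, _ = step(s, ACTIONS[a], size, goal)
        path.append(s)
    return path

Q = train()
path = greedy_path(Q)
print("greedy path:", path)
```

In a real labyrinth, `step` would additionally block moves into walls; the learning rule itself stays the same.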

The cognitive-emotional model MOTIVATOR

Martina Truschzinski

Wed, 12. 1. 2011, Room 1/336

The cognitive-emotional model MOTIVATOR, which was developed on the basis of neuroanatomical and neurophysiological findings and is to be adapted to new findings within the diploma thesis, is presented. The model, which is based on neural networks, provides an interaction between cognition, motivation and emotion. Adaptations and extensions of the MOTIVATOR model are presented and discussed.

Cognitive Architectures - An Introduction to ACT-R

Diana Rösler

Wed, 5. 1. 2011, Room 1/336

Cognitive architectures are developed to contribute to the understanding of human cognition. These modelling approaches are often based on theories from cognitive psychology. The talk presents ACT-R (Adaptive Control of Thought - Rational), one realization of a cognitive architecture. The focus is on recent extensions of ACT-R (Anderson, et al. 2004).

Invariant Object Recognition

Norbert Freier

Wed, 15. 12. 2010, Room 1/336

The human brain performs well on the task of object recognition. In the past, several models have tried to explain how the brain accomplishes this task. But up to now every system is outperformed by nature, and the operations of the visual stream are not fully understood. As examples, two models will be examined and compared to other approaches.

A detailed examination of a model with calcium-driven homeostatic dynamics and metaplasticity for visual learning

Jan Wiltschut

Wed, 1. 12. 2010, Room 1/068 (not 1/336!)

The Calcium model is based on the following electrophysiological findings: 1) Learning plasticity, that means the Long-Term-Potentiation (LTP) and Long-Term-Depression (LTD) characteristics of Calcium based synaptic learning; 2) Self-regulating mechanisms (scaling, constraint and redistribution); 3) Maintenance and consolidation of learned connections dependent on the strength of synaptic change.
The model is introduced and network characteristics are compared to electrophysiological data. Learning results are shown and will be discussed as well.

Fast object recognition using CUDA and OpenCL

Helge Dinkelbach and Tom Uhlmann

Wed, 24. 11. 2010, Room 1/336

Many algorithms in the field of artificial intelligence have long runtimes but are at the same time well suited to parallelization. Modern graphics cards have a very large number of compute cores available for executing parallel programs. As part of a research seminar, the two currently available frameworks (OpenCL and CUDA) were used for an example algorithm (from the field of object recognition/computational neuroscience) and the results were compared.

A computational model of predictive remapping and visual stability in area LIP

Arnold Ziesche

Wed, 3. 11. 2010, Room 1/336

Cells in many visual areas are retinotopically organized, i.e. their receptive fields (RFs) are fixed on the retina and thus shift when the eye moves. Hence, their input changes with each eye movement, posing the question of how we construct our subjective experience of a stable world. It has been proposed that predictive remapping could provide a potential solution. Predictive remapping refers to the observation that, for some neurons, retinotopically organized RFs anticipate the eye movement and become responsive to stimuli presented in their future receptive field (FRF) already prior to the saccadic eye movement. Here I show that predictive remapping emerges within a computational model of coordinate transformation in LIP. The model suggests that predictive remapping originates in the basis function layer from the combined feedback from higher, head-centered layers interacting with the corollary discharge signal. Furthermore, it predicts a new experimental paradigm to find predictive remapping cells.

Unifying working memory and motor control.

Henning Schroll

Wed, 27. 10. 2010, Room 1/336

Basal ganglia have been shown to substantially contribute to both working memory and motor control. In my talk I will present a computational model that unifies both of these functions: Within an anatomically inspired architecture of parallel and hierarchically interconnected cortico-basal ganglia-thalamic loops, the model learns both to flexibly control working memory and to decide for appropriate responses based on working memory content and visual stimulation. The model's success in learning complex working memory tasks underlines the power and flexibility of the basic Hebbian and three-factor learning rules used.

A model for learning color selective receptive cells from natural scenes

Martina Truschzinski

Wed, 20. 10. 2010, Room 1/336

Different from the standard view of color perception that proposes largely different pathways for color and shape perception, recently it has been discovered that in the primary visual cortex, color and shape are not processed apart from one another. Electrophysiological studies suggest that cells do not only respond to stimuli of a certain orientation or shape but at the same time they can be color selective. Different receptive field types have been reported: color-responsive single-opponent cells, color-responsive double-opponent cells (circular and orientated) and non-color-responsive cells. We here show that such receptive fields can emerge from Hebbian learning when presenting colored natural scenes to a model of V1 that has previously been proven to learn "edge-detecting" receptive fields (RFs) out of gray-scale images similar to those of primary visual cortex of macaque monkey.

Learning invariance in object recognition inspired by observations in the primary visual cortex of primates (diploma thesis defense)

Michael Teichmann

Wed, 29. 9. 2010, Room 1/336

The human visual system has the remarkable ability to recognize objects invariant of their position, rotation and scale. A better interpretation of neurobiological findings requires a computational model that is capable of simulating signal processing in the visual cortex. To create a computational model for invariant object recognition, an algorithm for learning invariance is required. Only a few studies so far cover the issue of learning such invariance.
In this thesis, a set of Hebbian learning rules based on the calcium dynamics and homeostatic regulation of single neurons is proposed. These rules implement dynamic traces of activity that make it possible to learn spatially invariant representations from temporal correlations. They are applied within a simple model of the primary visual cortex of primates and simulate the unsupervised learning of receptive fields from natural scenes.
Furthermore, the discrimination capability of the model is demonstrated on simple artificial input and on more difficult natural scenes. The properties of the network neurons are also compared to properties of V1 complex cells. The thesis shows that it is possible to create a limited network that can learn an invariant representation of natural objects in a biologically comparable way.

Performance Gain for Clustering with Growing Neural Gas Using Parallelization Methods

Alexander Adam

Tue, 13. 7. 2010, Room 1/336

The amount of data in databases is increasing steadily. Clustering this data is one of the common tasks in Knowledge Discovery in Databases (KDD). With growing data volumes, many clustering algorithms need so much time that they become practically unusable. To counteract this development, we apply parallelization techniques to clustering.
Recently, new parallel architectures have become affordable to the common user. We investigated in particular GPU (Graphics Processing Unit) and multi-core CPU architectures. These incorporate a large number of computing units paired with low latencies and high bandwidths between them.
In this paper we present the results of different parallelization approaches to the Growing Neural Gas (GNG) clustering algorithm. This algorithm is attractive because it is an unsupervised learning method and chooses the number of neurons needed to represent the clusters on its own.
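
In GNG, each input sample requires a search for the nearest and second-nearest unit, and for a fixed set of units this search is independent across samples. The following minimal sketch shows only that data-parallel step using Python's standard `concurrent.futures`. The abstract does not state which parts of the algorithm were parallelized, so this choice is an assumption, and the names `two_nearest` and `winners_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def two_nearest(sample, units):
    """Return indices of the nearest and second-nearest unit to `sample`."""
    # Squared Euclidean distance is sufficient for ranking the units.
    dists = [sum((s - u) ** 2 for s, u in zip(sample, unit)) for unit in units]
    order = sorted(range(len(units)), key=dists.__getitem__)
    return order[0], order[1]


def winners_parallel(samples, units, workers=4):
    """Run the winner search for all samples concurrently (illustrative)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: two_nearest(s, units), samples))


# Toy data (assumed): three units and three samples in 2D.
units = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
samples = [[0.1, 0.0], [0.9, 0.2], [0.2, 0.8]]
result = winners_parallel(samples, units)
```

In standard CPython, threads do not speed up this CPU-bound loop; the pool only illustrates the structure that a GPU kernel or a multi-core implementation would exploit.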

Methods of face detection and recognition

Arash Kermani

Tue, 6. 7. 2010, Room 1/336

First, we will present an overview of existing methods of face detection and recognition. Second, we will discuss the possibility of using biologically plausible vision models for face recognition.

Feature generation for defect-classifying process control of laser welds using sparse coding and ICA approaches (Diploma thesis)

Thomas Wiener

Tue, 29. 6. 2010, Room 1/336

The automated quality control of laser welds is a current research topic. Owing to the high complexity of laser welding processes, no complete quantitative model yet exists that describes the relationships between process parameters, sensor outputs, and welding results. A promising method for assessing the quality of laser welds is the use of supervised learning methods trained on features selected manually from the sensor data. With regard to automating quality control, however, this approach can be a disadvantage, because manual feature selection requires expert knowledge. In this thesis, ICA and sparse coding approaches as well as the classical method of principal component analysis are applied to generate features automatically. The result is a system that replaces the step of manual feature selection. Using the sensor data available for this thesis, it was shown that replacing manual feature selection with automatically generated features contributes to improving the reliability of the assessment.

Retino-centric vs. ego-centric reference frame - the double-flash experiment

Arnold Ziesche

Tue, 8. 6. 2010, Room 1/336

The spatial localization of visual stimuli by the brain takes place in different reference frames, such as retinal or head-centered coordinate systems. This talk discusses how theoretical models which try to understand how these reference frames interact can benefit from the so-called double-flash experiment where two successive stimuli are shown around the time of eye movements.

The guidance of vision while learning categories - A computational model of the Basal Ganglia and Reinforcement Learning

Robert Blank

Tue, 25. 5. 2010, Room 1/336

Human beings are able to learn categories quite fast. But how does this happen, and what role does the guidance of visual perception play? We will present a computational model of the Basal Ganglia that is based on a Reinforcement Learning algorithm. Future work will adapt the proposed model to experimental data from humans in order to validate it.

Synaptic Learning: Induction, maintenance and consolidation of synaptic connections.

Jan Wiltschut

Tue, 4. 5. 2010, Room 1/336

Changes in the connection strength between neurons in response to appropriate stimulation are thought to be the physiological basis for learning and memory formation. Long-Term-Potentiation (LTP) and Long-Term-Depression (LTD) of synapses in cortical areas have been the focus in the research of acquisition and storing new information.

Perisaccadic shift in complete darkness. A computational model.

Arnold Ziesche

Tue, 27. 4. 2010, Room 1/336

In order to localize visually perceived objects in space, the visual system has to take the gaze direction into account. At the beginning of the visual pathway, stimulus position is represented in a coordinate system that is centered on the retina and thus moves with the eyes. Under normal circumstances, this information is successfully transformed into a spatial representation that is independent of eye position. However, in complete darkness, when no reference stimuli are available, stimuli flashed briefly around the time of a saccade are systematically misperceived in space. Here I present the background of this misperception and the approaches to explain it. The focus will be on our own computational model.

Learning disparity and feature selective cells in primary vision

Mark-Andre Voss

Tue, 20. 4. 2010, Room 1/336

Depth perception is an important cue used by human vision for many applications. Many details of how the brain extracts depth information through binocular vision are still unknown. Most of the existing models use generically constructed simple V1 cells to model disparity sensitivity. We have developed an unsupervised learning approach that uses Hebbian and anti-Hebbian learning principles and nonlinear dynamics to learn disparity- and feature-selective cells from a set of stereo images, resulting in cells similar to those found in V1. An introduction to binocular vision and an overview of our ongoing research will be presented.

Object recognition - VVCA (Vergence-Version Control with Attention Effects)

Frederik Beuth

Tue, 13. 4. 2010, Room 1/336

We will present a minimalistic but complete robotic approach to detecting objects in a scene. It uses stereoscopic input images from two cameras to detect stereoscopic edges. The system contains two control loops to achieve vergence and version. The object recognition takes the edge detection information as input and is able to recognize and locate simple objects in the scene. The system is developed as part of the European project Eyeshots.

Design and implementation of an interface for a distributed simulation environment for mobile autonomous systems

Sebastian Drews

Thu, 28. 1. 2010, Room 2/209

Simulation environments play an increasingly important role in the design and verification of algorithms for single or multiple cooperating autonomous systems. The robot simulation USARSim is a promising approach to such a simulation environment. This thesis presents and evaluates the possibilities of using USARSim to simulate quadrocopters. It presents a parallelized, distributed control of the robot simulation and an interface built on top of it. This interface between the simulation and the existing software modules makes it possible to run applications both in the simulation and on the real hardware without any changes to the program code. In addition, the implementation of a server application for reading out the images of the simulated cameras is presented.

Eye Vergence Achieved without Explicitly Computed Disparity

Nikolay Chumerin (University K.U.Leuven)

Wed, 6. 1. 2010, Room 1/336

Vergence control is still a very important topic in robotics, and it draws intensively on findings from neuroscience, psychophysics, control theory and computer vision. Most of the existing models try to minimize horizontal disparity in order to achieve proper vergence. We propose a biologically inspired approach that does not rely on explicitly computed disparity but extracts the desired vergence angle from the postprocessed response of a population of disparity-tuned complex cells, the actual gaze direction and the actual vergence angle. The evaluation of two simple neural vergence angle control models is also discussed.

Combining biological models of attention and navigation with image processing tools for simultaneous localization and mapping (Diploma thesis defense)

Peer Neubert

Tue, 15. 12. 2009, Room 2/209 (Robotik-Lab)

Simultaneous localization and mapping (SLAM) is an essential capability for any autonomous artificial or biological system that moves in an unknown environment. This work uses image regions that are likely to attract the attentional focus of a human observer as landmarks for SLAM. Input-driven bottom-up processes play a considerable role in visual attention, and well-understood models of these processes exist. This work covers the analysis of these models, their implementation, and their adaptation for use in SLAM. To this end, interesting image regions have to be extracted, tracked over an image sequence, and matched when a previously observed scene becomes visible again. To keep computational demands low, the psychophysical models have to be implemented with efficient image processing tools.
Very recently, M. Milford and G. Wyeth presented a biologically inspired approach to SLAM called RatSLAM. They show promising results on mapping a complete suburb with a camera mounted on the roof of a car. While their navigation system is biologically plausible, the vision system they use lacks this character. In this work, the original visual system is replaced by the attention-based visual features. Promising results of the combined system on real-world outdoor data are presented.

Feature generation for the classification of laser welding data applying sparse coding and ICA approaches

Thomas Wiener

Tue, 8. 12. 2009, Room 1/336

Quality control of laser welds is an important field of study, since the requirements on product quality increase continuously. Due to the complexity of laser welding processes, no complete quantitative models exist that describe the causal relationships between process parameters and sensor outputs. A promising method for evaluating the quality of weld seams is the use of supervised learning algorithms trained with manually selected features from pre-, in-, and post-process sensor data.

The subject of this presentation is a concept for replacing manual feature selection with automated feature generation using independent component analysis (ICA) and sparse coding approaches. Also a short introduction to the principles of laser welding, suitable sensors and classification through multiple linear

Applying a Calcium-Dependent Learning Rule for the Unsupervised Adaptation of Neuronal-Weights in the Primary Visual Cortex. (Diploma thesis concept talk)

Michael Teichmann

Tue, 17. 11. 2009, Room 1/336

We will give a short introduction to the biological background of the visual stream, especially V1, followed by foundations of the learning methods and previous models. Finally, we will focus on the concept for the new model.