
1.1. Time and Place

The defense session will take place on Friday, December 20, 2024, at 10:00 AM. You can join the Zoom meeting room using the information below.


Join Zoom Meeting

https://tum-conf.zoom-x.de/j/61786165009?pwd=Zkg0R2VvbWJHd28vbmJCa2RHdkZEQT09

Passcode: i6defense

Hosted by Kejia Chen (kejia.chen@tum.de)

1.2. Schedule

10:00 - 10:20 Manuel Tamayo Moreno (BA Thesis)

Title: Using Environmental Context and Successor Representation to Enhance Biologically Inspired Navigation

Advisor: Kejia Chen

Keywords:

Abstract: Recent neurological findings have helped us better understand how mammals, especially rats, locate themselves and navigate their environment. The discovery of place cells, which encode discrete locations in the environment, grid cells, which encode distances and give the animal an internal coordinate system, and boundary vector cells, which fire at the limits of an environment, is essential for building cognitive maps. This representation of the environment allows animals to perform high-level topological navigation, including path planning, as well as lower-level vector navigation. We propose a new component that establishes the reachability between points in the environment, a process crucial to forming cognitive maps, by leveraging the information available from the environment, emulating the visual cues that enable visual odometry in nature. Furthermore, we implement an obstacle avoidance mechanism that uses the firing activity of boundary cells to navigate paths in cluttered environments while maintaining the original goal direction. These new components are compared to the baseline methods to showcase their adaptability and robustness; further changes to other elements of the framework are also included to address the challenges arising from switching from navigation in very dense cognitive maps to sparser ones, aiming for flexible use of the environment's particularities.

10:20 - 10:40 Yiran Li (BA Thesis)

Title: Global motion planning based on neural cost map

Advisor: Alexander Lenz

Keywords:

Abstract: In this thesis, we investigate machine-learning-based approaches for smooth and collision-free global planning in autonomous driving with long planning horizons. Building upon a baseline using interpretable end-to-end neural networks, we propose several engineering enhancements and develop two pipelines for learning the topology of map patches implicitly and explicitly.

10:40 - 11:05 Mengyu Li (MA Thesis)

Title: Multi-Modal Fusion of Image Sequences for Dense Prediction with RGB and Event Cameras in Autonomous Driving

Advisor: Hu Cao

Keywords:

Abstract: Integrating RGB and event camera data through multi-modal fusion in autonomous driving significantly enhances dense prediction tasks such as depth estimation and object detection. RGB cameras provide high-resolution color imagery crucial for visual perception. In contrast, event cameras offer high temporal resolution and dynamic range, capturing pixel-level changes caused by motion even in challenging lighting conditions. By fusing the continuous visual stream of RGB cameras with the asynchronous intensity changes captured by event cameras, the system can construct a more comprehensive and robust representation of the environment. This thesis focuses on combining multi-modal features for semantic segmentation and makes the following contributions: 1) a literature review of multi-modal fusion, focusing on RGB-event fusion as well as depth and thermal data; 2) an innovative pipeline that adapts a pre-trained segmentation model to achieve higher semantic segmentation performance; 3) an evaluation of several functional blocks to reveal key principles and directions for further development. Our research finds that: 1) interactive-level fusion shows outstanding performance compared to input-level and feature-level fusion; 2) introducing a mutual information loss may yield a higher score, but fine-tuning this loss function remains a major challenge.

11:05 - 11:30 Siming Sun (MA Thesis)

Title: Rapid Surface Reconstruction Based on Bayesian Meta-Learning

Advisor: Yu Zhang

Keywords:

Abstract: With the widespread adoption of laser scanning technology, surface reconstruction from point cloud data has emerged as a critical research area in computer vision and computational geometry. Among the various approaches, the Gaussian Process Implicit Surface (GPIS) is one of the most classical algorithms, generating a global implicit surface model by constructing mathematical functions. Extended models based on GPIS, such as enhanced GPIS (eGPIS) [Murvanidze et al., 2022], further enable feature extraction and more precise surface reconstruction, meeting the perceptual demands of advanced tasks like robotic grasping. However, GPIS has two clear limitations. First, the entire process is conducted online, resulting in high computational cost and power consumption, which makes it unsuitable for real-time applications such as obstacle avoidance in autonomous driving. Second, the kernel functions and hyperparameters of the Gaussian Process (GP) are selected manually, and these choices are often complex and unintuitive.

To address these challenges, we propose a new model, MetaSDF, which leverages Bayesian linear regression and neural networks to achieve surface reconstruction precision comparable to eGPIS. By incorporating meta-learning, the model extracts and learns all prior information during an offline phase, enabling it to predict the posterior density quickly and accurately during the online phase. Compared to traditional methods, MetaSDF not only improves reconstruction efficiency but also exhibits greater robustness in complex scenarios. The meta-learning approach allows the model to identify shared features across diverse datasets, significantly improving its generalization capability under limited data. Additionally, by adjusting the trade-off between accuracy and computation time, MetaSDF demonstrates potential for real-time applications such as obstacle avoidance in autonomous driving.

11:30 - 11:50 Jiong Liu (FP Presentations)

Title: General Visual Model for Arbitrary-modal Semantic Segmentation

Advisor: Hu Cao

Keywords:

Abstract: Semantic segmentation across arbitrary sensor modalities faces significant challenges due to diverse sensor characteristics, and the traditional configurations for this task result in redundant development effort. We address these challenges by introducing a universal arbitrary-modal semantic segmentation framework that unifies segmentation across multiple modalities. Our approach features three key innovations: (1) Modality-aware CLIP (MACLIP), which provides modality-specific scene understanding guidance through LoRA finetuning; (2) complementary learnable prompts for capturing fine-grained features; and (3) a Modality-aware Selective Adapter (MASA) for dynamic feature adjustment. Evaluated on five diverse datasets with different complementary modalities (event, thermal, depth, polarization, and light field), our framework achieves state-of-the-art performance with a mean IoU of 65.03, surpassing both specialized multi-modal methods and medical universal segmentation approaches.

The I6 defense day is held monthly, usually on the last Friday of each month. The standard formats of talks are:

Type | Presentation time | Q&A time
Initial topic presentation | 5 min | 5 min
BA thesis | 15 min | 5 min
Guided Research | 10 min | 5 min
Interdisciplinary Project | 15 min | 5 min
MA thesis | 20 min | 5 min

More information on preparing presentations can be found in the Thesis Submission Guidelines.
