A Comprehensive Study of Dynamic SLAM

From Realistic Dynamic Environment Simulation Towards Robust Visual Localization


Abstract

In recent years, visual Simultaneous Localization and Mapping (SLAM) has gained significant attention and found wide-ranging applications in diverse scenarios. Recent advances in computer vision and deep learning have further enriched visual SLAM capabilities in scene understanding and large-scale operation. However, despite remarkable performance in these areas, most visual SLAM frameworks are designed under the static-world assumption. They therefore struggle in dynamic environments, exhibiting reduced localization accuracy, tracking failures, and limited generalization.

To investigate the impact of moving objects in dynamic indoor environments, we first benchmark representative visual (dynamic) SLAM approaches, complemented by robustness assessments for preliminary insights. For this process, we adopt challenging sequences from GRADE, a simulation platform for generating realistic dynamic indoor scenes. Notably, mainstream dynamic SLAM methods rely on object detection or segmentation to handle moving objects. To explore the correlation between detector accuracy and overall SLAM performance, we integrate a series of trained YOLOv5 and Mask R-CNN models, each with a different accuracy level, into dynamic SLAM systems, and evaluate these configurations on the TUM RGB-D sequences. Contrary to common intuition, the experiments indicate that more accurate object detectors do not necessarily lead to improved visual SLAM performance. This benchmarking process also reveals several inherent limitations of current dynamic SLAM techniques, underscoring the need for further advances.
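For context, the detection-based recipe the benchmark probes can be summarized as gating out features that land on detected objects before tracking. Below is a minimal sketch of that step in Python; dynamic_mask stands in for the output of a detector such as Mask R-CNN or a YOLOv5-derived mask, and the function name is illustrative rather than taken from any benchmarked system.

```python
# Minimal sketch: discard front-end features that fall on detected dynamic
# objects, the common strategy of detection-based dynamic SLAM systems.
import cv2
import numpy as np

def filter_static_features(gray, dynamic_mask, n_features=1000):
    """Keep ORB keypoints that do not fall on a detected dynamic object.

    gray:         H x W uint8 grayscale frame
    dynamic_mask: H x W bool array, True where the detector flags a moving object
    """
    orb = cv2.ORB_create(nfeatures=n_features)
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    if descriptors is None:
        return [], None
    h, w = dynamic_mask.shape
    kept_kps, kept_desc = [], []
    for kp, desc in zip(keypoints, descriptors):
        # Round the subpixel keypoint location to a mask index (clamped).
        u = min(int(round(kp.pt[0])), w - 1)
        v = min(int(round(kp.pt[1])), h - 1)
        if not dynamic_mask[v, u]:  # drop features inside dynamic regions
            kept_kps.append(kp)
            kept_desc.append(desc)
    return kept_kps, (np.array(kept_desc) if kept_desc else None)
```

Under this recipe, a more accurate detector only changes which pixels are gated out; an aggressive mask can also starve the tracker of usable static features near object boundaries, which is one plausible reason detector accuracy and SLAM accuracy do not move in lockstep.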

Building upon these insights, we introduce DynaPix SLAM, a novel visual SLAM system for dynamic indoor environments, in which the participation of visual cues (e.g., features) is weighted by per-pixel motion probabilities. Our approach consists of a semantic-free, pixel-wise motion estimation module and an improved pose optimization process. In the first stage, our motion probability estimator employs a novel static background differencing method on both images and optical flows to identify moving regions. These probabilities are then incorporated into map point selection and a weighted bundle adjustment for back-end optimization. We evaluate DynaPix SLAM and its variant, DynaPix-D, against ORB-SLAM2 and DynaSLAM on both TUM RGB-D and GRADE sequences, with additional tests on the static counterparts of the GRADE sequences. The results demonstrate that DynaPix SLAM consistently outperforms the other methods, showing reduced localization errors and longer tracking durations across various scenarios.
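The two ingredients named above can be illustrated as follows: a motion-probability map obtained by differencing the observed frame and flow against their static-background counterparts, followed by residual weighting in bundle adjustment. This is a minimal sketch under assumed inputs (a rendered static background and precomputed flow fields, as GRADE-style simulation can provide); the normalization and blending below are assumptions for illustration, not the exact DynaPix formulation.

```python
# Minimal sketch of (i) per-pixel motion probabilities from static-background
# differencing on images and optical flows, and (ii) down-weighting
# reprojection residuals by those probabilities during optimization.
import numpy as np

def motion_probability(frame, bg_frame, flow, bg_flow, alpha=0.5):
    """Combine photometric and flow differences against a static background.

    frame, bg_frame: H x W x 3 float images (observed vs. rendered static)
    flow, bg_flow:   H x W x 2 optical flow fields (observed vs. static)
    Returns an H x W map in [0, 1]; higher values mark likely-moving pixels.
    """
    photo_diff = np.linalg.norm(frame - bg_frame, axis=-1)
    flow_diff = np.linalg.norm(flow - bg_flow, axis=-1)
    photo_diff /= photo_diff.max() + 1e-8  # crude max-normalization (assumption)
    flow_diff /= flow_diff.max() + 1e-8
    return alpha * photo_diff + (1.0 - alpha) * flow_diff

def weighted_residual(residual, p_motion):
    """Scale a feature's reprojection residual by its static probability,
    so points with high motion probability contribute less to bundle
    adjustment instead of being discarded outright."""
    return (1.0 - p_motion) * residual
```

Compared with hard detection masks, this soft weighting keeps uncertain regions in the optimization at reduced influence rather than removing them entirely.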

Files

Thesis_Chenghao_Xu.pdf

File under embargo until 15-11-2024