Attention-Aware Age-Agnostic Visual Place Recognition

Master thesis (2019)

Authors

J. Li Electrical Engineering, Mathematics and Computer Science

Contributors

J.C. van Gemert Pattern Recognition and Bioinformatics - (mentor)

S. Khademi Pattern Recognition and Bioinformatics - (mentor)

Z. Wang Pattern Recognition and Bioinformatics - (mentor)

M.J.T. Reinders Pattern Recognition and Bioinformatics - (graduation committee member)

L. Nan Urban Data Science - Architecture and the Built Environment (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

Attention Mechanism, Computer Vision, Domain Adaptation, Image Matching

To reference this document use:

http://resolver.tudelft.nl/uuid:250d37a9-bc0d-4f8f-8d1a-d31a98dc22d7

Published Date

28-08-2019

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This work proposes a cross-domain visual place recognition (VPR) task, i.e., matching images of the same buildings depicted in different domains. VPR is commonly treated as an image retrieval task, where a query image from an unknown location is matched with relevant instances from a geo-tagged gallery database. Unlike conventional VPR settings, where the query and gallery images come from the same domain, we propose a more common but more challenging setup in which the query images are collected under a new, unseen condition. The two domains involved in this work are contemporary street view images of Amsterdam from the Mapillary dataset (source domain) and historical images of the same city from the Beeldbank dataset (target domain). We tailor an age-invariant feature learning CNN that focuses on domain-invariant objects and learns to match images using a weakly supervised ranking loss. We propose an attention aggregation module that is robust to the domain discrepancy between the training and test data. Further, a multi-kernel maximum mean discrepancy (MK-MMD) domain adaptation loss is adopted to improve the cross-domain ranking performance. Both the attention and adaptation modules are unsupervised, while the ranking loss uses weak supervision. Visual inspection shows that the attention module focuses on built forms, while the dramatically changing environment is given less weight. Our proposed CNN achieves state-of-the-art results (99% accuracy) on the single-domain VPR task and, at its best, 20% accuracy on the cross-domain VPR task, revealing the difficulty of age-invariant VPR.
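
As an illustration of the adaptation term mentioned in the abstract, the following is a minimal NumPy sketch of a multi-kernel MMD estimate between two batches of feature vectors. It is not the thesis implementation: the function names, the Gaussian kernel family, the bandwidth values, the equal kernel weights, and the use of the biased estimator are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel_mean(a, b, bandwidths):
    """Average of Gaussian kernels with several bandwidths (assumed equal weights)."""
    # Pairwise squared Euclidean distances, shape (len(a), len(b)).
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative values
    kernels = [np.exp(-sq_dist / (2.0 * s ** 2)) for s in bandwidths]
    return sum(kernels) / len(bandwidths)


def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Biased estimate of the squared MMD under the averaged kernel.

    `source` and `target` are (n, d) and (m, d) arrays of features,
    e.g. descriptors of contemporary and historical images.
    The bandwidths are illustrative values, not taken from the thesis.
    """
    k_ss = gaussian_kernel_mean(source, source, bandwidths)
    k_tt = gaussian_kernel_mean(target, target, bandwidths)
    k_st = gaussian_kernel_mean(source, target, bandwidths)
    return float(k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean())
```

When such a quantity is used as a loss, minimizing it pulls the feature distributions of the two domains closer together; the value is zero when both batches are identical and grows as the batches drift apart.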

Files

Thesis.pdf

(pdf | 8.98 Mb)

Unknown license