Automatic learning of cyclist's compliance for speed advice at intersections - a reinforcement learning-based approach

Conference paper (2019)

Authors

Azita Dabiri

A. Hegyi

S.P. Hoogendoorn Transport and Planning

To reference this document use:

http://resolver.tudelft.nl/uuid:d7739dcf-4ea2-489b-b386-2e064433b186

Published Date

2019

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Although algorithms exist that give speed advice to cyclists approaching traffic lights with uncertain timing, they all need to know, and thus assume, the cyclist's response to the advice in order to optimize it. To relax this assumption, this paper proposes an algorithm that combines reinforcement learning and planning to learn the cyclist's reaction to the advice and uses this information to plan the best next advice on-the-fly. Rather than the single search procedure that is conventional in existing architectures, the algorithm uses two sample-based search procedures. This makes it possible to obtain an accurate local approximation of the action-value function despite the short computation time available in each decision epoch. The algorithm is tested in a simulation case study, which confirms the impact of a proper initialisation of the action-value function as well as the importance of using two search procedures.
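
The general idea of learning a cyclist's compliance and then planning with sample-based estimates of action values can be illustrated with a minimal sketch. This is not the paper's algorithm: it uses a single sampling routine rather than the two search procedures described above, and the advice set, the count-based compliance model, the reward values, and all function names are hypothetical assumptions chosen only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical advice set; the paper's actual action space is not given here.
ADVICES = ("slow_down", "keep_speed", "speed_up")


class ComplianceModel:
    """Count-based estimate of how often the cyclist follows each advice."""

    def __init__(self):
        # Assumption: one pseudo-observation each way, i.e. a uniform prior.
        self.followed = defaultdict(lambda: 1)
        self.ignored = defaultdict(lambda: 1)

    def update(self, advice, complied):
        """Record one observed reaction to a given advice."""
        if complied:
            self.followed[advice] += 1
        else:
            self.ignored[advice] += 1

    def p_comply(self, advice):
        """Estimated probability that the cyclist follows this advice."""
        f, i = self.followed[advice], self.ignored[advice]
        return f / (f + i)


def estimate_action_values(model, reward_if_followed, reward_if_ignored,
                           n_samples=2000, rng=None):
    """Monte Carlo estimate of each advice's value under the learned model."""
    rng = rng or random.Random(0)
    q = {}
    for advice in ADVICES:
        total = 0.0
        for _ in range(n_samples):
            # Sample whether the cyclist would comply, then score the outcome.
            complied = rng.random() < model.p_comply(advice)
            total += reward_if_followed[advice] if complied else reward_if_ignored
        q[advice] = total / n_samples
    return q


def best_advice(q):
    """Greedy choice: the advice with the highest estimated value."""
    return max(q, key=q.get)
```

In this sketch, an advice with a high nominal reward can still be ranked low if the learned model says the cyclist rarely follows it, which is the kind of trade-off that motivates learning the response rather than assuming it.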

Files

08916847.pdf

(pdf | 0.24 Mb)

Unknown license

Download not available