Q-value reuse between state abstractions for traffic light control

Abstract

Previous research in reinforcement learning for traffic light control has used various state abstractions: some use feature vectors, while others use matrices of car positions. This paper first compares a simple feature vector, consisting only of queue sizes per incoming lane, to a matrix of car positions. It then investigates whether knowledge can be transferred from a simple agent using the feature vector abstraction to a more complex agent that uses the position matrix abstraction. We find that training cannot be sped up by first training an agent with the feature vector abstraction and then reusing its Q-function to train an agent with the position matrix abstraction: the simple agent does not take considerably fewer samples to converge, and the total time needed to first train the simple agent and then transfer exceeds the time needed to train the complex agent from scratch.
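The abstract does not spell out the transfer mechanism, but the sketch below shows one plausible form of Q-value reuse between the two abstractions: the complex agent's tabular Q-function is initialized from the simple agent's values via a mapping that collapses a position matrix into per-lane queue sizes. All names and constants (to_queue_vector, TabularQ, ReuseQ, the lane and cell counts) are hypothetical illustrations, not the paper's implementation.

```python
import numpy as np

# Illustrative constants; the paper's actual intersection layout may differ.
N_LANES = 4          # incoming lanes at the intersection
CELLS_PER_LANE = 8   # discretized road cells per lane (position matrix columns)
N_ACTIONS = 2        # e.g. two signal phases

def to_queue_vector(position_matrix):
    """Map the complex abstraction (binary car-position matrix) down to
    the simple abstraction (queue size per incoming lane)."""
    return tuple(int(q) for q in position_matrix.sum(axis=1))

class TabularQ:
    """Plain tabular Q-function with zero-initialized entries."""
    def __init__(self, n_actions):
        self.table = {}
        self.n_actions = n_actions

    def get(self, state):
        return self.table.setdefault(state, np.zeros(self.n_actions))

# Q-function assumed to be already trained on the simple abstraction.
simple_q = TabularQ(N_ACTIONS)

class ReuseQ(TabularQ):
    """Complex-abstraction Q-function whose entries are initialized from
    the simple agent's Q-values instead of zeros (one form of Q-value reuse)."""
    def get(self, state_matrix):
        key = state_matrix.tobytes()
        if key not in self.table:
            # Transfer: look up the simple agent's values for the mapped state.
            self.table[key] = simple_q.get(to_queue_vector(state_matrix)).copy()
        return self.table[key]

complex_q = ReuseQ(N_ACTIONS)
state = (np.random.rand(N_LANES, CELLS_PER_LANE) < 0.3).astype(int)
print(complex_q.get(state))  # starts from transferred values, then refined by TD updates
```

Under this scheme, transfer only pays off if the simple agent converges much faster and its values give the complex agent a useful head start; the paper's finding is that neither condition held.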
