Conventional reinforcement learning (RL) models of variable speed limit (VSL) control systems (and of traffic control systems in general) cannot be trained on the real traffic process, because new control actions are usually explored randomly, which may incur high costs (delays) during exploration and learning. For this reason, existing RL-based VSL control approaches require a traffic simulator for training; however, their performance depends on the accuracy of that simulator. This paper proposes a new RL-based VSL control approach that overcomes these problems. The proposed approach is designed to improve traffic efficiency by deploying VSLs against freeway jam waves. It applies an iterative training framework in which the optimal control policy is updated by exploring new control actions both online and offline in each iteration. Because the explored control actions are evaluated on the real traffic process, the RL model does not learn from a traffic simulator alone. The proposed VSL control approach is tested using a macroscopic traffic simulation model that represents real-world traffic flow dynamics. Compared with existing VSL control approaches, the proposed approach is shown to have two advantages: (i) it alleviates the impact of model mismatch, which affects both model-based and existing RL-based VSL control approaches, by replacing knowledge from models with knowledge from the real process, and (ii) it significantly reduces exploration and learning costs compared with existing RL-based VSL control approaches.
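The abstract does not specify the learning algorithm, so the following is only a generic sketch of the kind of iterative framework it describes: each iteration explores a control action on the (real) process, records the observed delay, and then performs offline policy updates from previously evaluated transitions. All names, the toy dynamics in real_process_step, and the tabular Q-learning update are illustrative assumptions, not the paper's method.

```python
import random

random.seed(0)

# Hypothetical stand-in for the real traffic process: state = congestion
# level 0..4, action = speed-limit index 0..2. Cost is the observed delay.
def real_process_step(state, action):
    # Assumed toy dynamics: the "right" VSL (action == state % 3) eases the jam.
    next_state = max(0, state - 1) if action == state % 3 else min(4, state + 1)
    delay = next_state  # delay observed on the real process, not a simulator
    return next_state, delay

Q = {(s, a): 0.0 for s in range(5) for a in range(3)}  # cost-to-go estimates
replay = []  # buffer of transitions already evaluated on the real process
alpha, gamma, eps = 0.3, 0.9, 0.2

for iteration in range(200):
    # Online phase: explore a new control action on the real process.
    s = random.randrange(5)
    a = (random.randrange(3) if random.random() < eps
         else min(range(3), key=lambda x: Q[(s, x)]))
    s2, cost = real_process_step(s, a)
    replay.append((s, a, cost, s2))

    # Offline phase: update the policy from previously evaluated actions.
    for (ss, aa, cc, ss2) in random.sample(replay, min(16, len(replay))):
        target = cc + gamma * min(Q[(ss2, x)] for x in range(3))
        Q[(ss, aa)] += alpha * (target - Q[(ss, aa)])

# Greedy (minimum expected delay) policy after training.
greedy = {s: min(range(3), key=lambda a: Q[(s, a)]) for s in range(5)}
print(greedy)
```

The key point the sketch illustrates is that every transition in the replay buffer was evaluated on the real process, so the offline updates never rely on simulated dynamics.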