Q. Yang | TU Delft Repository

General Optimal Trajectory Planning

Enabling Autonomous Vehicles with the Principle of Least Action

Journal article (2024) - H. Huang (author) , Yicong Liu (author) , Jinxin Liu (author) , Q. Yang (author) , Jianqiang Wang (author) , David Abbink (author) , A. Zgonnikov (author)

This study presents a general optimal trajectory planning (GOTP) framework for autonomous vehicles (AVs) that can effectively avoid obstacles and guide AVs to complete driving tasks safely and efficiently. Firstly, we employ the fifth-order Bezier curve to generate and smooth the ...

Risk Aversion and Guided Exploration in Safety-Constrained Reinforcement Learning

Doctoral thesis (2023) - Q. Yang (author)

In traditional reinforcement learning (RL) problems, agents can explore environments to learn optimal policies through trial and error that is sometimes unsafe. However, unsafe interactions with environments are unacceptable in many safety-critical problems, for instance in ro ...

Subtask-masked curriculum learning for reinforcement learning with application to UAV maneuver decision-making

Journal article (2023) - Yueqi Hou (author) , Xiaolong Liang (author) , Maolong Lv (author) , Q. Yang (author) , Y. Li (author)

Unmanned Aerial Vehicle (UAV) maneuver strategy learning remains a challenge when using Reinforcement Learning (RL) in this sparse reward task. In this paper, we propose Subtask-Masked curriculum learning for RL (SUBMAS-RL), an efficient RL paradigm that implements curriculum lea ...

Reinforcement Learning by Guided Safe Exploration

Conference paper (2023) - Q. Yang (author) , T. D. Simão (author) , Nils Jansen (author) , Simon H. Tindemans (author) , M.T.J. Spaan (author)

Safety is critical to broadening the application of reinforcement learning (RL). Often, we train RL agents in a controlled environment, such as a laboratory, before deploying them in the real world. However, the real-world target task might be unknown prior to deployment. Reward- ...

CEM: Constrained Entropy Maximization for Task-Agnostic Safe Exploration

Conference paper (2023) - Q. Yang (author) , M.T.J. Spaan (author)

Without an assigned task, a suitable intrinsic objective for an agent is to explore the environment efficiently. However, the pursuit of exploration will inevitably bring more safety risks. An under-explored aspect of reinforcement learning is how to achieve safe efficient explor ...

Training and Transferring Safe Policies in Reinforcement Learning

Conference paper (2022) - Q. Yang (author) , T. D. Simão (author) , Nils Jansen (author) , Simon H. Tindemans (author) , M.T.J. Spaan (author)

Safety is critical to broadening the application of reinforcement learning (RL). Often, RL agents are trained in a controlled environment, such as a laboratory, before being deployed in the real world. However, the target reward might be unknown prior to deployment. Reward-free R ...

A Modern Perspective on Safe Automated Driving for Different Traffic Dynamics using Constrained Reinforcement Learning

Conference paper (2022) - Danial Kamran (author) , T. D. Simão (author) , Q. Yang (author) , C.T. Ponnambalam (author) , Johannes Fischer (author) , M.T.J. Spaan (author) , Martin Lauer (author)

The use of reinforcement learning (RL) in real-world domains often requires extensive effort to ensure safe behavior. While this compromises the autonomy of the system, it might still be too risky to allow a learning agent to freely explore its environment. These strict impositio ...

Refined Risk Management in Safe Reinforcement Learning with a Distributional Safety Critic

Conference paper (2022) - Q. Yang (author) , T. D. Simão (author) , Simon H. Tindemans (author) , M.T.J. Spaan (author)

Safety is critical to broadening the real-world use of reinforcement learning (RL). Modeling the safety aspects using a safety-cost signal separate from the reward is becoming standard practice, since it avoids the problem of finding a good balance between safety and performance. ...

Safety-constrained reinforcement learning with a distributional safety critic

Journal article (2022) - Q. Yang (author) , T. D. Simão (author) , Simon H. Tindemans (author) , M.T.J. Spaan (author)

Safety is critical to broadening the real-world use of reinforcement learning. Modeling the safety aspects using a safety-cost signal separate from the reward and bounding the expected safety-cost is becoming standard practice, since it avoids the problem of finding a good balanc ...

WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning

Conference paper (2021) - Q. Yang (author) , T. D. Simão (author) , Simon H. Tindemans (author) , M.T.J. Spaan (author)

Safe exploration is regarded as a key priority area for reinforcement learning research. With separate reward and safety signals, it is natural to cast it as constrained reinforcement learning, where expected long-term costs of policies are constrained. However, it can be hazardo ...