Explainable Reinforcement Learning in Flight Control through Reward Decomposition

Master thesis (2022)

Authors

André Lemos (Aerospace Engineering)

Contributors

E. van Kampen (mentor)

C.C. de Visser (graduation committee member)

M.C. Naeije, Astrodynamics & Space Missions, Aerospace Engineering (graduation committee member)

Faculty

Aerospace Engineering

To reference this document use:

http://resolver.tudelft.nl/uuid:dd7325fa-c615-4df5-8ce3-e3d6ef9cb33c

Published Date

23-09-2022

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Even though Deep Reinforcement Learning (DRL) techniques have proven their ability to solve highly complex control tasks, the opaqueness and inexplicability of these solutions often prevent them from being applied to real flight control applications. In this research, reward decomposition explanations are used to tackle this issue and improve the explainability of DRL for end users. A reward decomposition-based DRL controller is deployed on a longitudinal state-space model of the Cessna Citation 500 aircraft and assessed on two attitude flight control tasks. Furthermore, a new explanation type called Dominant Reward eXplanations (DRX) is presented, which allows users to obtain more global insights than those generated by Reward Difference eXplanations (RDX). Results show that the explanations produced lead to straightforward and intuitive insights about the controller’s behaviour and are capable of improving end-user explainability. Moreover, a small analysis suggests that the decomposed method performs similarly to its counterpart without reward decomposition, although training time increases considerably. To the best of the author’s knowledge, this is the first application of reward decomposition explanations to the flight control domain.
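As a rough illustration of the idea behind reward decomposition and RDX, the sketch below uses a toy, discrete-action setting. It is not the thesis's controller: the component names, the Q-value table, and the helper functions `greedy_action` and `rdx` are hypothetical. The RDX here follows the general definition from the reward decomposition literature, namely the per-component difference in Q-values between two actions. DRX is not sketched because the abstract does not define it.

```python
import numpy as np

# Hypothetical reward components (assumed names, not taken from the thesis).
COMPONENTS = ["tracking_error", "control_effort"]

# Hypothetical decomposed Q-values for a single state:
# one row per action, one column per reward component.
q_decomposed = np.array([
    [-1.0, -0.5],  # action 0
    [-0.4, -0.9],  # action 1
    [-2.0, -0.1],  # action 2
])


def greedy_action(q):
    """Pick the action whose summed component Q-values are highest."""
    return int(np.argmax(q.sum(axis=1)))


def rdx(q, a1, a2):
    """Reward difference explanation: per-component Q(s, a1) - Q(s, a2)."""
    return q[a1] - q[a2]


best = greedy_action(q_decomposed)
delta = rdx(q_decomposed, best, 2)

# Positive entries favour the chosen action, negative entries favour the alternative.
for name, value in zip(COMPONENTS, delta):
    print(f"{name}: {value:+.2f}")
```

In this toy case the per-component differences sum to the total Q-value advantage of the chosen action over the alternative, which is what lets a user read off which reward components drive a preference.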

Files

MSc_Thesis_Andre_Lemos.pdf

(pdf | 7.89 MB)

Unknown license